keeganhk / PointMCD

PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal Distillation for 3D Shape Recognition

If I only use MVCNN, will the result be better? #2

bigKoki closed this issue 10 months ago

bigKoki commented 10 months ago

First of all, your work is very creative! But I have a question. In 3D tasks, multi-view methods perform better than other methods. In your work, the teacher network boosts the student network. If I use only the trained teacher network for testing, will the result be better than the student network's? Thanks!

keeganhk commented 10 months ago

Hi, thanks for your interest in this work. Actually, the strong performance of multi-view image-based learning models relies on the quality of the rendered images. If the images are rendered from mesh models, the teacher performs much better on tasks like classification and retrieval. If no mesh models are available, so that the images can only be rendered from raw points, then the teacher performs worse than point-based networks. However, as validated in our work, even in this setting, where the teacher is weak, the distillation process can still boost the student point-based networks.
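
To illustrate the teacher-student setup discussed above, here is a minimal sketch of a generic feature-level distillation objective. It is not the actual PointMCD loss. The function names, the mean-squared-error feature term, and the `weight` parameter are all assumptions made for illustration, and the teacher features are treated as fixed targets coming from a frozen multi-view network.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Standard classification loss on the student's predictions.
    return -math.log(softmax(logits)[label])

def feature_mse(student_feat, teacher_feat):
    # Hypothetical distillation term: mean squared error between the
    # student's feature vector and the (frozen) teacher's feature vector.
    diffs = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(diffs) / len(diffs)

def distillation_objective(student_logits, label, student_feat, teacher_feat, weight=1.0):
    # Total student loss = task loss + weight * feature alignment loss.
    # `weight` is an assumed hyperparameter, not a value from the paper.
    task = cross_entropy(student_logits, label)
    align = feature_mse(student_feat, teacher_feat)
    return task + weight * align

# Toy example with made-up numbers (3 classes, 2-dimensional features).
loss = distillation_objective([2.0, 0.5, -1.0], 0, [0.1, 0.4], [0.2, 0.3], weight=0.5)
print(loss)
```

In a sketch like this, the student is still trained on the ground-truth labels, so a weak teacher only contributes an auxiliary alignment signal instead of replacing the task loss.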

bigKoki commented 10 months ago

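Hello, thank you for your interest in this work. In fact, the excellent performance of multi-view image-based learning models depends on the quality of the rendered images. If the images are rendered from mesh models, the teacher performs better on tasks such as classification and retrieval. If no mesh models are available, so that the images can only be rendered from raw points, the teacher performs worse than point-based networks. However, as validated in our work, in this case, even if the teacher is weak, the distillation process can still boost the point-based student networks.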

Sorry for the late reply, and thanks for your explanation!