j96w / DenseFusion

"DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" code repository
https://sites.google.com/view/densefusion
MIT License

are the camera parameters that the model is trained with baked in? #150


huckl3b3rry87 commented 4 years ago

Hi, I know that the camera parameters can be adjusted to modify the point cloud, depending on the actual camera used.
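(For context, here is a minimal sketch of the standard pinhole back-projection where those parameters enter; the function name and the cam_scale depth-unit divisor are illustrative choices, not the repository's exact code.)

```python
import numpy as np

def depth_to_pointcloud(depth, cam_fx, cam_fy, cam_cx, cam_cy, cam_scale=1.0):
    """Back-project a depth map (H, W) into a camera-frame point cloud.

    fx, fy, cx, cy are the intrinsics in question: swapping cameras
    means swapping these values before building the cloud.
    """
    h, w = depth.shape
    xmap, ymap = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth / cam_scale                # depth units -> meters
    x = (xmap - cam_cx) * z / cam_fx     # column index -> camera X
    y = (ymap - cam_cy) * z / cam_fy     # row index    -> camera Y
    return np.stack((x, y, z), axis=-1)  # (H, W, 3) cloud
```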

But I am wondering: are the camera parameters that the model is trained with somehow baked in?

If not, how do you avoid this?

Thank you, Huck

j96w commented 4 years ago

Hi, thanks for raising this very good question. After thinking about it for a few days, I have changed my mind and think the answer is yes (the camera parameters are baked into the model). The main issue comes from the RGB channel, which is a CNN architecture. If the testing camera is very different from the one used to build the dataset, especially in focal length, the trained model will suffer a performance drop at test time because of the changes in the RGB images (wider angle or distortion). However, since the input to the RGB channel is an image crop of the target object, I still think the effect of these changes might be small as long as the object is not in the corner of the frame. But I have to admit that it will affect the performance to some degree.

I would say it might be difficult for the CNN channel to fix this issue. To avoid it, here are some thoughts, though I'm not sure whether they will help: (1) Build the dataset with different cameras. (Very time consuming, I know...) (2) Apply some distortion changes to the RGB image, or some spatial transformations (see the sketch below). Just regard it as a data-augmentation method to avoid over-fitting to one specific type of camera. (This might be worth a try.)
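For idea (2), a rough sketch of what such an augmentation could look like, assuming a torchvision-based pipeline; RandomPerspective and RandomAffine are only stand-ins for lens distortion and focal-length change here, not a calibrated camera model:

```python
import torchvision.transforms as T

# Perturb the RGB crop geometrically and photometrically so the CNN
# channel does not overfit to one camera's imaging characteristics.
augment = T.Compose([
    T.RandomPerspective(distortion_scale=0.3, p=0.5),  # crude distortion proxy
    T.RandomAffine(degrees=0, scale=(0.9, 1.1)),       # mild apparent-scale change
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    T.ToTensor(),
])

# cropped_tensor = augment(rgb_crop)  # apply per cropped PIL image during training
```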