Question about pretraining in stereo task

ucbdrive / hd3

Code for Hierarchical Discrete Distribution Decomposition for Match Density Estimation (CVPR 2019)

BSD 3-Clause "New" or "Revised" License

Question about pretraining in stereo task #37

Open wmn931201 opened 3 years ago

wmn931201 commented 3 years ago

Thanks for your wonderful work! One thing that puzzles me is why you don't use the full SceneFlow dataset for pretraining in the stereo task. SceneFlow has more data than the FlyingThings3D subset, and it also includes the Monkaa and Driving subsets. In theory, pretraining on the full SceneFlow dataset should generalize better.

Thanks~

yzcjtr commented 3 years ago

Hi, a pretty good question! I agree that pretraining on the SceneFlow dataset can be rewarding. When we did this work, we found that many contemporary stereo works actually used different training data (this information is often not in the paper, only in the implementation), which makes the comparison not quite fair. Therefore, we ended up using the minimal data and found that the numbers already looked good, so we didn't bother to use additional data. As a trade-off, the provided pretrained model might not generalize very well. If you have a better model, feel free to contribute :)

wmn931201 commented 3 years ago

Hi, I've tried to pretrain on the SceneFlow dataset, but the result is very strange: the output is all zeros. Someone asked a similar question in a previous issue, but he or she used only the FlyingThings3D part of SceneFlow. When reading images, he or she used `Image.open(filename).convert('RGB')`, and I read images the same way, because some images in SceneFlow are in RGBA format, which has 4 channels. His or her final conclusion was that there is something wrong in the data loader. I haven't found the reason yet and am still digging through the code.

Thank you very much!

yzcjtr commented 3 years ago

Cool. Can you reference the issue here? Not sure which one exactly you are referring to.

As for the data loader, I didn't find any problems when using the datasets I mentioned in the readme file. I think you do need to watch out for it when you handle a new dataset such as SceneFlow. Maybe you could preprocess all the images to get rid of the RGBA format? I would suggest running a few examples and making sure the input images and labels given by the data loader all look good. TensorBoard would also be helpful for checking this.
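A check along these lines could be sketched roughly as follows. This is a minimal sketch, not code from the hd3 repository: `check_stereo_sample` is a hypothetical helper, and it assumes the images arrive as H x W x C NumPy arrays and the disparity label as an H x W or H x W x 1 array.

```python
import numpy as np


def check_stereo_sample(left, right, disp):
    """Return a list of human-readable problems found in one stereo sample.

    Hypothetical helper (not part of hd3). Assumes `left` and `right` are
    H x W x C arrays and `disp` is an H x W or H x W x 1 disparity map.
    """
    problems = []
    for name, img in (("left", left), ("right", right)):
        if img.ndim != 3 or img.shape[-1] != 3:
            # RGBA inputs show up here as a trailing dimension of 4.
            problems.append(f"{name} image has shape {img.shape}, expected H x W x 3")
    if left.shape != right.shape:
        problems.append("left and right images differ in shape")
    disp = np.asarray(disp, dtype=np.float32)
    if disp.shape[:2] != left.shape[:2]:
        problems.append("disparity size does not match the image size")
    if not np.isfinite(disp).all():
        problems.append("disparity contains NaN or Inf values")
    elif (disp < 0).mean() > 0.5:
        problems.append("most disparities are negative; check the sign convention")
    return problems
```

Running this on a handful of samples from a new dataset and printing the returned list should surface channel-count and label-sign mismatches before any training starts.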

wmn931201 commented 3 years ago

The issue is https://github.com/ucbdrive/hd3/issues/24. Yes, I am checking the dataset now. RGBA images make up the majority of the SceneFlow dataset, so they can't be ignored. Thanks!

wmn931201 commented 3 years ago

Hi, I found that the annotation values are negative in FlyingThings3D_subset but positive in the SceneFlow dataset. In your code, labels are read with `disp = np.expand_dims(-read_pfm_file(file_name), axis=-1)`, which makes the disparity positive. So I guess I should use `disp = np.expand_dims(read_pfm_file(file_name), axis=-1)`, dropping the "-" operation, when training on the SceneFlow dataset. Thanks!
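One way to make this explicit is to treat the sign as a per-dataset setting rather than hard-coding the negation. The sketch below is only an illustration: `to_positive_disparity` and its `negate` flag are hypothetical names, and the input array stands in for whatever `read_pfm_file` returns. Whether a given SceneFlow download stores positive or negative values should be verified on a few files first.

```python
import numpy as np


def to_positive_disparity(raw, negate):
    """Convert a raw H x W disparity array to a positive H x W x 1 label.

    Hypothetical helper. `raw` stands in for the output of read_pfm_file;
    `negate` is True for data stored with negative values and False for
    data that is already positive.
    """
    raw = np.asarray(raw, dtype=np.float32)
    disp = -raw if negate else raw
    # Fail loudly when the flag does not match the data, instead of
    # silently training on labels with the wrong sign.
    if (disp < 0).mean() > 0.5:
        raise ValueError("disparity is mostly negative; check the `negate` flag")
    return np.expand_dims(disp, axis=-1)
```

With this shape, a wrong flag for a dataset raises an error on the first sample instead of silently training on labels with the wrong sign.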