Closed meijie0401 closed 3 years ago
Hi,
The SfM data was computed using the MATLAB scripts from CMR, which you can find in that repo. You'll also need to infer the masks using Mask R-CNN (they have a script for processing the detections).
Just a quick tip: to my knowledge, the number of images in the airplane class of P3D+ is pretty small, so you might want to add extra images from ImageNet or a similar dataset if you want to get better results.
Thanks for the reply! Following your instructions, I downloaded the data directly from CMR. The number of entries in each file is as follows:

p3d_sfm_image/img_anno/car_val.mat 219
p3d_sfm_image/img_anno/aeroplane_kps.mat 8
p3d_sfm_image/img_anno/aeroplane_val.mat 199
p3d_sfm_image/img_anno/car_train.mat 4975
p3d_sfm_image/img_anno/aeroplane_all.mat 1339
p3d_sfm_image/img_anno/car_kps.mat 12
p3d_sfm_image/img_anno/aeroplane_train.mat 1140
p3d_sfm_image/img_anno/car_all.mat 5194
p3d_sfm_image/sfm_anno/car_val.mat 219
p3d_sfm_image/sfm_anno/aeroplane_val.mat 199
p3d_sfm_image/sfm_anno/car_train.mat 4975
p3d_sfm_image/sfm_anno/aeroplane_all.mat 1339
p3d_sfm_image/sfm_anno/aeroplane_train.mat 1140
p3d_sfm_image/sfm_anno/car_all.mat 5194
I believe their 'img_anno' folder corresponds to your 'data' folder, and their 'sfm_anno' folder to your 'sfm' folder. However, in your 'sfm' folder, 'car_train.mat' has 4972 annotations and 'car_val.mat' has 218, whereas the numbers above are 4975 and 219 respectively. Why is there a difference? Did you remove 3 training images and 1 validation image from the CMR data, or did you select 4972 training images and 218 validation images independently of CMR?
When I worked on this project, the only class that was released publicly in CMR was the bird class. For Pascal3D+, I had to run their MATLAB scripts from scratch and prepare the mask segmentations using Mask R-CNN. Looks like they released the other classes in November (a couple months after we uploaded our paper), but I think they are produced using the same script.
Their script performs structure-from-motion (an optimization algorithm that might lead to slightly different results between runs), and it has a couple of checks that discard images whose reprojection error exceeds a certain threshold. I think this might explain the difference: running the script can produce a slightly different set of images each time. Anyway, my guess is that this only concerns borderline cases. If you check the intersection between the two datasets, they should be very close except for a few images.
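If it helps, here is a minimal sketch of that intersection check. `compare_splits` is a hypothetical helper that only needs two lists of image names. `load_image_names` is likewise hypothetical and rests on assumptions I have not verified against the released files: that each annotation file stores a struct array under the key `images`, and that each entry has a `rel_path` field holding the image path. Adjust both names to whatever your .mat files actually contain.

```python
def compare_splits(names_a, names_b):
    """Return (shared, only_a, only_b) as sorted lists of image names."""
    set_a, set_b = set(names_a), set(names_b)
    return (
        sorted(set_a & set_b),   # images present in both datasets
        sorted(set_a - set_b),   # images only in the first dataset
        sorted(set_b - set_a),   # images only in the second dataset
    )


def load_image_names(mat_path, key="images", field="rel_path"):
    """Read image names from a .mat annotation file.

    ASSUMPTION: `key` and `field` are guesses at the file layout, not
    something confirmed in this thread; inspect the file and adapt them.
    """
    # Imported lazily so compare_splits can be used without SciPy installed.
    from scipy.io import loadmat

    entries = loadmat(mat_path, struct_as_record=False, squeeze_me=True)[key]
    # squeeze_me collapses a one-element struct array into a single struct.
    if hasattr(entries, field):
        entries = [entries]
    return [str(getattr(entry, field)) for entry in entries]


if __name__ == "__main__":
    # Toy names for illustration only; these are not real dataset entries.
    mine = ["img_001.jpg", "img_002.jpg", "img_003.jpg"]
    theirs = ["img_002.jpg", "img_003.jpg", "img_004.jpg"]
    shared, only_mine, only_theirs = compare_splits(mine, theirs)
    print(len(shared), "shared;", only_mine, "only in mine;", only_theirs, "only in theirs")
```

With real data you would pass the two name lists from `load_image_names` (one per 'car_train.mat') into `compare_splits` and look at the two "only in" lists, which should be short if the mismatch is limited to borderline cases.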
Got it, thanks a lot!
Congratulations on your amazing work! I want to train your mesh reconstruction model on the aeroplane class of Pascal3D+ instead of the car class. Could you please share the code that preprocesses the Pascal3D+ car data to produce everything required for mesh reconstruction training (masks, SfM annotations, etc.), so that I can generate the same data for aeroplanes?