Closed jinjuehui closed 5 years ago
Thank you! I should add some documentation to it.
Both scripts should work, "generate_syn_det_train.py" creates synthetic object views from the 3D models and "detection_utils/generate_sixd_train.py" uses a SIXD training set (e.g. from T-LESS) with single object views on black background. As stated in the paper, for T-LESS we use the primesense training set. Just change the paths inside detection_utils/generate_sixd_train.py to point to the T-LESS and VOC datasets and you should be fine.
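To make the "single object views on black background pasted over VOC images" idea concrete, here is a minimal sketch of that compositing step. The function name and the black-pixel masking rule are illustrative assumptions, not the actual implementation in detection_utils/generate_sixd_train.py:

```python
# Sketch (assumption): paste the non-black pixels of a rendered object view
# onto a random VOC background image of the same size.
# Images are nested lists of (R, G, B) tuples for simplicity.

def composite(obj_img, bg_img, threshold=0):
    """Overlay obj_img onto bg_img, treating (near-)black pixels as mask."""
    out = []
    for obj_row, bg_row in zip(obj_img, bg_img):
        row = []
        for obj_px, bg_px in zip(obj_row, bg_row):
            # Pixels darker than the threshold are considered background
            if max(obj_px) <= threshold:
                row.append(bg_px)
            else:
                row.append(obj_px)
        out.append(row)
    return out

# 2x2 toy example: the object occupies only the top-left pixel
obj = [[(200, 10, 10), (0, 0, 0)],
       [(0, 0, 0),     (0, 0, 0)]]
bg  = [[(1, 2, 3), (4, 5, 6)],
       [(7, 8, 9), (10, 11, 12)]]
result = composite(obj, bg)
```

In practice this would be done with NumPy masks on real image arrays; the loop form here is just to show the logic.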
@MartinSmeyer What should I do to train a detector on LineMOD? I tried detection_utils/generate_sixd_train.py using the "train" images with single object views on black background from the SIXD challenge, and then trained a Yolov3 detector; however, it failed to detect objects in real images from the test set.
Do I need to make additional changes to prepare the training images for LINEMOD?
Yes, for LineMOD you additionally have to render views from the provided models using generate_syn_det_train.py with random light sources, reflectance properties (Phong model), and color augmentations. We also froze the first layers of the detector so that it does not overfit to the synthetic data. You can try that with Yolov3 too.
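As a toy illustration of the color augmentations mentioned above, here is a brightness/contrast jitter on RGB pixels. The exact augmentations and parameter ranges used in the repo are assumptions here:

```python
import random

# Sketch (assumption): random brightness/contrast jitter applied to rendered
# views. Ranges (-30..30 brightness, 0.8..1.2 contrast) are invented examples.

def jitter_pixel(px, brightness, contrast):
    """Apply contrast (around gray 128) then brightness, clamped to [0, 255]."""
    return tuple(
        max(0, min(255, int((c - 128) * contrast + 128 + brightness)))
        for c in px
    )

def augment(img, rng=random):
    """Jitter a whole image (nested lists of RGB tuples) with one random draw."""
    brightness = rng.uniform(-30, 30)
    contrast = rng.uniform(0.8, 1.2)
    return [[jitter_pixel(px, brightness, contrast) for px in row] for row in img]
```

Real pipelines would also randomize hue/saturation and do this on NumPy arrays, but the clamped affine transform is the core idea.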
@MartinSmeyer Could you share the script arguments for generating the syn_det data for LineMOD? BTW: how many layers did you freeze when training the detectors?
I assume that it is possible to successfully train a detector without using real data provided by LineMOD, right?
Yes, it is! Of course, the results are weaker than when training with real images. For more insights and a quantitative evaluation of object detection with synthetic data, I recommend e.g. this paper: https://arxiv.org/pdf/1902.03334v1.pdf
I checked, and for RetinaNet we actually froze the whole backbone; there is an option for it in the open-source code. For the dataset generation, to be honest, I don't have all the parameters we used for the final training, since we changed the code quite frequently and it was not the focus of the paper. But please pull the latest version and try the defaults with this command:
python generate_syn_det_train.py \
--output_path=/path/to/det_train_data \
--model=/path/to/LineMOD/models/ \
--num=60000 \
--scale=1 \
--vocpath=/path/to/VOCdevkit/VOC2012/JPEGImages \
--model_type=reconst
Exclude the bowl and the cup from the training models.
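Regarding freezing the whole backbone for RetinaNet: in a real framework you would set `layer.trainable = False` (Keras) or `param.requires_grad = False` (PyTorch) on the backbone layers. A framework-free toy version of that bookkeeping, with made-up layer names, looks like this:

```python
# Toy illustration (assumption): a "model" as a list of named layers with
# trainable flags. Freezing the backbone means marking every backbone layer
# as non-trainable so only the detection heads are updated during training.

layers = [
    {"name": "backbone/conv1", "trainable": True},
    {"name": "backbone/conv2", "trainable": True},
    {"name": "head/classification", "trainable": True},
    {"name": "head/regression", "trainable": True},
]

def freeze_backbone(layers, prefix="backbone/"):
    """Mark all layers whose name starts with prefix as non-trainable."""
    for layer in layers:
        if layer["name"].startswith(prefix):
            layer["trainable"] = False
    return layers

freeze_backbone(layers)
frozen = [l["name"] for l in layers if not l["trainable"]]
```

The point is that the pretrained feature extractor stays fixed on real-image statistics while only the heads adapt to the synthetic data.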
@MartinSmeyer Another question, for training the detector of LineMOD, do you use the data generated by both generate_syn_det_train.py
and generate_sixd_train.py
or only generate_syn_det_train.py
?
Additionally, i.e. the data generated by both scripts combined. Also, for training you might need early stopping to avoid overfitting to the synthetic data.
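The early-stopping criterion mentioned above can be sketched as: stop once the validation loss has not improved for `patience` epochs. The loss values and patience below are invented for illustration:

```python
# Sketch of patience-based early stopping on a validation-loss curve.

def early_stop_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                return epoch
    return None

# Loss improves for three epochs, then plateaus -> stop 3 epochs after the best
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74]
stop = early_stop_epoch(losses, patience=3)
```

In Keras this corresponds to the built-in `EarlyStopping` callback with `monitor='val_loss'` and a `patience` argument.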
Thanks a lot for the nice work, but some information is missing on how to generate training data for SSD. I hope the following saves time for others who are also interested in this work:
To generate training data for SSD, use "generate_syn_det_train.py" instead of "detection_utils/generate_sixd_train.py". The command-line configuration looks as follows:
After running it, XML annotations and training images are generated nicely.
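Assuming the generated XML annotations follow the Pascal VOC layout (object name plus bndbox coordinates, which is what SSD training pipelines typically expect), they can be read back with the standard library. The sample XML below is hand-written for illustration, not actual script output:

```python
import xml.etree.ElementTree as ET

# Sketch (assumption): parse one Pascal-VOC-style annotation file into
# (class name, (xmin, ymin, xmax, ymax)) tuples.

sample = """
<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>obj_05</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>120</xmax><ymax>96</ymax></bndbox>
  </object>
</annotation>
"""

def parse_voc(xml_text):
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            tuple(int(bb.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax")),
        ))
    return boxes

boxes = parse_voc(sample)
```

This is handy for sanity-checking the generated data (e.g. verifying that every image has at least one box) before starting a long training run.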