UMassLowell-Vision-Group / datasets

Generating Large Scale Image Datasets from 3D CAD Models
BSD 2-Clause "Simplified" License

Source code for dataset generation #1

Open escorciav opened 9 years ago

escorciav commented 9 years ago

Hi,

According to the CVPR15 paper:

we will also provide a software library for dataset generation so that the computer vision community can easily extend or modify the datasets accordingly

When do you plan to release the source code or the library for dataset generation? Could you give us a timeline?

baochens commented 9 years ago

Sorry for the late reply. The source code for Dataset One is available here: https://github.com/UMassLowell-Vision-Group/From-Virtual-to-Reality, under the 'code' folder. The source code for Dataset Two is similar to Dataset One's and should be available in a public repo soon.

baochens commented 9 years ago

More detailed comments on the source code and how to reproduce/regenerate the dataset:

Please go to the GitHub repo (click the 'View on GitHub' button above) for the source code, 3D models, datasets, etc. (URL: http://vision.cs.uml.edu/From-Virtual-to-Reality/)

The source code is in the 'code' folder; render.ms is the rendering script. All the rendered images (with annotations) are in the 'virtual' and 'virtual_gray' folders, and the 3D models are in the '3d_models' folder. Briefly, the steps are:

  1. Download the 3D models
  2. Run render.ms in 3ds Max (the software we used to generate the dataset)

There are also some comments in render.ms that might help you understand the code.
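
For a rough picture of what such a rendering script does, here is a minimal MAXScript sketch. This is not the actual render.ms: the model path, output path, image size, and 30-degree angle step are placeholders.

```maxscript
-- Minimal sketch of a viewpoint-sweep render loop; NOT the actual render.ms.
-- Model path, output path, image size, and angle step are placeholders.
resetMaxFile #noPrompt
importFile "C:\\3d_models\\aeroplane\\model_01.3ds" #noPrompt

-- fixed camera looking at the origin, where the imported model sits
cam = targetcamera pos:[150, 0, 50] target:(targetobject pos:[0, 0, 0])

for i = 1 to 12 do
(
    -- spin every imported mesh 30 degrees about the vertical axis
    for o in (geometry as array) do rotate o (angleaxis 30 [0, 0, 1])
    render camera:cam outputwidth:256 outputheight:256 outputfile:("C:\\virtual\\img_" + (i as string) + ".png") vfb:false
)
```

The actual render.ms additionally handles lighting, backgrounds/textures, and writing out the annotations mentioned above.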

To generate more realistic images (as in the 'What Do Deep CNNs Learn About Objects' paper), you may need to specify a different background and texture for each category/3D model. To do so, change 'images_bg' and 'images_texture' in render.ms to point to the backgrounds and textures for each category/3D model, as sketched below.
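
For example, a hypothetical per-category setup might look like the following. Only the names images_bg and images_texture come from the comment above; the directory layout, file types, and the environment-map idea are invented for illustration.

```maxscript
-- Hypothetical per-category background/texture paths; only the variable names
-- images_bg and images_texture come from render.ms, the rest is made up.
category = "car"  -- set per category / 3D model
images_bg = "C:\\backgrounds\\" + category + "\\"
images_texture = "C:\\textures\\" + category + "\\"

-- e.g. pick a random background from that folder and use it as the
-- environment map (assumes the folder contains at least one .jpg)
bg_files = getFiles (images_bg + "*.jpg")
environmentMap = Bitmaptexture fileName:bg_files[random 1 bg_files.count]
useEnvironmentMap = true
```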

For now, we did not implement full photorealistic rendering, since it is more complicated (e.g. it might require ray-tracing algorithms) and takes far more time (e.g. hours to render one image). However, as shown in the 'From-Virtual-to-Reality' paper, simple domain adaptation techniques can achieve the same performance as training classifiers with real images (e.g. images from ImageNet).

escorciav commented 9 years ago

Thank you for the details :+1: . I will read the paper carefully; it looks pretty interesting. Unfortunately, I won't be at ICCV. I hope my advisor can reach you and talk with you about your work.

baochens commented 9 years ago

Thanks. Unfortunately, I won't be at ICCV either. However, my advisor Kate will be there to present our paper 'Learning Deep Object Detectors from 3D Models' (an extended version of the 'What Do Deep CNNs Learn About Objects' paper), which used Dataset Two (the more photorealistic one, with real textures/backgrounds). I hope they can talk during ICCV. Thanks again for your interest in our work.