zubair-irshad / shapo

PyTorch code for the ECCV'22 paper "ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization"

Creating custom dataset for shapo #11

Open alihankeleser opened 1 year ago

alihankeleser commented 1 year ago

Hello,

First of all, thanks for your contribution!

I am trying to create, from scratch, a custom dataset of objects which are NOT present in the ShapeNet database. To do that, I am trying to imitate the dataset structure that you shared. I am using BlenderProc to create the synthetic data. I have also downloaded the dataset you used for training and analysed it.

I have several questions regarding the creation of the different files in the dataset:

  1. How to create a depth image exactly like the CAMERA dataset's (e.g. CAMERA/train/0000_depth.png), so that it looks like this:

(attached: example CAMERA depth image)

Is there any script in the repository to create this depth image? I did not understand how the depth information is encoded in an RGB image.

Also, how to create a depth image exactly like those in the camera_full_depth folder (e.g. camera_full_depth/train/0000/0000_composed.png):

(attached: example composed depth image)

  2. How to create bbox.txt for each object file in obj_models?

  3. How to generate camera_train.pkl and camera_val.pkl in obj_models?

  4. Why is mug_meta.pkl present in the obj_models folder?

  5. How to create norm.txt and norm_vertices.txt in obj_models/real_train?

  6. How to generate the 'Results' folder and all the files in it?

  7. In the 'sdf_rgb_pretrained' folder, how to generate the 'Latent Codes' and all_train_ids.json inside it?

  8. How to generate the .pkl files in the 'gts' folder?

  9. Do we need to store the 6D poses of all the objects in the scene somewhere, as annotations for all the images?

Thank you in advance for your answers.

zubair-irshad commented 1 year ago

Thanks @alihankeleser for your interest in our work. I can help out as much as I can (please see my responses below), but I would ask you to refer to the original NOCS repo and/or create an issue on the NOCS GitHub, as we only use their provided data and do not create any data from scratch ourselves. Having said this, here are my answers:

  1. Please ask in the original NOCS repo (see above) for this question. To train ShAPO, you do not need object-centric depth but rather scene depth. Please see my answers in this thread https://github.com/zubair-irshad/CenterSnap/issues/17, where I also attached a sample depth image of the kind you need to train ShAPO. That image is exactly what we use from camera_full_depths, although color-coded for visualization. I think you can get this from any rendering software, i.e. by just saving the whole scene depth (see the depth-saving sketch after this list).

  2. bbox.txt stores the size information for the CAD models, i.e. the canonical aspect ratio of each CAD model. I don't think you need this information explicitly. All you need to train ShAPO are supervised poses, i.e. R, T, s (rotation matrix, translation vector, and scale). You could save the object poses and the relevant RGB + depth in this datapoint format for any custom dataset, and not have to follow the exact conventions NOCS used to save their data (see the annotation sketch after this list).

  3. We use this script to get the camera_train.pkl and camera_val.pkl files. Note that this requires access to the CAD models.

  4. Please refer to this script. The mug CAD models in ShapeNet are not centered at the origin and are slightly shifted; mug_meta.pkl is saved in advance to account for that. No other category needs this file.

  5. Please see my answer to 2.

  6. The Results folder contains ground-truth poses (please see my answer to 2.) and we use it here. You don't need it if you have access to the 6D poses (i.e. R, T, s) in any other form.

  7. We don't release the shape and texture pretraining part of our codebase. Please see this issue https://github.com/zubair-irshad/shapo/issues/9. If you are interested, I can help you reproduce this part as much as possible (do you mind creating a separate thread/issue for it?).

  8. This is a question for the NOCS authors, but again, these files store ground-truth poses and are only needed for evaluation. You can see how they are used here.

  9. Correct, please see my response to 2.
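Regarding 1., here is a minimal sketch of saving full-scene depth from a renderer such as BlenderProc. It assumes the common convention of a single-channel 16-bit PNG holding depth in millimeters; verify against a sample from camera_full_depths before relying on it, since the exact encoding is not confirmed here.

```python
import numpy as np
import cv2

def save_scene_depth(depth_m: np.ndarray, out_path: str) -> None:
    """Save a metric depth map (float32 meters, HxW) as a 16-bit PNG in mm.

    Assumption: camera_full_depths-style images use a millimeter/16-bit
    encoding; check a provided sample to confirm before training.
    """
    depth_mm = np.round(depth_m * 1000.0).astype(np.uint16)
    cv2.imwrite(out_path, depth_mm)

def load_scene_depth(path: str) -> np.ndarray:
    """Read the 16-bit PNG back into float32 meters."""
    depth_mm = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    return depth_mm.astype(np.float32) / 1000.0
```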
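Regarding 2., this is a sketch of one possible per-image annotation record storing R, T, s for each object. The field names and file layout are purely illustrative assumptions, not the exact keys ShAPO's data loaders expect; adapt them to whatever format you feed your training pipeline.

```python
import pickle
import numpy as np

# Hypothetical per-image annotation for a custom dataset: one entry per
# object with its ground-truth pose (R, T, s). Keys are illustrative only.
annotation = {
    'image_path': 'train/0000/0000_color.png',
    'depth_path': 'train/0000/0000_composed.png',
    'objects': [
        {
            'class_id': 3,
            'rotation': np.eye(3),       # 3x3 rotation matrix R
            'translation': np.zeros(3),  # translation vector T (meters)
            'scale': 1.0,                # scale s of the canonical model
        },
    ],
}

with open('train/0000/0000_label.pkl', 'wb') as f:
    pickle.dump(annotation, f)
```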

Hope it helps!

alihankeleser commented 1 year ago

Hi @zubair-irshad,

This definitely helps.

After going through the code, I found out some things which might be helpful for others:

  1. 'depth.png' from the original CAMERA dataset is not needed at all; only the camera_full_depths folder is required, as you mentioned. I just needed to delete the following lines of code in generate_training_data.py: (two screenshots of the deleted lines were attached here)

  2. We could generate camera_train.pkl and camera_val.pkl from the script you shared, given access to the CAD models. However, we got an error because some of the .obj files had quad meshes, which we had to convert into triangular meshes in Blender before generating those .pkl files (a trimesh-based alternative is sketched below).
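For anyone who would rather avoid the manual step in Blender, here is a minimal sketch of batch-triangulating quad .obj meshes with trimesh. This is an assumed alternative to the Blender route above, not part of the ShAPO codebase, and the glob path is a placeholder.

```python
import glob
import trimesh

# Load each .obj and re-export it; trimesh triangulates quad faces on load,
# so the exported mesh contains only triangles.
for path in glob.glob('obj_models/**/*.obj', recursive=True):
    mesh = trimesh.load(path, force='mesh')
    mesh.export(path)  # overwrite in place with the triangulated mesh
```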

I am creating another issue on how to create the 'sdf_rgb_pretrained' folder and how to pretrain the latent-code network with the custom data, as you mentioned. I cannot proceed to train the network without that data.

Thanks a lot for your answers and support!

zubair-irshad commented 1 year ago

Great to know that you were able to get things working. Do you mind creating a pull request for points 1 and 2? It would help the community if we mention this in the readme as well.

Thanks!

Trulli99 commented 1 year ago

> (quoting @alihankeleser's comment above)

Was your error with the meshes in the load_obj function? Like this:

```
Traceback (most recent call last):
  File "shape_data.py", line 214, in <module>
    save_nocs_model_to_file(obj_model_dir)
  File "shape_data.py", line 39, in save_nocs_model_to_file
    model_points = sample_points_from_mesh(path_to_mesh_model, 1024, fps=True, ratio=3)
  File "D:\shapo\Nova pasta\utils.py", line 149, in sample_points_from_mesh
    vertices, faces = load_obj(path)
  File "D:\shapo\Nova pasta\utils.py", line 53, in load_obj
    face = [int(idx.split('/')[0])-1 for idx in face]
  File "D:\shapo\Nova pasta\utils.py", line 53, in <listcomp>
    face = [int(idx.split('/')[0])-1 for idx in face]
ValueError: invalid literal for int() with base 10: 'ormat'
```

Thank you!

zubair-irshad commented 1 year ago

Hi @Trulli99, glad you found the answers here helpful. The load_obj script throwing that error likely means that some of your meshes are malformed or not watertight. You can try to make them watertight using this or any other open-source code, or try loading the objs using trimesh etc. (see the sketch below).
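As a concrete illustration of the trimesh suggestion, here is a minimal sketch that loads an .obj robustly and checks watertightness. trimesh tolerates quad faces and header lines that a hand-rolled parser may choke on (the 'ormat' error above looks like a non-face line being parsed as face indices). The file path is a placeholder.

```python
import numpy as np
import trimesh

# Load the mesh; trimesh parses .obj robustly and triangulates quads,
# avoiding the face-index parsing that raised the ValueError above.
mesh = trimesh.load('obj_models/train/model.obj', force='mesh')

if not mesh.is_watertight:
    # Simple hole filling; heavily damaged meshes may need a dedicated
    # repair tool instead.
    trimesh.repair.fill_holes(mesh)

vertices = np.asarray(mesh.vertices)  # (V, 3) float
faces = np.asarray(mesh.faces)        # (F, 3) int, triangles only
```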

peng25zhang commented 1 year ago

@Trulli99 Hi, do you know how to create norm.txt? I want to know how to calculate the scale_factors. Please help.

zubair-irshad commented 1 year ago

@peng25zhang if you have the CAD models corresponding to the real-world images, the scale factors can be calculated from the bounds of the CAD model or point cloud, like this, where bbox_dims is the tight bounding box of the real-world CAD model or point cloud, i.e. the W, H, L values that determine the 3-dimensional extent of the shape. (A minimal sketch follows.)
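A minimal sketch of that calculation, assuming the scale factor is the diagonal length of the tight bounding box (a common NOCS-style normalization convention; verify against a provided norm.txt value before relying on it):

```python
import numpy as np
import trimesh

def scale_factor(path: str) -> float:
    """Scale factor from the tight bounding box of a CAD model.

    Assumption: the factor is the norm of bbox_dims (W, H, L), which
    normalizes the shape to a unit diagonal; check against a known
    norm.txt value to confirm the convention.
    """
    mesh = trimesh.load(path, force='mesh')
    bbox_dims = mesh.bounds[1] - mesh.bounds[0]  # tight W, H, L extents
    return float(np.linalg.norm(bbox_dims))
```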