hughw19 / NOCS_CVPR2019

[CVPR2019 Oral] Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation on Python3, Tensorflow, and Keras
https://geometry.stanford.edu/projects/NOCS_CVPR2019/

How can I generate a custom dataset? #36

Closed Patlon closed 3 years ago

Patlon commented 3 years ago

I can get the color image, mask image, and depth image, but how can I generate the coord image? And what is the meaning of *_meta.txt? For example, val/0000/0000_meta.txt:

```
1 5 03642806 fdec2b8af5dd988cef56c22fd326c67
2 5 03642806 fae8d4929159d0df7d14ad32b7473fd2
3 1 02876657 e4915635a488cbfc4c3a35cee92bb95b
4 3 02942699 ee58b922bd93d01be4f112f1b3124b84
5 0 02992529 e3e43df4a3fc1870d4d7de9e3c2bb6f4
6 0 00000000 83ab18386e87bf4efef598dabc93c115
7 5 03642806 e9e28a11f71337fc201115f39f20d1ff
8 0 02992529 f75e6866c8965374f1c81a6df84170c9
9 1 02876657 ffa6c49aa8f7ec19971e7f8dbfabf375
10 0 00000000 fffc2fc5a90254de9e317a6cc797b629
11 0 02954340 f40b47fcbf83b962f0d11ae402ef940e
12 0 03211117 93e260150e7788ab574b5684eaec43e9
13 0 00000000 81bd0c7a35a147988cc3ae4061da3bb0
14 6 03797390 fd1f9e8add1fcbd123c841f9d5051936
15 0 02992529 df23ff3151aa1f0a3cb2f20e21cb06ff
```

Is this correct: first number = instance counter, second number = object class id, third number = ShapeNet synset id, fourth "number" = ShapeNet object id? And why are there 16 lines when there are only four objects in the image?
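(For concreteness, a minimal parsing sketch under that interpretation; the field names are my own labels, and reading class id 0 as a distractor marker is a guess, not something confirmed by the repo.)

```python
# Hypothetical parser for one line of *_meta.txt, assuming the format
# "instance_id class_id shapenet_synset_id shapenet_model_id".
def parse_meta_line(line):
    instance_id, class_id, synset_id, model_id = line.split()
    return {
        "instance_id": int(instance_id),  # per-image instance counter
        "class_id": int(class_id),        # category id (0 seems to mean distractor)
        "synset_id": synset_id,           # ShapeNet synset, e.g. "03642806"
        "model_id": model_id,             # ShapeNet model hash
    }

with open("val/0000/0000_meta.txt") as f:
    entries = [parse_meta_line(l) for l in f if l.strip()]
```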

What is bbox.txt in the object folders? (I am guessing these are the object dimensions: distances from the object's center point, measured in centimeters?)

And what is stored in the .pkl file?

Could you please tell me how to generate a custom dataset for training? Which files are necessary, and how should they be organized? Thank you very much.

hughw19 commented 3 years ago

Hi Patlon,

The meta.txt contains some raw information from the scene generation stage. As you can see, it lists more objects than actually appear in the scene. This is because we randomly generate an object and check for collisions with the objects already on the table; we only keep the object if there is no collision. However, every object tried during this process gets logged to the file. Our dataloader handles this correctly, and you can take a closer look at it to understand how we do it.
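As a rough illustration of that filtering idea (a sketch only, not the repo's actual dataloader; it assumes mask pixels store per-image instance ids, with a sentinel value such as 255 for background):

```python
import numpy as np
from PIL import Image

# Sketch: keep only meta entries whose instance id actually appears
# as a pixel value in the rendered instance mask.
def filter_meta(entries, mask_path):
    mask = np.array(Image.open(mask_path))
    present = set(np.unique(mask).tolist())
    return [e for e in entries if e["instance_id"] in present]
```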

You are correct about bbox.txt.

Roughly speaking, you need to detect the plane in the depth image, use it as the table plane, and then randomly select objects and place them on the table. You can use Blender to render the NOCS map and segmentation mask. Unfortunately, I don't have Blender code to do this, but I know it is doable since someone else has done similar NOCS map rendering in Blender.
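For the plane-detection step, RANSAC plane segmentation on the back-projected depth is one common approach. A minimal sketch using Open3D (my choice of library, not something the repo itself uses; intrinsics fx, fy, cx, cy and metric depth are assumptions):

```python
import numpy as np
import open3d as o3d

# Sketch: back-project a depth image with pinhole intrinsics and fit the
# dominant plane with RANSAC; assumes depth is a float array in meters.
def find_table_plane(depth, fx, fy, cx, cy):
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    pts = pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pts)
    plane_model, inliers = pcd.segment_plane(
        distance_threshold=0.01, ransac_n=3, num_iterations=1000)
    return plane_model  # (a, b, c, d) with ax + by + cz + d = 0
```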

Best, He

hughw19 commented 3 years ago

I just added an example code for using Blender to render a NOCS map. You may want to take a look.

Patlon commented 3 years ago


Thanks for your reply. Does the scale in your example code affect the NOCS map? If I scale the model in my scene, should I modify the corresponding scale in your example code? I also noticed that in your dataset the depth map is stored as a color image. How is it encoded, and do different encodings affect the results? Thanks.

hughw19 commented 3 years ago

NOCS always needs to be normalized. You can first normalize your model, compute the vcolor layer, and then load the normalized model and transform it into your scene; the GT pose is that transformation. In my example code, I didn't normalize the model but only computed the normalization coefficient before computing the vcolor layer, which is not recommended.
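To make that recommended order concrete, here is a small sketch of the normalization step, assuming the NOCS convention of scaling the tight bounding box so its diagonal has length 1 (the helper name is mine):

```python
import numpy as np

# Sketch: center the tight bounding box at the origin and scale so its
# diagonal is 1, putting coordinates in [-0.5, 0.5]; adding 0.5 then maps
# them into the [0, 1] range used for vertex colors.
def normalize_to_nocs(vertices):
    vmin, vmax = vertices.min(axis=0), vertices.max(axis=0)
    center = (vmin + vmax) / 2.0
    scale = np.linalg.norm(vmax - vmin)  # bbox diagonal length
    nocs = (vertices - center) / scale
    colors = nocs + 0.5
    return nocs, colors
```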

The input to my network is RGB images, not depth. Depth is only used for pose fitting. If you want to process depth only, you can refer to my work on category-level articulated object pose estimation.
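For the pose-fitting step itself, the standard tool is a similarity-transform (Umeyama-style) fit between predicted NOCS coordinates and back-projected depth points. A sketch of that idea (my own implementation, not the repo's alignment code):

```python
import numpy as np

# Sketch: estimate scale s, rotation R, translation t such that
# s * R @ nocs_pts[i] + t ~= depth_pts[i] for corresponding (N, 3) points.
def fit_similarity_transform(nocs_pts, depth_pts):
    mu_n, mu_d = nocs_pts.mean(axis=0), depth_pts.mean(axis=0)
    n_c, d_c = nocs_pts - mu_n, depth_pts - mu_d
    cov = d_c.T @ n_c / len(nocs_pts)       # 3x3 cross-covariance
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0                      # guard against reflections
    R = U @ D @ Vt
    var_n = (n_c ** 2).sum() / len(nocs_pts)
    scale = (S * np.diag(D)).sum() / var_n  # trace(diag(S) @ D) / variance
    t = mu_d - scale * (R @ mu_n)
    return scale, R, t
```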

Best, He


Patlon commented 3 years ago


I use the obj models from ShapeNetCore, so I think I do not need to normalize them again. So I just need to add a material layer whose color at each vertex equals that vertex's coordinate? I think I should write something like color = item.data.vertices[loop_vert_index].co, is that correct? Or do I still need to add the offset vector, as in your example code: color = scale*item.data.vertices[loop_vert_index].co + Vector([0.5, 0.5, 0.5])?
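For reference, a minimal Blender Python sketch of writing NOCS coordinates into a vertex-color layer, assuming the model is already normalized to [-0.5, 0.5]^3 so that adding 0.5 lands in [0, 1] (this uses the Blender 2.8/2.9-era vertex_colors API and is my own illustration, not the repo's example code):

```python
import bpy

# Sketch: bake NOCS coordinates into a vertex-color layer of the active mesh.
obj = bpy.context.active_object
mesh = obj.data
vcol = mesh.vertex_colors.new(name="NOCS")

for poly in mesh.polygons:
    for loop_index in poly.loop_indices:
        vert = mesh.vertices[mesh.loops[loop_index].vertex_index]
        # Shift from [-0.5, 0.5] into the [0, 1] color range.
        r, g, b = vert.co.x + 0.5, vert.co.y + 0.5, vert.co.z + 0.5
        vcol.data[loop_index].color = (r, g, b, 1.0)  # RGBA in Blender 2.8+
```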

Patlon commented 3 years ago


I have been reading your code recently and it has helped me a lot. I have a question about the depth map: in dataset.py, what is the meaning of the code below in the load_depth function?

```python
# This is encoded depth image, let's convert
depth16 = np.uint16(depth[:, :, 1]*256) + np.uint16(depth[:, :, 2])
```

So what do the pixel values of the depth maps in your dataset mean, and what encoding produces the image we see (0000_depth)?

hughw19 commented 3 years ago

This is the standard way of using the last two channels of an RGB image to store a 16-bit depth value.

I think the code itself makes the encoding clear.
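For completeness, a sketch of the round trip implied by that decode line (the encoder is my own illustration of the scheme, not code from the repo):

```python
import numpy as np

# Encode a 16-bit depth map into the G and B channels of an 8-bit RGB image.
def encode_depth(depth16):
    rgb = np.zeros((*depth16.shape, 3), dtype=np.uint8)
    rgb[:, :, 1] = (depth16 >> 8).astype(np.uint8)    # high byte -> G
    rgb[:, :, 2] = (depth16 & 0xFF).astype(np.uint8)  # low byte  -> B
    return rgb

# Decode back, matching the line in dataset.py: depth16 = G * 256 + B.
def decode_depth(rgb):
    return np.uint16(rgb[:, :, 1]) * 256 + np.uint16(rgb[:, :, 2])
```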

Best, He


peng25zhang commented 1 year ago

@Patlon Hi, did you succeed in training on your custom data?