Closed Trav1slaflame closed 9 months ago
Hi @Trav1slaflame. Thank you very much for your interest in our work.
For your first issue, have you tried the pc_normalize function defined in AffordanceNet.py? I believe that objects in the AffordanceNet dataset are normalized using that function, so you do not need to manually choose the scaling factor. I hope this function helps with the performance on novel real-world objects.
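For reference, normalization in PointNet-style code bases usually centers the cloud and scales it into the unit sphere; below is a minimal sketch of that kind of function (the exact body of pc_normalize in AffordanceNet.py may differ):

```python
import numpy as np

def pc_normalize(pc: np.ndarray) -> np.ndarray:
    """Center an (N, 3) point cloud at the origin and scale it into the unit sphere."""
    centroid = pc.mean(axis=0)                      # per-axis mean of the cloud
    pc = pc - centroid                              # move the centroid to the origin
    scale = np.max(np.sqrt((pc ** 2).sum(axis=1)))  # distance of the farthest point
    return pc / scale                               # every point now lies within radius 1
```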
Regarding the second issue, I would say that due to the current limited number of object categories in the training dataset, our OpenAD may exhibit modest performance in generalizing to novel object categories whose appearances significantly differ from the categories in the training dataset. However, if you select a category that shares similar characteristics with those in the training data, our method will likely yield reasonable detection results.
Please let me know if you have any further questions.
Best regards, Toan.
Hi @toannguyen1904. Thanks for your reply and your time.
I have tried the pc_normalize function defined in AffordanceNet.py on the Mug instance (mug-1) from the ContactDB dataset, and compared the inference result from the OpenAD model with another Mug (mug-2) from your training dataset's val split, full_shape_val_data.pkl:
mug-1: red part stands for class 'grasp', deep blue part stands for class 'none'
Fig.1. 2048 points randomly sampled from the object mesh model
Fig.2. 2048 points sampled from the object mesh model using the same farthest_point_sample function as 3D AffordanceNet
mug-2: red part stands for class 'grasp', deep blue part stands for class 'wrap-grasp', and light green part stands for class 'contain'
Fig.3. Mug point cloud from full_shape_val_data.pkl. Shape ID: c34718bd10e378186c6c61abcbd83e5a
And I also kept the same scale between mug-1 and mug-2:
mug-1:
Fig.1. OrientedBoundingBox: center: (0.149582, 0.0641993, 0.0535746), extent: (1.53188, 1.23199, 1.59249)
Fig.2. OrientedBoundingBox: center: (0.0672095, -0.0767971, 0.0501214), extent: (1.51851, 1.4498, 1.22958)
mug-2:
OrientedBoundingBox: center: (0.0420405, 0.0928881, 0.167135), extent: (1.64985, 1.37659, 1.23699)
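For anyone reproducing this: the center/extent values above can be obtained with Open3D's oriented bounding box. The snippet below is only a hedged sketch; random points stand in for the actual mug clouds:

```python
import numpy as np
import open3d as o3d

# `points` stands in for one of the sampled mug clouds, shape (2048, 3).
points = np.random.rand(2048, 3)

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)

obb = pcd.get_oriented_bounding_box()  # fits a rotated box around the cloud
print("center:", obb.center)           # compare across mug-1 and mug-2
print("extent:", obb.extent)           # side lengths of the box, i.e. the overall scale
```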
For mug-1 and mug-2, I used the same val_affordance labels = ['grasp', 'contain', 'pour', 'wrap_grasp', 'none'] for testing.
Since I selected a category (Mug) that shares similar characteristics with those in the training data, I expected the OpenAD model to yield reasonable detection results. However, for mug-1, I don't think the result is satisfactory. May I ask whether there are any details I overlooked?
Hi @Trav1slaflame, I truly appreciate your detailed response. I noticed that the orientation of mug-1 differs from that of mug-2: the bottom-up axis of mug-2 aligns with the y-axis, while the bottom-up axis of mug-1 aligns with the z-axis. As our method currently does not guarantee rotation invariance (mentioned in this issue), I recommend rotating your mug-1 to match that orientation, which should give a proper result. Additionally, you may consider our recommended solution (also discussed in this issue) to help the model better handle random rotations of objects.
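For illustration only (not from the released code): if the new object's bottom-up axis is +z, a single rotation about the x-axis brings it to the +y-up convention of the training mugs; depending on the mesh, an extra rotation about the new up axis may still be needed.

```python
import numpy as np

def z_up_to_y_up(pc: np.ndarray) -> np.ndarray:
    """Rotate an (N, 3) cloud so that its bottom-up axis moves from +z to +y."""
    rot_x = np.array([[1.0,  0.0, 0.0],
                      [0.0,  0.0, 1.0],
                      [0.0, -1.0, 0.0]])  # -90 degree rotation about the x-axis
    return pc @ rot_x.T                   # maps +z -> +y (and +y -> -z)
```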
Best regards, Toan.
Hi @toannguyen1904, thank you so much for your suggestion. I rotated Mug-1 to the same orientation as Mug-2 (bottom-up axis aligned with the y-axis) and got a proper result. However, for other kinds of objects (e.g. knife, bottle, can, etc.) in the datasets I used [1], I have the following 3 questions:
Thanks for your time and kind help!
Hi @Trav1slaflame,
For your first question: as far as I remember, the objects of the same category in the dataset have a consistent orientation.
For your second and third questions: unfortunately, there is no automatic way to ensure that a rotation aligns your new object with the orientation of the objects in the dataset, or to keep rotations consistent with those in the training data. However, as I suggested in my previous response, you can enrich the training data by randomly rotating each object before feeding it to the network. This will help OpenAD better handle different rotations of objects.
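A rough sketch of that kind of augmentation, assuming a simple random rotation about the object's bottom-up y-axis (a full SO(3) rotation would follow the same pattern):

```python
import numpy as np

def random_rotate_about_y(pc: np.ndarray) -> np.ndarray:
    """Rotate an (N, 3) cloud by a random angle about the y (bottom-up) axis."""
    theta = np.random.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_y = np.array([[  c, 0.0,   s],
                      [0.0, 1.0, 0.0],
                      [ -s, 0.0,   c]])
    return pc @ rot_y.T
```

Calling this on each object in the data loader during training exposes the network to many orientations of the same shape.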
We acknowledge some limitations of OpenAD's current framework. We encourage you to improve it according to your needs, and please feel free to reach out if you need any further help.
Best regards, Toan.
Hi @toannguyen1904,
Thank you so much for your kind reply and I appreciate your help. I will try to solve this issue.
@Trav1slaflame Hi! I'm trying to visualize the prediction results. Could you give me some advice on how to use the model to test on real-world objects, or point me to related projects that can do this? I'll be grateful if you can reply. :)
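In case it helps, one common way to inspect per-point predictions is to color the cloud by predicted class with Open3D. The snippet below is only a sketch with made-up inputs, not code from the OpenAD repository:

```python
import numpy as np
import open3d as o3d

# Hypothetical inputs: `points` is (N, 3) and `pred` holds per-point class ids
# returned by the model; the color table is an arbitrary choice.
points = np.random.rand(2048, 3)
pred = np.random.randint(0, 3, size=2048)

palette = np.array([[0.8, 0.1, 0.1],    # e.g. 'grasp'   -> red
                    [0.1, 0.1, 0.8],    # e.g. 'none'    -> blue
                    [0.1, 0.8, 0.1]])   # e.g. 'contain' -> green

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)
pcd.colors = o3d.utility.Vector3dVector(palette[pred])
o3d.visualization.draw_geometries([pcd])
```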
Hi, thanks for your great work. I'm interested in using the OpenAD model with the PointNet++ backbone in the full-shape setting on the YCB dataset. However, I encountered the following issues:
When we tested the model with real-world-scale object point clouds from the YCB dataset, the segmentation results from OpenAD were all 'none'. So we checked the object scale in the training dataset and found that the object models used for training were all normalized. After normalizing the YCB objects, OpenAD can predict affordance labels other than 'none', but the performance depends heavily on the scaling factor. However, the scaling factor varies between instances, which makes it hard for us to apply OpenAD to other datasets. In this case, may I ask if you have any idea how to normalize a real-world object to a size suitable for the OpenAD model?
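For concreteness, one way to sample and normalize a YCB object before inference is sketched below, assuming Open3D; the mesh path and point counts are placeholders, and the normalization follows the common unit-sphere scheme rather than the exact training code:

```python
import numpy as np
import open3d as o3d

# Rough preprocessing sketch for one YCB object.
mesh = o3d.io.read_triangle_mesh("path/to/ycb_object.obj")
dense = mesh.sample_points_uniformly(number_of_points=20000)
sparse = dense.farthest_point_down_sample(2048)   # needs a recent Open3D release

pc = np.asarray(sparse.points)
pc = pc - pc.mean(axis=0)                         # center at the origin
pc = pc / np.max(np.linalg.norm(pc, axis=1))      # scale into the unit sphere
```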
When we tested OpenAD on round objects such as a ball and an apple, predicting between the 'grasp' and 'none' labels, we found that only part of each object is labeled as 'grasp'. May I ask if it is hard for OpenAD to generalize to novel object categories?