zaiweizhang / H3DNet

MIT License

Questions on implementation details #18

Closed Na-Z closed 3 years ago

Na-Z commented 3 years ago

Hi Zaiwei,

Thanks for sharing your work. I have a few questions on your implementation details, which are not consistent with the paper.

  1. In Section 3.2, you mentioned that 0.2m is used to select positive points lying close to a BB face or BB edge. However, 0.1 is used for the SUN RGB-D dataset in the code: https://github.com/zaiweizhang/H3DNet/blob/e89d092bdf4c2ab1df576e6c4ff12bcbc2d55b2e/sunrgbd/sunrgbd_detection_dataset_hd.py#L40 How do you determine the value of this threshold? And how will it affect the performance?

  2. Also in Section 3.2, you mentioned that

The predicted attributes include a flag that indicates whether a point is close to a BB face or not and if so, an offset vector between that point and its corresponding BB face center.

However, additional surface size is predicted in the code: https://github.com/zaiweizhang/H3DNet/blob/e89d092bdf4c2ab1df576e6c4ff12bcbc2d55b2e/models/proposal_module_surface.py#L27-L34

May I know how it (i.e., adding size loss for surfaces) improves the performance, especially on SUN RGB-D?

  3. When generating ground truths for point boundary offsets, it is possible that one point is close to two BB faces (e.g., the right and front faces). However, the code seems to overwrite the offset even if the point has already been assigned to another face (e.g., L569-575 in sunrgbd_detection_dataset_hd.py). Am I right about this? If so, how much will it affect the performance?

Looking forward to your reply. Thanks.

zaiweizhang commented 3 years ago

Thanks for your interest in our work!

Here are my answers to the questions:

  1. For the first question, the best way to choose the threshold is to visualize the labels on a couple of scenes and then decide on the threshold. The general rule of thumb is that you want the labels to be as dense as possible while not overlapping with too many points on other surfaces or edges.
  2. For the second question, we tried removing it and did not observe much of a performance decrease.
  3. Yes, the offset will be overwritten. This does not affect the performance much as long as you have dense labels on each surface. Even if the label distribution is imbalanced between two faces, say right and front, it still does not affect the performance much if you have dense labels for some face.
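
The thresholding and overwrite behaviour discussed in answers 1 and 3 can be illustrated with a minimal sketch. This is not the H3DNet code: it assumes an axis-aligned box for simplicity, and `label_face_points` and its parameters are hypothetical names chosen for illustration.

```python
def label_face_points(points, box_min, box_max, threshold):
    """For each point, return (face_index, offset_to_face_center) or None.

    Faces are indexed 0..5 as (x-min, x-max, y-min, y-max, z-min, z-max).
    Assumption: the box is axis-aligned, unlike oriented boxes in real data.
    """
    center = [(lo + hi) / 2.0 for lo, hi in zip(box_min, box_max)]
    labels = []
    for p in points:
        label = None
        # Ignore points outside the box grown by the threshold on every side.
        near_box = all(lo - threshold <= c <= hi + threshold
                       for c, lo, hi in zip(p, box_min, box_max))
        if near_box:
            face_index = 0
            for axis in range(3):
                for bound in (box_min[axis], box_max[axis]):
                    if abs(p[axis] - bound) <= threshold:
                        face_center = list(center)
                        face_center[axis] = bound
                        offset = tuple(fc - c for fc, c in zip(face_center, p))
                        # Last write wins: a point close to two faces keeps
                        # only the label of the face visited last.
                        label = (face_index, offset)
                    face_index += 1
        labels.append(label)
    return labels


box_min, box_max = (0.0, 0.0, 0.0), (2.0, 2.0, 2.0)
corner_point = (1.95, 1.95, 1.0)   # close to both the x-max and y-max faces
border_point = (1.85, 1.0, 1.0)    # 0.15 away from the x-max face

tight = label_face_points([corner_point, border_point], box_min, box_max, 0.1)
loose = label_face_points([corner_point, border_point], box_min, box_max, 0.2)
print(tight)
print(loose)
```

With the 0.1 threshold, `border_point` receives no label, while with 0.2 it is assigned to the x-max face, so a larger threshold gives denser labels. In both runs `corner_point` ends up with only the y-max label because the later assignment overwrites the x-max one.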

Hope this answers your question!

Na-Z commented 3 years ago

Thanks for your reply.

I have another question: how are l_f, l_c, and l_o in Eq. 2 implemented in the code (i.e., loss_helper.py)? What do objectness_loss_opt and potential_loss mean?

Besides, there seem to be several errors in the code. Please confirm:

  1. https://github.com/zaiweizhang/H3DNet/blob/e89d092bdf4c2ab1df576e6c4ff12bcbc2d55b2e/models/proposal_module_refine.py#L82 I think it should be [:,:,start+3+num_heading_bin*2+num_size_cluster*4:]. P.S. I don't understand why you use if-else when decoding size and class (i.e., L68-87); the code inside the two branches is the same.
  2. If I understand correctly, in these two pieces of code: https://github.com/zaiweizhang/H3DNet/blob/e89d092bdf4c2ab1df576e6c4ff12bcbc2d55b2e/models/loss_helper.py#L356 https://github.com/zaiweizhang/H3DNet/blob/e89d092bdf4c2ab1df576e6c4ff12bcbc2d55b2e/models/loss_helper.py#L373 'center' should be mode.
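
The slice index proposed in item 1 can be read as the start of the semantic class scores in a per-proposal channel vector. The sketch below shows that arithmetic under an assumed layout (center, heading scores, heading residuals, size scores, size residuals, semantic scores). The layout, the helper name `channel_layout`, and the example sizes are assumptions made for illustration, not confirmed details of proposal_module_refine.py.

```python
def channel_layout(start, num_heading_bin, num_size_cluster, num_class):
    """Map each field name to its slice along the channel dimension.

    Assumed ordering (hypothetical): 3 center values, one score and one
    residual per heading bin, one score and three residuals per size
    cluster, then one score per semantic class.
    """
    fields = [
        ("center", 3),
        ("heading_scores", num_heading_bin),
        ("heading_residuals", num_heading_bin),
        ("size_scores", num_size_cluster),
        ("size_residuals", num_size_cluster * 3),
        ("sem_cls_scores", num_class),
    ]
    layout = {}
    offset = start
    for name, width in fields:
        layout[name] = slice(offset, offset + width)
        offset += width
    return layout


# Illustrative sizes only; `start` stands for whatever channels precede the center.
layout = channel_layout(start=2, num_heading_bin=12, num_size_cluster=10, num_class=10)
print(layout["size_scores"])     # begins at start + 3 + num_heading_bin*2
print(layout["sem_cls_scores"])  # begins at start + 3 + num_heading_bin*2 + num_size_cluster*4
```

Under this assumed layout, a slice that begins right after the heading channels selects the size-cluster scores, whereas the slice suggested above begins at the semantic class scores.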

zaiweizhang commented 3 years ago

l_f is implemented here, l_c is implemented here and called here, and the majority of l_o is implemented here.

For the errors:

  1. This is actually a small trick. I found that optimizing the semantic label on the box type works slightly better. You can change it to the semantic label (as you suggested); I think it should not make much difference. For L68-79, I think you can remove the if-else. I did not fully optimize the code.
  2. For heading and size, we are only refining the angle offset or box size offset. If we also changed the semantic type, it would choose a different starting angle or starting box size each time during optimization, which would make the refinement of the angle offset and box size offset unstable.
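
The stability argument in item 2 can be seen in a small sketch of a bin-plus-residual heading parameterization. It assumes the heading is decoded as a bin center plus a residual, which is a common scheme but is not confirmed by this thread to match loss_helper.py exactly; `decode_heading` is a hypothetical helper.

```python
import math


def decode_heading(bin_index, residual, num_heading_bin):
    """Decode an angle as the chosen bin's center plus a residual offset."""
    bin_width = 2.0 * math.pi / num_heading_bin
    return bin_index * bin_width + residual


num_heading_bin = 12  # illustrative value

# Bin kept fixed: refining the residual moves the angle by the same small amount.
fixed_bin_change = (decode_heading(3, 0.05, num_heading_bin)
                    - decode_heading(3, 0.00, num_heading_bin))

# Bin changes between steps: the starting angle jumps by a full bin width.
bin_switch_change = (decode_heading(4, 0.05, num_heading_bin)
                     - decode_heading(3, 0.05, num_heading_bin))

print(fixed_bin_change, bin_switch_change)
```

When the bin is held fixed, the decoded angle changes only by the residual update. If the bin were re-selected at every step, the same residual would be applied to a different starting angle, which is the instability described above. The same reasoning applies to a size cluster and its size residual.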

Na-Z commented 3 years ago

Got it. Thanks a lot.