chunfeng3364 / LARC


Could you provide any more details on VoteNet settings? #1

Open KESHEN-ZHOU opened 2 months ago

KESHEN-ZHOU commented 2 months ago

Hi there, Thank you for your great work. I have a few questions, as errors occurred when training the model.

The error I encountered currently is as follows:

File "/home/Project/LARC/datasets/referit3d/listening_dataset.py", line 231, in <listcomp>
    anchors = [object_data[i] for i in anchor_ids]
IndexError: list index out of range

I haven't altered any training code, and I think this error basically means the anchor object is not present in the detected object data (generated from VoteNet).
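A minimal repro of what the traceback suggests (values are hypothetical): the anchor ids were assigned against the ground-truth object list, but object_data now holds VoteNet proposals, so an id can point past the end of the list.

```python
# Hypothetical illustration of the failing list comprehension in
# listening_dataset.py: anchor ids that referred to ground-truth objects
# are indexed into a shorter list of detected proposals.
object_data = ["proposal_0", "proposal_1", "proposal_2"]  # 3 detected boxes
anchor_ids = [0, 5]  # id 5 referred to an object with no matching proposal

try:
    anchors = [object_data[i] for i in anchor_ids]
except IndexError as e:
    print(e)  # list index out of range
```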

So I looked back to my VoteNet setup and proposal generation procedures.

Could you provide any more details on VoteNet settings or handling this error?

KESHEN-ZHOU commented 2 months ago

My VoteNet Setting:

I followed the VoteNet guide and used the VoteNet checkpoint to generate the object proposals, and also recalculated the mean_size_arr and type_mean_size for each object class (607 classes in total) in ScanNet, so I could successfully load the checkpoint with the setting self.num_class=607, self.num_size_cluster=607.

The last class, pad, is a padding class and does not occur in the ScanNet instance labels, so it does not have a mean_size_arr entry. The batch_load_scannet_data.py and scannet_detection_dataset.py files remained unchanged.
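For reference, the recomputation described above can be sketched like this. All names and numbers here are illustrative (a stand-in for the 607 ScanNet classes), not the actual LARC or VoteNet code:

```python
import numpy as np

# Hypothetical sketch: recompute mean_size_arr as the per-class mean of
# axis-aligned box sizes (dx, dy, dz) collected over the training scans.
num_class = 4  # stands in for the 607 classes mentioned above
instance_boxes = {
    0: [(1.0, 0.8, 0.5), (1.2, 0.9, 0.6)],
    1: [(0.4, 0.4, 0.9)],
    2: [(2.0, 1.0, 0.8)],
    # class 3 ("pad") has no instances; its row stays all zeros
}

mean_size_arr = np.zeros((num_class, 3))
for cls, sizes in instance_boxes.items():
    mean_size_arr[cls] = np.mean(np.asarray(sizes), axis=0)
```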

When using the VoteNet checkpoint for inference to generate predictions, it does predict some objects:

Loaded point cloud data: /home/Project/dataset/scannetv2/scans/scene0666_00/scene0666_00_vh_clean_2.ply
Inference time: 0.070226
Finished detection. 99 object detected.

But during training, compared with NS3D, which uses the ground truth, there are significant differences (comparing against feed_dict['input_objects_class']).

chunfeng3364 commented 2 months ago

Hi,

For issue 1, the list index out of range error: I think you missed a step to re-id all boxes. After getting all the predicted boxes, we need to update the ground-truth label of every sample, assigning the new ground-truth id to the predicted box with the max IoU score.
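The re-id step described above could be sketched as follows. This is a hedged illustration, not LARC's actual code: boxes are assumed axis-aligned in (x1, y1, z1, x2, y2, z2) form, and the function and variable names are made up.

```python
import numpy as np

def iou_3d(a, b):
    """Axis-aligned 3D IoU between two boxes given as (x1, y1, z1, x2, y2, z2)."""
    inter = np.prod(np.clip(np.minimum(a[3:], b[3:]) - np.maximum(a[:3], b[:3]), 0, None))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    return inter / (vol_a + vol_b - inter + 1e-8)

def reassign_ids(gt_boxes, pred_boxes):
    """Map each ground-truth object id to the predicted box with the max IoU."""
    mapping = {}
    for gi, g in enumerate(gt_boxes):
        ious = [iou_3d(g, p) for p in pred_boxes]
        mapping[gi] = int(np.argmax(ious))
    return mapping

gt = np.array([[0, 0, 0, 1, 1, 1], [2, 2, 2, 3, 3, 3]], dtype=float)
pred = np.array([[2.1, 2.0, 2.0, 3.0, 3.1, 3.0],
                 [0.1, 0.0, 0.0, 1.0, 1.0, 1.1]], dtype=float)
print(reassign_ids(gt, pred))  # {0: 1, 1: 0}
```

With this remapping applied, the anchor/target ids indexed in listening_dataset.py refer to proposal indices rather than the original ground-truth indices, which is what avoids the out-of-range lookup.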

For issue 2, do you mean that the predicted boxes you get are not accurate? Could you provide some visualizations? Also, VoteNet provides a quite detailed guide on training with custom data: https://github.com/facebookresearch/votenet/blob/main/doc/tips.md

Thank you!