ethnhe / PVN3D

Code for "PVN3D: A Deep Point-wise 3D Keypoints Hough Voting Network for 6DoF Pose Estimation", CVPR 2020
MIT License

Pose Prediction #28

Closed jako-dev closed 4 years ago

jako-dev commented 4 years ago

Hi,

I trained PVN3D on my own dataset (generated with NDDS). I'm trying to visualize it with your demo.py script. It works, but the pose is wrong. Here are some examples:

My dataset: (screenshots of the incorrect predicted poses)

Do you have any idea what could have gone wrong? If you need any more information about the dataset, just let me know.

I'd be really grateful if you could help me out. Thank you

ethnhe commented 4 years ago

I can't tell what's wrong from the provided information. You can first check whether the input data is correct by visualizing the sampled points from the scene, the keypoints of the object, and so on, using the scripts we provide in *dataset.py.
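Roughly, such a check looks like the sketch below (not the exact code from the dataset scripts; `project_p3d`, `check_inputs`, and the variable names are just for illustration, and it assumes you have the camera matrix K and the ground-truth pose RT for the object):

```python
import cv2
import numpy as np

def project_p3d(p3d, K):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates."""
    p2d = np.dot(p3d, K.T)
    return (p2d[:, :2] / p2d[:, 2:]).astype(np.int32)

def check_inputs(rgb, cld, kp3ds, RT, K):
    """Draw the sampled scene points (green) and the GT-posed keypoints (red)."""
    show = rgb.copy()
    for u, v in project_p3d(cld, K):
        cv2.circle(show, (int(u), int(v)), 1, (0, 255, 0), -1)
    # transform object-frame keypoints into the camera frame with the GT pose
    kps_cam = np.dot(kp3ds, RT[:3, :3].T) + RT[:3, 3]
    for u, v in project_p3d(kps_cam, K):
        cv2.circle(show, (int(u), int(v)), 3, (0, 0, 255), -1)
    cv2.imwrite("check_inputs.png", show)
```

If the keypoints do not land on the object in such a visualization, the labels or the pose are wrong before the network ever sees them.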

jako-dev commented 4 years ago

(screenshots: train_rgb and nrm_map visualizations, 24.06.2020)

That's the visualization, which seems to be correct, doesn't it?

ethnhe commented 4 years ago

The keypoints seem correct, but why is there so much black region? Anyway, you can check whether the input segmentation and keypoint offsets are correct by using the GT segmentation and offsets as input to the pose estimation module. If that works well, use the GT segmentation with the predicted offsets, or the predicted segmentation with the GT offsets, to check whether the two modules of the model are well trained.

jako-dev commented 4 years ago

I filtered the background, my bad. I changed it, and it now looks like the LineMOD visualization. Here is my latest training loss; it seems to be really high.

There is still some progress, so I'll wait.

```
=== Training Progress ===
acc_rgbd      --- train: 0.9818   val: 0.9959
loss_kp_of    --- train: 462.4043 val: 409.4338
loss_target   --- train: 494.3592 val: 429.5899
loss_rgbd_seg --- train: 0.0453   val: 0.0116
loss          --- train: 494.3592 val: 429.5899
loss_ctr_of   --- train: 31.8644  val: 20.1328

=== Training Progress ===
acc_rgbd      --- train: 0.9963   val: 0.9977
loss_kp_of    --- train: 405.1292 val: 395.5365
loss_target   --- train: 426.0651 val: 413.6031
loss_rgbd_seg --- train: 0.0103   val: 0.0050
loss          --- train: 426.0651 val: 413.6031
loss_ctr_of   --- train: 20.9152  val: 18.0565
```

ethnhe commented 4 years ago

Yes, so I think your input data has some error, maybe in the ground-truth offsets to the keypoints. Did you try the suggestion?

> you can check whether the input segmentation and keypoint offsets are correct by using the GT segmentation and offsets as input to the pose estimation module. If that works well, use the GT segmentation with the predicted offsets, or the predicted segmentation with the GT offsets, to check whether the two modules of the model are well trained.

jako-dev commented 4 years ago

Can you be a bit more specific on how to do that?

ethnhe commented 4 years ago

The GT labels are those you get from the input data, namely kp_targ_ofst, ctr_targ_ofst, and labels in demo.py. Use them to replace the predicted label & offsets before calling cal_frame_poses*. But make sure they have the same shape as the predicted ones; you may need a permute operation.

jako-dev commented 4 years ago

Thank you very much. I will try it and let you know.

jako-dev commented 4 years ago

(result screenshot)

That's the result when I use GT information and feed it into cal_frame_poses:

```python
kp_targ_ofst = kp_targ_ofst.permute(0, 2, 1, 3)

# add axis
ctr_targ_ofst = ctr_targ_ofst.unsqueeze(1)

pred_cls_ids, pred_pose_lst = cal_frame_poses(
    pcld[0], labels[0], ctr_targ_ofst[0], kp_targ_ofst[0], True,
    config.n_objects, True
)
```

So I guess there is an issue with the GT information, right?

ethnhe commented 4 years ago

Yes, it seems there is something wrong with your input data; trace and fix it.

KatharinaSchmidt commented 4 years ago

@jako-dev, did you try to visualize your annotated data from NDDS with NVDU? That was my way of checking whether my synthesized data were annotated correctly. I am also still interested in your solution for using the data you synthesized with NDDS. Since the provided code has no comments, it's hard for me to see where I have to adapt the pvn3d code to my dataset.

jako-dev commented 4 years ago

I visualized the keypoints myself. They seem to be correct, since they get transformed and drawn correctly with my adapted NDDS dataset script for pvn3d. The RT matrix and K also seem to be correct.

I think the error comes from the calculation of the target_offset (calculated by adding the pcld with -1.0 * keypoints). Since I think the keypoints are correct, it might be the pcld. Unfortunately, I couldn't find the error yet. The pcld gets calculated from the depth image, if I understood correctly.

@ethnhe is the format of the depth image important (e.g., does every pixel encode depth in mm, cm, or something else)? And what is cam_scale (is this the maximum depth of the image)?
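For context, here is roughly how I currently do both steps, modelled on the YCB dataset script (a sketch only; I'm assuming cam_scale is the divisor that converts the stored depth values into metres, which is exactly what I'd like to confirm):

```python
import numpy as np

def dpt_2_cld(dpt, cam_scale, K):
    """Back-project an HxW depth map to an Nx3 point cloud in the camera frame.

    Assumes dpt / cam_scale yields metres (e.g. cam_scale = 1000.0 if the
    depth image stores millimetres).
    """
    h, w = dpt.shape
    xmap, ymap = np.mgrid[:h, :w]            # row and column index maps
    msk = dpt > 1e-6                          # drop invalid / zero-depth pixels
    z = dpt[msk].astype(np.float32) / cam_scale
    x = (ymap[msk] - K[0, 2]) * z / K[0, 0]
    y = (xmap[msk] - K[1, 2]) * z / K[1, 1]
    return np.stack((x, y, z), axis=1)

def kp_target_offsets(cld, kp3ds_cam):
    """Per-point keypoint offsets, i.e. pcld + (-1.0 * keypoint).

    cld: Nx3 scene points, kp3ds_cam: n_kp x 3 keypoints in the camera frame;
    returns an N x n_kp x 3 array.
    """
    return np.stack([cld - kp for kp in kp3ds_cam], axis=1)
```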

I will post some examples of my pcld, keypoints, and target_offset arrays as soon as I'm back at my computer; maybe you can spot some errors there.

@KatharinaSchmidt you have to use the GT information from the JSON file per image and use it in the right places. I followed the YCBDataset.py script very closely and only exchanged the GT information with the information from the JSON files (plus some more preprocessing).

Thank you.

jako-dev commented 4 years ago

I finally managed to fix the GT information. (result screenshots)

KatharinaSchmidt commented 4 years ago

@jako-dev Could you show us how you fixed the wrong information? I'm still trying to adapt the dataset.py script for my own dataset. By the way, how did you get the keypoints of your 3D models?

jako-dev commented 4 years ago

@KatharinaSchmidt I used parts of the clean-pvnet repo code (especially handle_custom_dataset.py) to read the .ply file and create corners.txt and farthest.txt. The wrong pose was due to an error coming from the mismatch between the left-handed and the right-handed coordinate system. I flipped the y-axis of the mesh_kps before feeding them into best_fit_transform. I had some more issues while debugging but can't recall all of them. As soon as I have validated the training results and cleaned up the code, I can send it to you / upload it to GitHub.
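The core of the coordinate-system fix was just negating the y component of the model keypoints before the fitting step. A minimal sketch (the file path is only an example):

```python
import numpy as np

# keypoints exported from the .ply (in the left-handed UE4 / NDDS frame)
mesh_kps = np.loadtxt("farthest.txt")   # n_kp x 3

# convert to the right-handed frame PVN3D expects by flipping the y-axis
mesh_kps[:, 1] *= -1.0

# mesh_kps is then fed into best_fit_transform together with the voted
# scene keypoints, as cal_frame_poses already does
```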

Michael187-ctrl commented 4 years ago

Would be really nice if you could upload it to GitHub! I am following this thread every day :)

Michael187-ctrl commented 4 years ago

Hey @jako-dev, did you make progress, and does it work properly now?

jako-dev commented 4 years ago

Yes, it's working okay. The results on my dataset are not so great, but everything seems to be working as it should. I can't find any more issues.

jako-dev commented 4 years ago

I will upload it as soon as I can. It might take a few days.

Michael187-ctrl commented 4 years ago

Thanks :)

Ixion46 commented 4 years ago

@jako-dev, can you please tell me how to get the corners.txt files for the YCB dataset, for example https://github.com/ethnhe/PVN3D/tree/master/pvn3d/datasets/ycb/ycb_object_kps/004_sugar_box? I can't derive them from the .ply models like in the LineMOD dataset, because YCB only provides .xyz and .obj, and with both of them I get different values than in corners.txt. Best regards, Ixion

jako-dev commented 4 years ago

Sorry, I got the corners.txt by using the points from the .ply file.
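Concretely, corners.txt is just the eight corners of the axis-aligned bounding box of the model vertices. A sketch of how I compute them (the vertex array would come from the .ply, e.g. via plyfile or open3d; the ordering here mirrors what I remember from clean-pvnet's handle_custom_dataset.py):

```python
import numpy as np

def get_corners(vertices):
    """vertices: Nx3 model points; returns the 8 corners of their
    axis-aligned bounding box."""
    min_x, min_y, min_z = vertices.min(axis=0)
    max_x, max_y, max_z = vertices.max(axis=0)
    return np.array([
        [min_x, min_y, min_z],
        [min_x, min_y, max_z],
        [min_x, max_y, min_z],
        [min_x, max_y, max_z],
        [max_x, min_y, min_z],
        [max_x, min_y, max_z],
        [max_x, max_y, min_z],
        [max_x, max_y, max_z],
    ])

# vertices = ...  # load from the .ply, e.g. with plyfile or open3d
# np.savetxt("corners.txt", get_corners(vertices))
```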

Ixion46 commented 4 years ago

Okay, thank you @jako-dev, I'm going to try it the same way then. And the different farthest.txt files are all obtained with the farthest point sampling algorithm and just differ in the number of used points, right?

jako-dev commented 4 years ago

@Ixion46 Yes, that seems to be it.
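For reference, farthest point sampling itself is only a few lines; here is a minimal sketch (not the repo's exact implementation, and the seed point and number of keypoints are up to you):

```python
import numpy as np

def farthest_point_sampling(points, n_samples, init_idx=0):
    """Greedy FPS: repeatedly pick the point farthest from the set chosen so far.

    points: Nx3 model vertices; returns n_samples x 3 keypoints.
    """
    selected = [init_idx]
    dists = np.linalg.norm(points - points[init_idx], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(np.argmax(dists))                      # farthest remaining point
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(points - points[nxt], axis=1))
    return points[selected]

# e.g. np.savetxt("farthest.txt", farthest_point_sampling(vertices, 8))
```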

Ixion46 commented 4 years ago

@jako-dev Thank you very much for your great help. Using this code for my own objects isn't that easy for me, so I'm happy that there are people who help.

KatharinaSchmidt commented 4 years ago

@jako-dev I have the same issue with wrong GT when I try to train with clean-pvnet. Did you take the rotation matrix from 'quaternion_xyzw' or the first 3x3 matrix from 'pose_transform'? If I understand you correctly, you flipped the y-axis (of the location?) by 180°, so just multiplied by -1, correct? I also want to ask whether you changed the scale unit of the location values.

Ixion46 commented 4 years ago

@jako-dev or @KatharinaSchmidt, is there still a possibility that one of you uploads your changes to GitHub? That would help a lot with understanding how to do it and would also eliminate most of the questions. Greetings, Ixion

KatharinaSchmidt commented 4 years ago

@Ixion46 I created a fork of PVNet; it contains my code for reading the pose out of the .json file and the other data generated with NDDS: fork repository

ssh10032 commented 2 years ago

@jako-dev @KatharinaSchmidt Can I ask how to get the RT matrix from the NDDS annotation file (JSON)?

I got the RT matrix using the object's pose_transform value and visualized the keypoints on the RGB image, but they don't seem to be correct.


I also changed the coordinate system from left-handed (Unreal Engine 4) to right-handed (PVN3D). Here are my visualization results:

(screenshots: LineMOD cat rotated, microwave rotated, microwave front view)

I read all of your GitHub issues about the RT matrix value but couldn't find an answer. I've been struggling with this for almost a month. If you know something, please let me know. Thank you.