j96w / DenseFusion

"DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" code repository
https://sites.google.com/view/densefusion
MIT License

Get prediction from trained model with only RGB and Depth image #128

Open marmas92 opened 4 years ago

marmas92 commented 4 years ago

Thank you again for your work! I was able to train your network with my own synthetic data (in the format of the LineMOD dataset), evaluate it, and visualize the results. So far so good! Now I want to check how the trained model performs on real data. How do I do that? Maybe it's a stupid question... So far I have always fed the network all the necessary data and got the result, but now I only want to give it the RGB and depth image and get the predicted pose out. How does that work? Do I need to train the SegNet?

F2Wang commented 4 years ago

I second the question. Could you provide demo code that runs 6D pose estimation on a single RGB and depth image, and perhaps visualizes the result?

yjdfly commented 4 years ago

I also want to know the code for that.

SebastianGrans commented 4 years ago

I am also working on adapting this to another dataset, so I've been trying to interpret the code myself.

But to answer the latter question: yes, you need to train some form of segmentation network on your dataset. In the implementation/paper they use the same segmentation masks as the PoseCNN paper, in order to make a fair comparison of the performance.
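For illustration only (not code from this repository): once you have any per-pixel segmentation network trained on your objects, for example one trained with the vanilla_segmentation code in this repo, the object mask is just the set of pixels predicted as your object's class. A rough sketch, where seg_model, rgb_tensor and obj_cls are hypothetical placeholders:

```python
import torch

# Hypothetical: seg_model is any trained per-pixel segmentation network that
# returns class logits of shape (1, C, H, W) for a normalized RGB tensor (1, 3, H, W).
with torch.no_grad():
    logits = seg_model(rgb_tensor)
    label_map = torch.argmax(logits, dim=1)[0]   # (H, W) per-pixel class ids

obj_cls = 1                                      # hypothetical class id of your object
mask = (label_map == obj_cls).cpu().numpy()      # boolean (H, W) object mask
```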

If you read e.g. eval_linemod.py, you can see that pose estimation is performed by the following call:

pred_r, pred_t, pred_c, emb = estimator(img, points, choose, idx)

where estimator is an instance of the PoseNet model you have trained. If you look at where img, points, choose, and idx come from (dataset.py), you will find that the segmentation is loaded from:

if self.mode == 'eval':
    self.list_label.append('{0}/segnet_results/{1}_label/{2}_label.png'.format(self.root, '%02d' % item, input_line))
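Building on that, here is a minimal, untested sketch of how one might build those inputs for a single RGB-D frame, mirroring what datasets/linemod/dataset.py does. Everything below (rgb, depth, mask, the intrinsics, obj_id, estimator, num_points) is a placeholder you have to supply yourself, and you should double-check every step against the dataset code for your own setup:

```python
import numpy as np
import torch
import torchvision.transforms as transforms

# --- Assumed inputs (placeholders you must provide) ---
# rgb:       (H, W, 3) uint8 image
# depth:     (H, W) depth image, same units/scale as used during training
# mask:      (H, W) boolean object mask from your segmentation network
# cam_fx, cam_fy, cam_cx, cam_cy, cam_scale: your camera intrinsics / depth scale
# estimator: the trained PoseNet; obj_id: the object's class index in your dataset
num_points = 500  # must match the value used at training time

# 1. Bounding box around the mask (the repo uses get_bbox(), which also snaps the
#    box to a set of fixed sizes; using the repo's own function is recommended).
ys, xs = np.nonzero(mask)
rmin, rmax = ys.min(), ys.max() + 1
cmin, cmax = xs.min(), xs.max() + 1

# 2. 'choose': flattened indices (within the crop) of masked pixels with valid depth,
#    sampled or wrap-padded to exactly num_points.
crop_mask = (mask & (depth > 0))[rmin:rmax, cmin:cmax]
choose = crop_mask.flatten().nonzero()[0]
if len(choose) == 0:
    raise RuntimeError('no overlap between segmentation mask and valid depth')
if len(choose) >= num_points:
    c_mask = np.zeros(len(choose), dtype=int)
    c_mask[:num_points] = 1
    np.random.shuffle(c_mask)
    choose = choose[c_mask.nonzero()]
else:
    choose = np.pad(choose, (0, num_points - len(choose)), 'wrap')

# 3. 'points': back-project the chosen depth pixels into a camera-frame point cloud.
depth_sel = depth[rmin:rmax, cmin:cmax].flatten()[choose][:, None].astype(np.float32)
xmap, ymap = np.meshgrid(np.arange(cmin, cmax), np.arange(rmin, rmax))
xmap = xmap.flatten()[choose][:, None].astype(np.float32)   # column (u) coordinates
ymap = ymap.flatten()[choose][:, None].astype(np.float32)   # row (v) coordinates
pt2 = depth_sel / cam_scale
pt0 = (xmap - cam_cx) * pt2 / cam_fx
pt1 = (ymap - cam_cy) * pt2 / cam_fy
cloud = np.concatenate((pt0, pt1, pt2), axis=1)             # (num_points, 3)

# 4. 'img': the cropped RGB patch, preprocessed exactly as in your dataset.py
#    (shown here with the same Normalize transform the repo's dataloaders use).
norm = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
img_crop = np.transpose(rgb[rmin:rmax, cmin:cmax, :3], (2, 0, 1)).astype(np.float32)
img_t = norm(torch.from_numpy(img_crop)).unsqueeze(0).cuda()

points = torch.from_numpy(cloud.astype(np.float32)).unsqueeze(0).cuda()     # (1, N, 3)
choose_t = torch.from_numpy(choose.astype(np.int64)).view(1, 1, -1).cuda()  # (1, 1, N)
idx = torch.LongTensor([[obj_id]]).cuda()                                   # (1, 1)

pred_r, pred_t, pred_c, emb = estimator(img_t, points, choose_t, idx)
```

From pred_r, pred_t and pred_c you would then pick the most confident dense prediction (and optionally run the refiner), exactly as eval_linemod.py does.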
balevin commented 3 years ago

@SebastianGrans Hi, I was wondering if you figured out how to do this. I am also trying to get a prediction from just RGB and depth data. I am doing it on objects from the YCB dataset (so I don't need to do any training of my own), but I am still struggling to figure out how to obtain the points, choose, and idx that I need to pass into the estimator. Any help would be greatly appreciated!

an99990 commented 2 years ago

Did anyone figure out the meaning of choose? I am trying to adapt this to a custom dataset as well.

Thanks for any help.

valentinhendrik commented 1 year ago

Hey @marmas92, since it seems you got it running with your own synthetic dataset: how does the data need to be labeled and provided to the network? I want to create a synthetic dataset of my own object and train the network with it.