NVlabs / latentfusion

LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation
https://arxiv.org/pdf/1912.00416.pdf

Unable to reproduce the results on LineMOD as the paper #6

Closed wujun-cse closed 4 years ago

wujun-cse commented 4 years ago

Hi,

Thanks for the excellent work!

Since you didn't include evaluation code in this project, I ran my own evaluation on LineMOD following the example you provided (pose_estimation.ipynb). However, I cannot reproduce the results reported in Table 1 of the paper. For example, on the camera class (object 4), I only get an ADD recall of about 54% (ADD/diameter <= 0.1).
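For reference, the metric discussed here is the standard ADD recall: the fraction of test images whose average model-point distance between the ground-truth and predicted poses is below 10% of the object diameter. A minimal sketch (function and variable names are mine, not from the LatentFusion codebase):

```python
import numpy as np

def add_metric(model_points, R_gt, t_gt, R_pred, t_pred):
    """Average distance (ADD) between the model points transformed by
    the ground-truth pose and by the predicted pose."""
    pts_gt = model_points @ R_gt.T + t_gt
    pts_pred = model_points @ R_pred.T + t_pred
    return np.linalg.norm(pts_gt - pts_pred, axis=1).mean()

def add_recall(adds, diameter, threshold=0.1):
    """Fraction of poses whose ADD is below `threshold` * object diameter."""
    adds = np.asarray(adds)
    return float((adds < threshold * diameter).mean())
```

So "ADD recall about 54%" means roughly 54% of the test poses for the camera object passed the 0.1 * diameter threshold.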

Could you please release the evaluation code you used as in the paper? Or is there anything I did wrong? Or should I fine tune the model more?

PS: My settings are:

- model: the model you provided in this project
- dataset: downloaded from the BOP website
- input size: 256x256
- num_views: 16
- input set: lm/train/000004/
- test set: lm/test/000004/

keunhong commented 4 years ago

Hi Wujun,

Here's a gist containing the code I used to do the evaluation: https://gist.github.com/keunhong/cdf7b9cb4c1f08d91394a1c8b16b52f0

Just a warning: it's not release-quality code. You might need to fix some of the imports, and some of the flags don't do anything.

For LINEMOD you should use --coarse-config configs/cross_entropy_linemod.toml and --refine-config configs/adam_slow.toml.

Thanks, Keunhong

wujun-cse commented 4 years ago

Thanks for your timely reply! With these configs I can now reproduce the paper's results on the camera class.

I suppose the difference lies in the configuration file: the one I used before includes the latent loss, while this one doesn't.

keunhong commented 4 years ago

No problem.

Yes, that's the main difference. The latent loss only really helps when there are similar-looking local minima, such as with the toy plane or the remote.