GAP-LAB-CUHK-SZ / Total3DUnderstanding

Implementation of CVPR'20 Oral: Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Cannot get the same camera pose result as in the paper #33

Open · BiancaBing opened 3 years ago

BiancaBing commented 3 years ago

Hi Yinyu,

I am trying to train the layout estimation model but cannot reproduce the pitch and roll errors reported in the paper. I also tested the pretrained model provided in this repository and got pitch error = 6.3016 and roll error = 4.2779, which is worse than the paper's results (pitch error = 3.15, roll error = 2.09).

I also tried the training procedure described in the paper (batch size 32, lr 1e-3, scaled by 0.5 every 20 epochs) and the one from the CooP repository (batch size 32, lr 1e-4, scaled by 0.7 every 10 epochs, finally held at 1e-5). The best result I got is pitch error = 5.6497 and roll error = 3.9949. I have no idea what went wrong.

In addition, the test loss starts to increase after only a few epochs, as shown in the curves below (the accuracy curves plot the average pitch error, the average roll error, and the sum of the two):

[Screenshots of the loss and accuracy curves; original images not recoverable.]

How can I get the same performance as in the paper?
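For reference, the two learning-rate schedules I tried correspond to something like the following. This is only a minimal PyTorch sketch, not the repository's actual training code; the model placeholder and variable names are illustrative.

```python
import torch

# Stand-in for the layout estimation network; purely illustrative.
model = torch.nn.Linear(2048, 8)

# Schedule from the paper: lr = 1e-3, halved every 20 epochs.
paper_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
paper_sched = torch.optim.lr_scheduler.StepLR(paper_opt, step_size=20, gamma=0.5)

# CooP-style schedule: lr = 1e-4, scaled by 0.7 every 10 epochs,
# floored so that it finally remains at 1e-5 (ratio 1e-5 / 1e-4 = 0.1).
coop_opt = torch.optim.Adam(model.parameters(), lr=1e-4)
coop_sched = torch.optim.lr_scheduler.LambdaLR(
    coop_opt, lr_lambda=lambda epoch: max(0.7 ** (epoch // 10), 0.1)
)
```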

yinyunie commented 3 years ago

Hi,

We inherit the layout estimation network (LEN) and the corresponding loss from the CooP paper. If you train only the LEN and remove the other modules, our network is exactly the LEN from CooP. I would suggest first checking whether you can reproduce similar results with the CooP method itself, and then seeing whether there are any bugs or differences in your setup.

Best, Yinyu

BiancaBing commented 3 years ago

Hi,

Thank you for your advice. I have run the code from CooP and obtained its results. However, I still have some questions. First, I found that the BIN values you use differ from CooP's: it seems you multiply the BIN boundaries by pi/180, i.e., convert them from degrees to radians. The train and test lists you use also differ from CooP's. May I ask which train/test split you used when comparing results in your paper?
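To illustrate what I mean about the BIN values: pi/180 is just the degrees-to-radians conversion factor, so if that is the only change, the bins should cover the same angular ranges in different units. A minimal sketch (the bin values below are made up for illustration, not the repository's actual BIN config):

```python
import numpy as np

# Hypothetical classification bin boundaries in degrees, CooP-style.
pitch_bins_deg = np.array([[-60.0, -30.0], [-30.0, 0.0], [0.0, 30.0]])

# Multiplying by pi/180 converts the same boundaries to radians; the bins
# then describe identical angular ranges, only the unit changes.
pitch_bins_rad = pitch_bins_deg * np.pi / 180.0
```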