Open mittalrajat opened 6 years ago
Hi @mittalrajat, yes, you are right.
@mittalrajat The method in this paper is not end to end. So you need to pass the entire image to a 2D detector, then run the trained model on the detector's outputs to get the dimensions and yaw. Correct me if I am wrong.
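To make the two-stage flow concrete, here is a minimal sketch, assuming the detector and the trained model are available as plain callables. The names `detect_2d`, `regress_patch`, `crop`, and `two_stage_pipeline` are hypothetical and are not taken from this repository; the sketch only shows the order of the steps, not how either model works.

```python
from typing import Callable, List, Sequence, Tuple

Box2D = Tuple[int, int, int, int]             # (x1, y1, x2, y2) in pixels
Image = Sequence[Sequence[float]]             # image as rows of pixel values
Dims = Tuple[float, float, float]             # object dimensions
Result = Tuple[Box2D, Dims, float]            # (2D box, dimensions, yaw)


def crop(image: Image, box: Box2D) -> List[List[float]]:
    """Cut the patch inside a 2D box out of the full image."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]


def two_stage_pipeline(
    image: Image,
    detect_2d: Callable[[Image], List[Box2D]],                    # hypothetical 2D detector
    regress_patch: Callable[[List[List[float]]], Tuple[Dims, float]],  # hypothetical trained model
) -> List[Result]:
    """Stage 1: detect 2D boxes on the whole image.
    Stage 2: run the trained model on each cropped patch."""
    results: List[Result] = []
    for box in detect_2d(image):
        patch = crop(image, box)
        dims, yaw = regress_patch(patch)
        results.append((box, dims, yaw))
    return results
```

The second stage runs once per detected box, so the cost grows with the number of detections.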
@345ishaan Yes, that seems correct.
@345ishaan Generating a 3D box for every detected image patch would be very slow.
Hi @smallcorgi,
Thanks for providing us with your code. Looking through your code and the issues section, I see that there are some differences between the actual implementation and your code. Would it be possible for you to list the differences? I feel that it would be really helpful for those who use your code for their implementation. Thanks.
Some differences that I observed are:
I am currently trying to understand the paper, so I apologise if things that I suggested turn out to be incorrect.
Thanks again.