zouchuhang / LayoutNet

Torch implementation of our CVPR 18 paper: "LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image"
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zou_LayoutNet_Reconstructing_the_CVPR_2018_paper.pdf
MIT License

3D ground truth interpretation #13

Open teasherm opened 6 years ago

teasherm commented 6 years ago

Hello,

First, thank you for sharing this great work. A quick question about the panoContext_box_train.t7 tensor:

The paper mentions 6 ground truth 3D parameters: sw, sl, sh, tx, tz, r_theta. The first 6 elements in the box tensor above (box[{{1},{1},{1,6}}]), which I believe contain those parameters for the first example image, read:

sw = -0.5154072972870558
sl = -0.6748731674025037
sh = -1.316387492900166
tx = -0.24216556285261603
tz = -0.2114205765327388
r_theta = 0.08283438070600802

A naive interpretation would suggest that the room is about 2.5x as high as it is wide (sh / sw ≈ 2.55). Is there a reason for the negative scale factors? Any guidance on interpretation would be much appreciated.

zouchuhang commented 6 years ago

@teasherm The box parameters stored in "panoContext_box_train.t7" are normalized to zero mean and unit standard deviation, which is why the scale factors can be negative. The preprocessing script is included as "preprocessPano.m"; see L94-239 for how the box parameters are computed.
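
For readers following along, here is a minimal sketch of why this kind of normalization produces negative values and how it can be undone. It is written in plain Python rather than the repository's Torch/MATLAB code, and it assumes a per-parameter z-score with the population standard deviation. The helper names (`zscore_fit`, `normalize`, `denormalize`) and the sample heights are made up for illustration. The actual statistics and procedure are whatever "preprocessPano.m" computes.

```python
from statistics import mean, pstdev

def zscore_fit(values):
    """Return (mean, std) for one parameter; population std is an assumption."""
    return mean(values), pstdev(values)

def normalize(x, mu, sigma):
    """Map a raw value to its z-score."""
    return (x - mu) / sigma

def denormalize(z, mu, sigma):
    """Invert the z-score to recover the raw value."""
    return z * sigma + mu

# Hypothetical room heights, NOT taken from the PanoContext data.
heights = [2.4, 2.6, 2.8, 3.0, 3.2]
mu, sigma = zscore_fit(heights)

z = [normalize(h, mu, sigma) for h in heights]
recovered = [denormalize(v, mu, sigma) for v in z]

print(z)          # values below the mean come out negative
print(recovered)  # matches the original heights up to rounding
```

Any value below the dataset mean maps to a negative number, so a negative normalized sh does not mean a negative height. Recovering physical dimensions requires the same mean and standard deviation that were used during preprocessing.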

teasherm commented 6 years ago

Ah, I see. Thanks @zouchuhang!