News: Check out our new project HoHoNet on this task and more!
News: Check out our new project HorizonNet on this task.
This is an unofficial implementation of the CVPR 18 paper "LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image". The official layout dataset has been converted to `.png` and the pretrained models have been converted to PyTorch `state_dict`s.
Differences from the official implementation: the preprocessing step (panorama alignment and Manhattan line extraction) is implemented in Python in `pano.py` and `pano_lsd_align.py`.

Overview of the pipeline:

With this repo, you can:
- preprocess and align your own 360 room images and extract their Manhattan line segments,
- predict the room layout and visualize it in 3D,
- train the network and reproduce the quantitative evaluation.
Preparation:
- Get your 360 room image(s); `assert/demo.png` is used as the example below.
- Prepare the trained model and put it under the `ckpt/` folder (the commands below assume `ckpt/epoch_30_*.pth`).

Preprocess `assert/demo.png` by firing the command below. See `python visual_preprocess.py -h` for a more detailed script description.
python visual_preprocess.py --img_glob assert/demo.png --output_dir assert/output_preprocess/
- `--img_glob` path to your 360 room image(s).
- `--output_dir` path to the directory for dumping the results.

Under the given `--output_dir`, you will get results like below, prefixed with the source image basename.
- The aligned RGB image `[SOURCE BASENAME]_aligned_rgb.png` and the line segment image `[SOURCE BASENAME]_aligned_line.png`

demo_aligned_rgb.png | demo_aligned_line.png
---|---
- The detected vanishing points `[SOURCE BASENAME]_VP.txt` (here `demo_VP.txt`):
-0.006676 -0.499807 0.866111
0.000622 0.866128 0.499821
0.999992 -0.002519 0.003119
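If you want to use the vanishing points programmatically, the file is plain whitespace-separated text, so (assuming NumPy is available) it can be read as a 3x3 array with one detected vanishing-point direction per row; a minimal sketch:

```python
import numpy as np

# Read the three detected vanishing points produced by visual_preprocess.py.
# Each row is one vanishing-point direction (3 floats).
vp = np.loadtxt('assert/output_preprocess/demo_VP.txt')
print(vp.shape)  # expected: (3, 3)
```

With the preprocessed outputs in place, predict the layout with the trained network: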
python visual.py --path_prefix ckpt/epoch_30 --img_glob assert/output_preprocess/demo_aligned_rgb.png --line_glob assert/output_preprocess/demo_aligned_line.png --output_dir assert/output
- `--path_prefix` prefix path to the trained model.
- `--img_glob` path to the VP-aligned image.
- `--line_glob` path to the corresponding line segment image of the VP-aligned image.
- `--output_dir` path to the directory to dump the results.
- `--flip`, `--rotate 0.25 0.5 0.75`, `--post_optimization` optional flags for test-time augmentation and post optimization, which can improve the results at the cost of extra runtime.

Under the given `--output_dir`, you will get:
- The predicted corner/edge probability maps `[SOURCE BASENAME]_[cor|edg].png`

demo_aligned_rgb_cor.png | demo_aligned_rgb_edg.png
---|---
- Visualization images `[SOURCE BASENAME]_[bon|all].png`

demo_aligned_rgb_bon.png | demo_aligned_rgb_all.png
---|---
- The extracted layout corners `[SOURCE BASENAME]_cor_id.txt`, one `x y` image coordinate per row:
104.928192 186.603119
104.928192 337.168579
378.994934 177.796646
378.994934 346.994629
649.976440 183.446518
649.976440 340.711731
898.234619 190.629089
898.234619 332.616364
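The corner file is plain text as well; in this demo output consecutive rows share the same `x`, i.e. they form the ceiling/floor corner pair of one wall boundary. A minimal reader (assuming NumPy):

```python
import numpy as np

# Each row of *_cor_id.txt is an (x, y) pixel coordinate of a layout corner.
cor = np.loadtxt('assert/output/demo_aligned_rgb_cor_id.txt')   # (N, 2)

# Group consecutive rows into ceiling/floor pairs, one pair per wall boundary.
pairs = cor.reshape(-1, 2, 2)                                    # (N//2, 2, 2)
print(pairs.shape)
```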
To visualize the predicted layout in 3D (using the files generated under `assert/` above), run:
python visual_3d_layout.py --ignore_ceiling --img assert/output_preprocess/demo_aligned_rgb.png --layout assert/output/demo_aligned_rgb_cor_id.txt
- `--img` path to the aligned 360 image.
- `--layout` path to the txt storing the `cor_id` (predicted or ground truth).
- `--ignore_ceiling` do not render the ceiling.

See `python visual_3d_layout.py -h` for more detailed arguments.
To train and evaluate, download the official dataset and pretrained model and organize them as below:

    /pytorch-layoutnet
      /data
        /origin
          /data  (download and extract from official)
          /gt    (download and extract from official)
      /ckpt
        /panofull_*_pretrained.t7  (download and extract from official)
Run `python torch2pytorch_data.py` to convert `data/origin/**/*` into `data/train`, `data/valid` and `data/test` for the PyTorch data loader. Under these folders, `img/` contains the raw RGB `.png` images while `line/`, `edge/` and `cor/` contain the preprocessed Manhattan line segments, the ground truth boundaries and the ground truth corners respectively.
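To make the relation between the four folders concrete, here is a minimal, hypothetical sketch of a PyTorch `Dataset` over the converted layout. It assumes the four folders contain identically named `.png` files; it is only an illustration, not the loader the repo actually ships.

```python
import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


def png_to_tensor(path):
    """Load a .png as a float tensor in [0, 1], shaped C x H x W."""
    arr = np.array(Image.open(path), dtype=np.float32) / 255.0
    if arr.ndim == 2:                      # grayscale -> 1 x H x W
        arr = arr[None]
    else:                                  # H x W x C -> C x H x W
        arr = arr.transpose(2, 0, 1)
    return torch.from_numpy(arr)


class LayoutFolderDataset(Dataset):
    """Hypothetical loader for data/train, data/valid or data/test."""

    def __init__(self, root):
        self.root = root
        self.fnames = sorted(os.listdir(os.path.join(root, 'img')))

    def __len__(self):
        return len(self.fnames)

    def __getitem__(self, idx):
        fname = self.fnames[idx]
        img = png_to_tensor(os.path.join(self.root, 'img', fname))    # rgb
        line = png_to_tensor(os.path.join(self.root, 'line', fname))  # Manhattan lines
        edge = png_to_tensor(os.path.join(self.root, 'edge', fname))  # gt boundary
        cor = png_to_tensor(os.path.join(self.root, 'cor', fname))    # gt corner
        # The default "corner+boundary" setting feeds the rgb and line channels
        # concatenated as the network input (rgb-only training drops `line`).
        x = torch.cat([img, line], dim=0)
        return x, edge, cor
```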
Use `torch2pytorch_pretrained_weight.py` to convert the official pretrained pano model into `encoder`, `edg_decoder` and `cor_decoder` PyTorch `state_dict`s (see `python torch2pytorch_pretrained_weight.py -h` for more details). Example:
python torch2pytorch_pretrained_weight.py --torch_pretrained ckpt/panofull_joint_box_pretrained.t7 --encoder ckpt/pre_full_encoder.pth --edg_decoder ckpt/pre_full_edg_decoder.pth --cor_decoder ckpt/pre_full_cor_decoder.pth
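The converted `.pth` files are plain PyTorch state dicts, so they can be inspected (or loaded into the corresponding modules) like any other checkpoint. A quick sanity check, assuming the output paths from the example above:

```python
import torch

# Inspect the converted encoder weights without instantiating the model.
encoder_state = torch.load('ckpt/pre_full_encoder.pth', map_location='cpu')
print(len(encoder_state), 'tensors')
print(list(encoder_state.keys())[:5])
```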
For training, see `python train.py -h` for a detailed explanation of the arguments.
The default training strategy is the same as the official one. To launch an experiment with the official "corner+boundary" setting (`--id` identifies the experiment and can be any name you choose):
python train.py --id exp_default
To train using only the RGB channels as input (no Manhattan line segments):
python train.py --id exp_rgb --input_cat img --input_channels 3
Instead of the official 3D layout optimization with a sampling strategy, this repo implements a gradient ascent algorithm that optimizes a similar objective. The process, abstracted:
- parameterize the cuboid layout with six parameters (`cx`, `cy`, `dx`, `dy`, `theta`, `h`);
- score the parameterized layout against the predicted corner/edge probability maps;
- update the six parameters by gradient ascent to maximize that score.

corner probability map | edge probability map
---|---
It takes less than 2 seconds on CPU and yields slightly better results than officially reported.
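To illustrate the idea (not the repo's actual objective or projection), here is a simplified, self-contained sketch of gradient ascent over the layout parameters: it uses a toy top-down corner probability map and bilinear sampling, omits `h` for brevity, and the `corner_positions`/`layout_score` helpers are hypothetical. The real implementation scores the cuboid against the equirectangular corner/edge maps instead.

```python
import torch
import torch.nn.functional as F


def corner_positions(cx, cy, dx, dy, theta):
    """Four corners of a rotated rectangle centered at (cx, cy), in [-1, 1]."""
    base = torch.stack([
        torch.stack([ dx,  dy]),
        torch.stack([ dx, -dy]),
        torch.stack([-dx, -dy]),
        torch.stack([-dx,  dy]),
    ])                                                        # (4, 2)
    rot = torch.stack([
        torch.stack([torch.cos(theta), -torch.sin(theta)]),
        torch.stack([torch.sin(theta),  torch.cos(theta)]),
    ])                                                        # (2, 2)
    return base @ rot.T + torch.stack([cx, cy])               # (4, 2)


def layout_score(prob_map, cx, cy, dx, dy, theta):
    """Mean probability sampled (bilinearly) at the four corner positions."""
    grid = corner_positions(cx, cy, dx, dy, theta).reshape(1, 1, 4, 2)
    sampled = F.grid_sample(prob_map[None, None], grid, align_corners=True)
    return sampled.mean()


# Toy corner probability map; in practice this is the network's prediction.
prob_map = torch.rand(256, 256)

# Initial guess for (cx, cy, dx, dy, theta); h is left out of this toy example.
params = [torch.tensor(v, requires_grad=True)
          for v in (0.0, 0.0, 0.5, 0.3, 0.1)]
optimizer = torch.optim.Adam(params, lr=1e-2)

for _ in range(100):
    optimizer.zero_grad()
    loss = -layout_score(prob_map, *params)   # ascent = minimize the negative
    loss.backward()
    optimizer.step()

print([p.item() for p in params])
```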
For quantitative evaluation, see `python eval.py -h` for a detailed explanation of the arguments. To get the result from my trained network (link above):
python eval.py --path_prefix ckpt/epoch_30 --flip --rotate 0.333 0.666
To evaluate with gradient ascent post optimization:
python eval.py --path_prefix ckpt/epoch_30 --flip --rotate 0.333 0.666 --post_optimization
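For reference, the corner error reported below is commonly defined as the L2 distance between matched predicted and ground-truth corners, normalized by the image diagonal. A hedged sketch of that definition (the matching and details in `eval.py` may differ):

```python
import numpy as np

def corner_error(pred, gt, img_h, img_w):
    """Mean L2 distance between matched corner pairs, normalized by the image
    diagonal and expressed in percent.  Assumes pred/gt are already matched
    one-to-one as (N, 2) arrays of (x, y) pixel coordinates."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    diag = np.hypot(img_h, img_w)
    return 100.0 * np.mean(np.linalg.norm(pred - gt, axis=1)) / diag
```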
exp | 3D IoU(%) | Corner error(%) | Pixel error(%)
---|---|---|---
Official best | 75.12 | 1.02 | 3.18
ours (rgb only) | 71.42 | 1.30 | 3.83
ours (rgb only) w/ gd opt | 72.52 | 1.50 | 3.66
ours | 75.11 | 1.04 | 3.16
ours w/ gd opt | 76.90 | 0.93 | 2.81
exp | 3D IoU(%) | Corner error(%) | Pixel error(%)
---|---|---|---
Official best | 77.51 | 0.92 | 2.42
ours (rgb only) | 70.39 | 1.50 | 4.28
ours (rgb only) w/ gd opt | 71.90 | 1.35 | 4.25
ours | 75.49 | 0.96 | 3.07
ours w/ gd opt | 78.90 | 0.88 | 2.78
@inproceedings{zou2018layoutnet,
title={LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image},
author={Zou, Chuhang and Colburn, Alex and Shan, Qi and Hoiem, Derek},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={2051--2059},
year={2018}
}