This is the implementation of our CVPR'19 paper "HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation" (project page). Key sources include `dataset.py`, `inference.py` and `misc/post_proc.py`.

This repo is a pure Python implementation.
PyTorch installation is machine dependent; please install the correct version for your machine. The tested version is PyTorch 1.8.1 with Python 3.7.6.
Put the dataset under the `data` directory so you get:

```
HorizonNet/
├── data/
│   ├── layoutnet_dataset/
│   │   ├── finetune_general/
│   │   ├── test/
│   │   ├── train/
│   │   └── valid/
```
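If you want to double-check the extraction, a small sketch like the one below (hypothetical helper, not part of the repo) can verify that every expected sub-folder is present:

```python
# Hypothetical sanity check (not part of the repo): verify the dataset
# was extracted into the expected directory structure.
from pathlib import Path

root = Path('data/layoutnet_dataset')
expected = ['finetune_general', 'test', 'train', 'valid']

for name in expected:
    sub = root / name
    status = 'ok' if sub.is_dir() else 'MISSING'
    print(f'{sub}: {status}')
```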
`test`, `train` and `valid` are processed from LayoutNet's cuboid dataset. `finetune_general` is re-annotated by us from `train` and `valid`; it contains 65 general shaped rooms.

Please download the pre-trained models here:
- `resnet50_rnn__panos2d3d.pth`
- `resnet50_rnn__st3d.pth`
- `resnet50_rnn__zind.pth` (trained on the Zillow Indoor Dataset subset where `layout_visible`, `is_primary`, `is_inside` and `is_ceiling_flat` are all true)
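If you just want to confirm a downloaded checkpoint is intact before running inference, you can open it with plain PyTorch. This is only a sketch; it assumes a standard `torch.save` file and makes no assumption about which keys it contains:

```python
# Minimal sketch: inspect a downloaded checkpoint with plain PyTorch.
# Assumes a standard torch.save() file; the exact keys inside are not
# guaranteed by this README.
import torch

ckpt = torch.load('ckpt/resnet50_rnn__panos2d3d.pth', map_location='cpu')
if isinstance(ckpt, dict):
    print('top-level keys:', list(ckpt.keys()))
else:
    print('loaded object of type:', type(ckpt))
```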
In the explanation below, I will use `assets/demo.png` as the example.
Pre-process `assets/demo.png` by running the command below:

```bash
python preprocess.py --img_glob assets/demo.png --output_dir assets/preprocessed/
```
- `--img_glob`: path to your 360 room image(s). You can use shell-style wildcards with quotes (e.g. `"my_fasinated_img_dir/*png"`).
- `--output_dir`: path to the directory for dumping the results.

Run `python preprocess.py -h` for more detailed script usage help.

Under the given `--output_dir`, you will get results like below, prefixed with the source image basename.
- Aligned rgb image `[SOURCE BASENAME]_aligned_rgb.png` and line segments image `[SOURCE BASENAME]_aligned_line.png` (for the demo: `demo_aligned_rgb.png` and `demo_aligned_line.png`).
- Detected vanishing points `[SOURCE BASENAME]_VP.txt` (here `demo_VP.txt`):

```
-0.002278 -0.500449 0.865763
0.000895 0.865764 0.500452
0.999999 -0.001137 0.000178
```
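Each row of the VP file is one vanishing-point direction. As a rough sanity check (my reading of the format, not something documented here), the three rows should be close to unit length and roughly mutually orthogonal:

```python
# Sketch: load demo_VP.txt and check the three vanishing-point directions.
# The interpretation (one unit direction vector per row) is my assumption.
import numpy as np

vp = np.loadtxt('assets/preprocessed/demo_VP.txt')   # shape (3, 3)
print('row norms    :', np.linalg.norm(vp, axis=1))
print('pairwise dots:', [float(vp[i] @ vp[j]) for i, j in [(0, 1), (0, 2), (1, 2)]])
```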
Run inference on the pre-processed image:

```bash
python inference.py --pth ckpt/resnet50_rnn__mp3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize
```
- `--pth`: path to the trained model.
- `--img_glob`: path to the preprocessed image.
- `--output_dir`: path to the directory to dump results.
- `--visualize`: optional; visualize the model's raw outputs.
- `--force_cuboid`: add this option if you want to estimate a cuboid layout (4 walls).

In the given `--output_dir` you will get:
- `[SOURCE BASENAME].raw.png`: visualization of the model's raw outputs (generated when `--visualize` is given).
- `[SOURCE BASENAME].json`: the estimated layout, for example:

```json
{"z0": 50.0, "z1": -59.03114700317383, "uv": [[0.029913906008005142, 0.2996523082256317], [0.029913906008005142, 0.7240479588508606], [0.015625, 0.3819984495639801], [0.015625, 0.6348703503608704], [0.056027885526418686, 0.3881891965866089], [0.056027885526418686, 0.6278984546661377], [0.4480381906032562, 0.3970482349395752], [0.4480381906032562, 0.6178648471832275], [0.5995567440986633, 0.41122356057167053], [0.5995567440986633, 0.601679801940918], [0.8094607591629028, 0.36505699157714844], [0.8094607591629028, 0.6537724137306213], [0.8815288543701172, 0.2661873996257782], [0.8815288543701172, 0.7582473754882812], [0.9189453125, 0.31678876280784607], [0.9189453125, 0.7060701847076416]]}
```
Visualize the estimated layout in 3D with the layout viewer:

```bash
python layout_viewer.py --img assets/preprocessed/demo_aligned_rgb.png --layout assets/inferenced/demo_aligned_rgb.json --ignore_ceiling
```
- `--img`: path to the preprocessed image.
- `--layout`: path to the json output from `inference.py`.
- `--ignore_ceiling`: prevent showing the ceiling.

Run `python layout_viewer.py -h` for usage help.

To train on your own dataset, see the tutorial on how to prepare it.
To train on a dataset, see `python train.py -h` for a detailed explanation of the options.

Example:

```bash
python train.py --id resnet50_rnn
```
- `--id`: required. Experiment id used to name checkpoints and logs.
- `--ckpt`: folder to output checkpoints (default: `./ckpt`).
- `--logs`: folder for logging (default: `./logs`).
- `--pth`: finetune mode if given; path of the saved checkpoint to load.
- `--backbone`: backbone of the network (default: `resnet50`). Choices: `{resnet18,resnet34,resnet50,resnet101,resnet152,resnext50_32x4d,resnext101_32x8d,densenet121,densenet169,densenet161,densenet201}`.
- `--no_rnn`: whether to remove the rnn (default: False).
- `--train_root_dir`: root directory of the training dataset (default: `data/layoutnet_dataset/train`).
- `--valid_root_dir`: root directory of the validation dataset (default: `data/layoutnet_dataset/valid/`). The best model on the validation set will be saved as `{ckpt}/{id}/best_valid.pth`.
- `--batch_size_train`: training mini-batch size (default: 4).
- `--epochs`: epochs to train (default: 300).
- `--lr`: learning rate (default: 0.0001).
- `--device`: set the CUDA device by id (not to be used if `--multi_gpu` is used).
- `--multi_gpu`: enable parallel computing on all available GPUs.

To evaluate on the PanoContext/Stanford2D3D dataset, first run the cuboid-trained model on all testing images:
```bash
python inference.py --pth ckpt/resnet50_rnn__panos2d3d.pth --img_glob "data/layoutnet_dataset/test/img/*" --output_dir output/panos2d3d/resnet50_rnn/ --force_cuboid
```
- `--img_glob`: shell-style wildcards for all testing images.
- `--output_dir`: path to the directory to dump results.
- `--force_cuboid`: enforce a cuboid layout output (4 walls); otherwise the PE and CE metrics cannot be evaluated.

To get the quantitative results:

```bash
python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/*txt"
```
- `--dt_glob`: shell-style wildcards for all the model estimations.
- `--gt_glob`: shell-style wildcards for all the ground truth.

If you want to evaluate on PanoContext images only:

```bash
python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/pano*txt"
```

Or on Stanford2D3D images only:

```bash
python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/camera*txt"
```
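Before running the evaluation it can help to confirm that every ground-truth file has a matching prediction. The sketch below pairs them by file stem; that naming convention is my assumption here, so adjust the matching rule if your file names differ:

```python
# Sketch: pair predicted layouts with ground-truth files by file stem.
# The one-json-per-gt-txt naming convention is assumed, not documented here.
import glob
import os

dt_files = {os.path.splitext(os.path.basename(p))[0]: p
            for p in glob.glob('output/panos2d3d/resnet50_rnn/*.json')}
gt_files = {os.path.splitext(os.path.basename(p))[0]: p
            for p in glob.glob('data/layoutnet_dataset/test/label_cor/*.txt')}

missing = sorted(set(gt_files) - set(dt_files))
print(f'{len(dt_files)} predictions, {len(gt_files)} ground truths, '
      f'{len(missing)} ground truths without a prediction')
```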
:clipboard: The quantitative results for the released `resnet50_rnn__panos2d3d.pth` are shown below:

| Testing Dataset | 3D IoU (%) | Corner error (%) | Pixel error (%) |
|---|---|---|---|
| PanoContext | 83.39 | 0.76 | 2.13 |
| Stanford2D3D | 84.09 | 0.63 | 2.06 |
| All | 83.87 | 0.67 | 2.08 |
```bibtex
@inproceedings{SunHSC19,
  author    = {Cheng Sun and
               Chi{-}Wei Hsiao and
               Min Sun and
               Hwann{-}Tzong Chen},
  title     = {HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch Data Augmentation},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition, {CVPR} 2019, Long Beach, CA, USA, June 16-20, 2019},
  pages     = {1047--1056},
  year      = {2019},
}
```