Yingshu Chen¹, Tuan-Anh Vu¹, Ka-Chun Shum¹, Binh-Son Hua², Sai-Kit Yeung¹
¹The Hong Kong University of Science and Technology, ²VinAI Research
Abstract: Architectural photography is a genre of photography that focuses on capturing a building or structure in the foreground with dramatic lighting in the background. Inspired by recent successes in image-to-image translation methods, we aim to perform style transfer for architectural photographs. However, the special composition in architectural photography poses great challenges for style transfer in this type of photograph. Existing neural style transfer methods treat an architectural image as a single entity, which can produce mismatched chrominance and destroy geometric features of the original architecture, yielding unrealistic lighting, wrong color rendition, and visual artifacts such as ghosting, appearance distortion, or color mismatching. In this paper, we specialize a neural style transfer method for architectural photography. Our method addresses the composition of the foreground and background in an architectural photograph with a two-branch neural network that handles the style transfer of the foreground and the background separately. Our method comprises a segmentation module, a learning-based image-to-image translation module, and an image blending optimization module. We trained our image-to-image translation neural network with a new dataset of unconstrained outdoor architectural photographs captured at different magic times of day, utilizing additional semantic information for better chrominance matching and geometry preservation. Our experiments show that our method can produce photorealistic lighting and color rendition on both the foreground and background, and outperforms general image-to-image translation and arbitrary style transfer baselines quantitatively and qualitatively.
Tested with:
Others:
git clone https://github.com/hkust-vgd/architectural_style_transfer.git
cd architectural_style_transfer/translation
Download the pretrained models to translation/checkpoints:
bash checkpoints/download_models.sh
bash test_script.sh
A segmentation map contains only two labels: white for foreground and black for background (i.e., sky). See samples in translation/inputs/masks and translation/training_samples/masks.
You can manually label the sky as background and everything else as foreground.
At test time, manually labeling the input source image is recommended for better blended results.
We used a pretrained model (ResNet50dilated + PPM_deepsup) to label the sky background for the training and evaluation data, as described in the paper.
Please access this repository for details.
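As a rough illustration of the two-label convention described above, the following sketch binarizes a grayscale mask so that every pixel is either 255 (white, foreground) or 0 (black, sky background). The function names, the threshold of 128, and the nested-list representation of the mask are assumptions made for illustration; none of them come from this repository.

```python
def binarize_mask(gray_rows, threshold=128):
    """Map each grayscale value to 255 (foreground) or 0 (sky background).

    gray_rows: list of rows, each a list of ints in [0, 255].
    threshold: hypothetical cut-off; values >= threshold count as foreground.
    """
    return [[255 if v >= threshold else 0 for v in row] for row in gray_rows]


def has_only_two_labels(mask_rows):
    """Return True if the mask holds no values other than 0 and 255."""
    return all(v in (0, 255) for row in mask_rows for v in row)


if __name__ == "__main__":
    # A tiny 2x3 mask with soft edge values, e.g. left over from anti-aliasing.
    soft = [[0, 40, 200], [255, 130, 10]]
    hard = binarize_mask(soft)
    print(hard)
    print(has_only_two_labels(hard))
```

A check like `has_only_two_labels` can be handy after hand-labeling, since soft brush edges in an image editor tend to leave intermediate gray values in the mask.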
Make sure the pretrained models are in translation/checkpoints.
Put test images in <TEST_ROOT>/day and <TEST_ROOT>/<TARGET_CLASS>.
Put segmentation masks in <MASK_ROOT>/day and <MASK_ROOT>/<TARGET_CLASS>.
CUDA_VISIBLE_DEVICES=1 python test.py \
--test_root <TEST_ROOT_DIR> \
--mask_root <MASK_ROOT_DIR> \
-a day \
-b <TARGET_CLASS> \
--output_path results \
--config_fg checkpoints/config_day2golden_fg.yaml \
--config_bg checkpoints/config_day2golden_bg.yaml \
--checkpoint_fg checkpoints/gen_day2golden_fg.pt \
--checkpoint_bg checkpoints/gen_day2golden_bg.pt \
--new_size <NEW_SIZE> \
--opt
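Before running the test command, it may help to confirm that every source image has a corresponding mask. The sketch below lists images under `<TEST_ROOT>/day` that have no mask under `<MASK_ROOT>/day`. It assumes that a mask shares its file stem with its image and that images use one of the listed extensions; both are guesses for illustration, not conventions confirmed by this repository, and `find_missing_masks` is a hypothetical helper.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_missing_masks(test_root, mask_root, class_name="day"):
    """Return names of images in <test_root>/<class_name> that lack a mask.

    Hypothetical convention: a mask has the same file stem as its image,
    e.g. day/0001.jpg pairs with day/0001.png under the mask root.
    """
    image_dir = Path(test_root) / class_name
    mask_dir = Path(mask_root) / class_name
    mask_stems = {p.stem for p in mask_dir.iterdir() if p.is_file()}
    return sorted(
        p.name
        for p in image_dir.iterdir()
        if p.is_file()
        and p.suffix.lower() in IMAGE_EXTS
        and p.stem not in mask_stems
    )
```

For example, `find_missing_masks("inputs/images", "inputs/masks")` would return an empty list when every image in the `day` folder is paired with a mask.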
You can view the results as HTML by running:
python gen_html.py -i ./results
Training was tested on a single NVIDIA GeForce RTX 2080 Ti (11GB memory) at 256x256 resolution, and on a single NVIDIA GeForce RTX 3090 (24GB memory) at 512x512 resolution (batch=1).
Training data is organized by style class, e.g., day and golden.
python mask_images.py \
--img_dir training_samples/<CLASS_NAME> \
--mask_dir training_samples/masks/<CLASS_NAME> \
--class_name <CLASS_NAME> \
--output_dir training_samples \
--kernel_size 0
Training configurations are set in translation/configs/XXXX.yaml.
CUDA_VISIBLE_DEVICES=0 python train.py \
--config configs/day2golden_fg.yaml \
--save_name day2golden_fg
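The command above trains the foreground branch. Because the method uses separate foreground and background branches, a background model presumably needs training as well. The sketch below builds the training command for either branch. It assumes a background config at `configs/day2golden_bg.yaml` that mirrors the foreground one; that path is a guess based on the checkpoint names used at test time and is not confirmed here. `build_train_command` is a hypothetical helper, not part of the repository.

```python
import shlex


def build_train_command(pair="day2golden", branch="fg"):
    """Build the train.py argument list for one branch.

    branch: "fg" (foreground) or "bg" (background). The "bg" config path
    is an assumption; check translation/configs for the actual file name.
    """
    if branch not in ("fg", "bg"):
        raise ValueError("branch must be 'fg' or 'bg'")
    name = f"{pair}_{branch}"
    return [
        "python", "train.py",
        "--config", f"configs/{name}.yaml",
        "--save_name", name,
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be inspected.
    for branch in ("fg", "bg"):
        print(shlex.join(build_train_command(branch=branch)))
```

Printing the commands first keeps the sketch side-effect free; they can then be passed to `subprocess.run` from the `translation` directory once the config paths are verified.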
You can also run the blending optimization on its own, after image translation, using the translated results. For example:
cd optimization
python blend_opt.py \
--blended_dir ../translation/results/ \
--src_dir ../translation/inputs/images/day/ \
--mask_dir ../translation/inputs/masks/day \
--origin_res
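For intuition about why a blending step is needed, the sketch below shows the naive alternative: plain per-pixel compositing of a separately translated foreground and background using the segmentation mask. This is not the optimization performed by `blend_opt.py`; it is ordinary alpha compositing on single-channel values, and the function name and list-based image representation are assumptions for illustration only.

```python
def composite(fg_rows, bg_rows, mask_rows):
    """Per-pixel blend: out = a * fg + (1 - a) * bg, with a = mask / 255.

    fg_rows, bg_rows: translated foreground and background, as lists of
    rows of ints in [0, 255] (one channel, for brevity).
    mask_rows: segmentation mask, 255 for foreground and 0 for sky.
    """
    out = []
    for fg_row, bg_row, m_row in zip(fg_rows, bg_rows, mask_rows):
        out_row = []
        for fg, bg, m in zip(fg_row, bg_row, m_row):
            alpha = m / 255.0
            out_row.append(round(alpha * fg + (1.0 - alpha) * bg))
        out.append(out_row)
    return out


if __name__ == "__main__":
    fg = [[100, 100]]
    bg = [[20, 20]]
    mask = [[255, 0]]  # left pixel is building, right pixel is sky
    print(composite(fg, bg, mask))
```

With a hard binary mask, each output pixel is copied from exactly one branch, so any mismatch between the two translations shows up as a visible seam along the mask boundary.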
The Time-lapse Architectural Style Transfer dataset is released for non-commercial use only.
The dataset is manually classified into four classes of time-of-day styles: day, golden, blue, and night.
Training set: Access to the training data (7.4GB) requires filling in a request form.
Evaluation set:
The evaluation set contains 1,003 images in four time styles.
Evaluation set used in the paper: Download Link (550MB).
If you want the evaluation images in their original high resolution with source information, please download the data here: Download Link (2.2GB). Please check the original image sources for other uses (e.g., commercial use).
Segmentation maps:
Please refer to Data Segmentation Processing for data processing details. For inference, manual labeling is recommended.
You can download labeled training and evaluation segmentation maps used in the paper: Training maps download link (224MB), Evaluation maps download link (11MB).
If you find our work or data useful in your research, please consider citing:
@inproceedings{chen2022timeofday,
title={Time-of-Day Neural Style Transfer for Architectural Photographs},
author={Chen, Yingshu and Vu, Tuan-Anh and Shum, Ka-Chun and Hua, Binh-Son and Yeung, Sai-Kit},
booktitle={International Conference on Computational Photography (ICCP)},
year={2022},
organization={IEEE}
}
GitHub issues are welcome. You can also send an email to yingshu2008[AT]gmail[DOT]com.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.