
Official implementation for our NeurIPS 2023 paper β€œLookup Table meets Local Laplacian Filter: Pyramid Reconstruction Network for Tone Mapping”
MIT License

Lookup Table meets Local Laplacian Filter: Pyramid Reconstruction Network for Tone Mapping (NeurIPS 2023)

Technical Report: [arXiv:2310.17190](https://arxiv.org/abs/2310.17190)

πŸš€πŸš€ Welcome to the repo of LLF-LUT πŸš€πŸš€

LLF-LUT is an effective end-to-end framework for HDR image tone mapping that performs global tone manipulation while preserving local edge details. Specifically, we build a lightweight transformer weight predictor on the bottom (low-frequency) level of the Laplacian pyramid to predict pixel-level, content-dependent weight maps. The input HDR image is trilinearly interpolated through the basis 3D LUTs and then multiplied by the weight maps to generate a coarse LDR image. To preserve local edge details and faithfully reconstruct the image from the Laplacian pyramid, we propose an image-adaptive, learnable local Laplacian filter (LLF) that refines the high-frequency components while minimizing the use of computationally expensive convolutions on the high-resolution components for efficiency.
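To make the LUT step concrete, here is a minimal NumPy sketch (not the paper's implementation) of trilinear interpolation through a 3D LUT and of blending several basis LUT outputs with predicted per-pixel weight maps; the function names and shapes are illustrative assumptions:

```python
import numpy as np

def apply_3d_lut(img, lut):
    """Trilinearly interpolate an HxWx3 image (values in [0, 1])
    through a DxDxDx3 lookup table."""
    d = lut.shape[0]
    x = img * (d - 1)                   # continuous LUT coordinates
    lo = np.floor(x).astype(int)
    hi = np.minimum(lo + 1, d - 1)
    f = x - lo                          # fractional offset per channel
    out = np.zeros_like(img)
    # Accumulate contributions from the 8 corners of the enclosing cell.
    for cr in (0, 1):
        for cg in (0, 1):
            for cb in (0, 1):
                r = hi[..., 0] if cr else lo[..., 0]
                g = hi[..., 1] if cg else lo[..., 1]
                b = hi[..., 2] if cb else lo[..., 2]
                w = ((f[..., 0] if cr else 1 - f[..., 0])
                     * (f[..., 1] if cg else 1 - f[..., 1])
                     * (f[..., 2] if cb else 1 - f[..., 2]))
                out += w[..., None] * lut[r, g, b]
    return out

def blend_with_weights(img, basis_luts, weight_maps):
    """Combine N basis LUT outputs using HxWxN per-pixel weight maps."""
    outputs = [apply_3d_lut(img, lut) for lut in basis_luts]
    return sum(w[..., None] * o
               for w, o in zip(np.moveaxis(weight_maps, -1, 0), outputs))
```

In LLF-LUT the weight maps come from the transformer weight predictor; here they are just arrays, and the basis LUTs would be learned parameters rather than fixed tables.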

πŸ›„πŸ›„ Disclaimer πŸ›„πŸ›„

The disparities between the CLUT results in our study and those in the original paper can be attributed to differences in the underlying tasks. Specifically, our study focuses on transforming 16-bit High Dynamic Range (HDR) images into 8-bit Low Dynamic Range (LDR) images, whereas the original paper primarily addressed 8-bit to 8-bit image enhancement. Furthermore, CLUT's parameter count is 952K in our paper because sLUT is used as the backbone for CLUT; when the backbone is changed to LUT, the parameter count drops to 292K.

🌟 Structure

The model architecture of LLF-LUT is shown below. Given an input 16-bit HDR image, we first decompose it into an adaptive Laplacian pyramid, producing a set of high-frequency components and a low-frequency image. The adaptive Laplacian pyramid dynamically adjusts the number of decomposition levels to the resolution of the input image, ensuring that the low-frequency image has a resolution close to 64 × 64. The decomposition is invertible, so the original image can be reconstructed by incrementally adding the components back.

*(Figure: overall architecture of LLF-LUT)*
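The adaptive decomposition and its inverse can be sketched in a few lines of NumPy. This is a simplified stand-in, assuming 2×2 average pooling and nearest-neighbour upsampling instead of the Gaussian filtering of a classical pyramid; because each high-frequency component stores the exact residual, reconstruction is still exact:

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling (pad to even size first); a simple stand-in
    # for the Gaussian filtering used in a classical Laplacian pyramid.
    h, w = img.shape[:2]
    img = np.pad(img, ((0, h % 2), (0, w % 2), (0, 0)), mode='edge')
    return img.reshape(img.shape[0] // 2, 2,
                       img.shape[1] // 2, 2, -1).mean(axis=(1, 3))

def upsample(img, shape):
    # Nearest-neighbour upsampling, cropped back to the target resolution.
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def build_pyramid(img, target=64):
    """Decompose until the low-frequency image is about target x target."""
    highs = []
    while min(img.shape[:2]) > target:
        low = downsample(img)
        highs.append(img - upsample(low, img.shape))  # high-frequency residual
        img = low
    return highs, img

def reconstruct(highs, low):
    """Invert the decomposition by incrementally adding the residuals back."""
    for high in reversed(highs):
        low = upsample(low, high.shape) + high
    return low
```

In LLF-LUT the low-frequency image is what feeds the weight predictor and 3D LUTs, while the learnable LLF refines the `highs`; here both are left untouched to show only the invertible pyramid structure.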

:bookmark_tabs: Installation

Download the HDR+ dataset and the MIT-Adobe FiveK dataset from the following links:

HDR+ (Original Size, 4K): download (37 GB), Baiduyun (code: vcha); (480p) download (1.38 GB)

MIT-Adobe FiveK (Original Size, 4K): download (50 GB), Baiduyun (code: a9av); (480p) download (12.51 GB)

:car: Run

The code and the checkpoints will be released soon.

:book: Citation

If you find our LLF-LUT model useful, please consider citing :mega:

@misc{zhang2023lookup,
      title={Lookup Table meets Local Laplacian Filter: Pyramid Reconstruction Network for Tone Mapping}, 
      author={Feng Zhang and Ming Tian and Zhiqiang Li and Bin Xu and Qingbo Lu and Changxin Gao and Nong Sang},
      year={2023},
      eprint={2310.17190},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

:email: Contact

If you have any questions, feel free to email fengzhangaia@hust.edu.cn.