## Material Palette: Extraction of Materials from a Single Image (CVPR 2024)
Ivan Lopes¹   Fabio Pizzati²   Raoul de Charette¹
¹ Inria, ² Oxford Uni.

[![Project page](https://img.shields.io/badge/🚀_Project_Page-_-darkgreen?style=flat-square)](https://astra-vision.github.io/MaterialPalette/) [![paper](https://img.shields.io/badge/paper-_-darkgreen?style=flat-square)](https://github.com/astra-vision/MaterialPalette/releases/download/preprint/material_palette.pdf) [![cvf](https://img.shields.io/badge/CVF-_-darkgreen?style=flat-square)](https://openaccess.thecvf.com/content/CVPR2024/html/Lopes_Material_Palette_Extraction_of_Materials_from_a_Single_Image_CVPR_2024_paper.html) [![dataset](https://img.shields.io/badge/🤗_dataset--darkgreen?style=flat-square)](https://huggingface.co/datasets/ilopes/texsd) [![star](https://img.shields.io/badge/⭐_star--darkgreen?style=flat-square)](https://github.com/astra-vision/MaterialPalette/stargazers)

TL;DR: Material Palette extracts a palette of PBR materials - albedo, normals, and roughness - from a single real-world image.

https://github.com/astra-vision/MaterialPalette/assets/30524163/44e45e58-7c7d-49a3-8b6e-ec6b99cf9c62


## Overview

This is the official repository of Material Palette. In a nutshell, the method works in three stages: first, concepts are extracted from an input image based on a user-provided mask; then, those concepts are used to generate texture images; finally, the generations are decomposed into SVBRDF maps (albedo, normals, and roughness). Visit our project page or consult our paper for more details!

pipeline

Content: This repository supports extracting texture concepts from an image and a set of region masks, generating textures at different resolutions, and decomposing the generations into SVBRDF maps (albedo, normals, and roughness) with our decomposition model, for which we share the pre-trained weights.

> [!TIP]
> We provide a "Quick Start" section: before diving straight into the full pipeline, we share four pretrained concepts ⚡ so you can experiment right away with the texture generation step of the method, see "§ Generation". You can then try the full method on your own image and masks, i.e. concept learning + generation + decomposition, see "§ Complete Pipeline".

## 1. Installation

  1. Download the source code with git

    git clone https://github.com/astra-vision/MaterialPalette.git

    The repo can also be downloaded as a zip here.

  2. Create a conda environment with the dependencies.

    conda env create --verbose -f deps.yml

    This repo was tested with Python 3.10.8, PyTorch 1.13, diffusers 0.19.3, peft 0.5, and PyTorch Lightning 1.8.3.

  3. Activate the conda environment:

    conda activate matpal
  4. If you are looking to perform decomposition, download our pre-trained model and untar the archive:

    wget https://github.com/astra-vision/MaterialPalette/releases/download/weights/model.tar.gz
    tar -xzf model.tar.gz

    This step is not required if you only want to perform texture extraction.

## 2. Quick start

Here are instructions to get you started with Material Palette. First, we provide some optimized concepts so you can experiment with the generation pipeline. We then show how to run the method on user-selected images and masks (concept learning + generation + decomposition).

### § Generation

(Table: four example concepts, each shown with its input image, generations at 1K, 2K, 4K, and 8K resolution, and a downloadable ⬇️ LoRA checkpoint of ~8 KB.)

All generations shown were downscaled due to memory constraints.

Go ahead and download one of the above LoRA concept checkpoints, for example "blue_tiles":

wget https://github.com/astra-vision/MaterialPalette/files/14601640/blue_tiles.zip;
unzip blue_tiles.zip

To generate from a checkpoint, use the concept module, either via the command-line interface or the functional interface in Python:
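
For example, from the command line (assuming the checkpoint directory is passed as a positional argument; run python concept/infer.py --help for the authoritative interface):

    python concept/infer.py path/to/blue_tiles/

or from Python (a sketch only; verify the function name and signature in concept/infer.py):

    from pathlib import Path
    import concept

    # Generate textures from the learned concept checkpoint.
    concept.infer(Path('path/to/blue_tiles/'))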

Results will be placed in an outputs/ folder relative to the checkpoint directory.

You have control over several generation parameters; the complete list can be viewed with python concept/infer.py --help.

### § Complete Pipeline

We provide an example (the input image and user masks used for the pipeline figure). You can download it here: mansion.zip (photograph credit: Max Rahubovskiy).

To help you get started with your own images, follow this simple data structure: one folder per image to invert, containing the input image (.jpg, .jpeg, or .png) and a subdirectory named masks/ with the different region masks as .png files (these must all have the same aspect ratio as the RGB image). Here is an overview of our mansion example:

    ├── masks/
    │   ├── wood.png
    │   ├── grass.png
    │   └── stone.png
    └── mansion.jpg
(Table: for each region mask - identified by colors #6C8EBF, #EDB01A, and #AA4A44 - the example shows the mask, its overlay on the input, a generated texture, and the decomposed albedo, normals, and roughness maps.)

To invert and generate textures from a folder, use pipeline.py:
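
For example, assuming the image folder is passed as the main positional argument (check pipeline.py for the exact options):

    python pipeline.py path/to/mansion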

Under the hood, it uses two modules:

  1. concept, to extract and generate the texture (concept.crop, concept.invert, and concept.infer);
  2. capture, to perform the BRDF decomposition.

A minimal example is provided here:
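
The sketch below assumes the helpers listed above (concept.crop, concept.invert, concept.infer) can be called roughly as shown; treat it as an illustration of the stages rather than the exact code in pipeline.py:

    from pathlib import Path
    import concept

    path = Path('path/to/mansion')

    concept.crop(path)                 # extract square crops from the image and its region masks
    checkpoint = concept.invert(path)  # learn one concept (LoRA) per region
    concept.infer(checkpoint)          # generate textures from the learned concepts

    # The capture module is then used to decompose the generations into
    # albedo, normals, and roughness maps (requires the pre-trained weights).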

> [!IMPORTANT]
> By default, both train_text_encoder and gradient_checkpointing are set to True. Also, this implementation does not include the LPIPS filter/ranking of the generations. The code will only output a single sample per region. You may experiment with different prompts and parameters (see the "§ Generation" section).

## 3. Project structure

The pipeline.py file is the entry point for running the whole pipeline on a folder containing the input image at its root and a masks/ sub-directory with all user-defined masks. The train.py file is used to train the decomposition model. The most important files are shown here:

    .
    ├── capture/          % Module for decomposition
    │   ├── callbacks/    % Lightning trainer callbacks
    │   ├── data/         % Dataset, subsets, Lightning datamodules
    │   ├── render/       % 2D physics based renderer
    │   ├── utils/        % Utility functions
    │   └── source/       % Network, loss, and LightningModule
    │       └── routine.py  % Training loop
    │
    └── concept/          % Module for inversion and texture generation
        ├── crop.py       % Square crop extraction from image and masks
        ├── invert.py     % Optimization code to learn the concept S*
        └── infer.py      % Inference code to generate texture from S*

If you have any questions, post via the issues tracker or contact the corresponding author.

## 4. (optional) Training

We provide the pre-trained decomposition weights (see "Installation"). However, if you are looking to retrain the domain-adaptive model for your own purposes, we provide the code to do so. Our method relies on jointly training a multi-task network on labeled (real) and unlabeled (synthetic) images. If you wish to retrain on the same datasets, you will have to download both the AmbientCG and TexSD datasets.

First download the PBR materials (source) dataset from AmbientCG:

python capture/data/download.py path/to/target/directory
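
The TexSD (target) dataset is hosted on Hugging Face (see the dataset badge above). One way to fetch it is with the huggingface_hub client; the local directory below is only an example, so place it wherever your training configuration expects:

    # Download the TexSD dataset snapshot from the Hugging Face Hub.
    from huggingface_hub import snapshot_download
    snapshot_download(repo_id="ilopes/texsd", repo_type="dataset",
                      local_dir="path/to/target/directory/texsd")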

To run the training script, use:

python train.py --config=path/to/yml/config

Additional options can be found with python train.py --help.

> [!NOTE]
> The decomposition model estimates pixel-wise BRDF maps (albedo, normals, roughness) from a single texture image input.

## Acknowledgments

This research project was mainly funded by the French Agence Nationale de la Recherche (ANR) as part of project SIGHT (ANR-20-CE23-0016). Fabio Pizzati was partially funded by KAUST (Grant DFR07910). Results were obtained using HPC resources from GENCI-IDRIS (Grant 2023-AD011014389).

The repository contains code taken from PEFT, SVBRDF-Estimation, and DenseMTL. For visualization, we used DeepBump and Blender. Credit to Runway for providing the stable-diffusion-v1-5 model weights. All images and 3D scenes used in this work have permissive licenses. Special credit to AmbientCG for their tremendous work.

The authors would also like to thank all members of the Astra-Vision team for their valuable feedback.

## License

If you find this code useful, please cite our paper:

    @inproceedings{lopes2024material,
        author = {Lopes, Ivan and Pizzati, Fabio and de Charette, Raoul},
        title = {Material Palette: Extraction of Materials from a Single Image},
        booktitle = {CVPR},
        year = {2024},
        project = {https://astra-vision.github.io/MaterialPalette/}
    }

Material Palette is released under MIT License.

