NJUVISION / rho-vision

ρ-Vision: Efficient Visual Computing with Camera RAW Snapshots
https://njuvision.github.io/rho-vision
MIT License

[T-PAMI 2024] Efficient Visual Computing with Camera RAW Snapshots

We propose $\rho$-Vision, a novel framework that performs high-level semantic understanding and low-level compression directly on RAW images. The framework is demonstrated to provide better detection accuracy and compression than its RGB-domain counterparts, and is shown to generalize across different camera sensors and task-specific models. It also has the potential to reduce ISP computation and processing time.

In this repo, we release the code and pretrained models for Unpaired CycleR2R. With Unpaired CycleR2R, you can train your RAW-domain model on diverse, realistic simulated RAW images and then deploy it directly in the real world.

Requirements

pip install -r requirements.txt

Datasets

(required) Download the MultiRAW LUCID subset (passwd: x2un).

(optional) Download BDD100K.

(optional) Download Cityscapes.

(optional) Download Flicker2W.

The datasets folder should be laid out as follows:

datasets
 ├─multiRAW
 │  ├─iphone_xsmax
 │  ├─huawei_p30pro
 │  ├─asi_294mcpro
 │  └─oneplus_5t
 ├─(optional) bdd100k
 ├─(optional) cityscapes
 └─(optional) flicker
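
Before training, it can help to confirm that the folders are where the layout above expects them. The following is a minimal sketch, not part of the released code: the function name `check_datasets` is hypothetical, and it assumes that every multiRAW subfolder shown in the tree is wanted. Trim the `EXPECTED` list if you only downloaded one camera.

```python
from pathlib import Path

# Folder names copied from the layout above.
# Assumption: all four multiRAW camera folders are expected.
EXPECTED = [
    "multiRAW/iphone_xsmax",
    "multiRAW/huawei_p30pro",
    "multiRAW/asi_294mcpro",
    "multiRAW/oneplus_5t",
]
OPTIONAL = ["bdd100k", "cityscapes", "flicker"]


def check_datasets(root="datasets"):
    """Return (missing_expected, missing_optional) as lists of relative paths."""
    root = Path(root)
    missing_expected = [p for p in EXPECTED if not (root / p).is_dir()]
    missing_optional = [p for p in OPTIONAL if not (root / p).is_dir()]
    return missing_expected, missing_optional


if __name__ == "__main__":
    missing_expected, missing_optional = check_datasets()
    for name in missing_expected:
        print(f"missing: datasets/{name}")
    for name in missing_optional:
        print(f"not found (optional): datasets/{name}")
```

Run it from the directory that contains the datasets folder; it only checks that the folders exist, not what they contain.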

Pretrained Models

Source RGB    Target RAW      Model
BDD100K       iPhone XSmax    link
BDD100K       Huawei P30pro   link
BDD100K       asi 294mcpro    link
BDD100K       Oneplus 5t      link
Cityscapes    iPhone XSmax    link
Flicker2W     iPhone XSmax    link
Flicker2W     Huawei P30pro   link
Flicker2W     asi 294mcpro    link
Flicker2W     Oneplus 5t      link

Training

python train.py configs/unpaired_cycler2r/unpaired_cycler2r_in_bdd100k_rgb2iphone_raw_20k.py

Inference

Please download the pretrained model first.

You can run inference from the command line:

python inference.py --ckpt bdd100k_rgb_to_iphone_raw.pth --rgb resources/bdd100k.jpg

or from your own code:

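import numpy as np
import torch
# Assumption: the original snippet does not say where imread comes from. Any reader
# that returns an H x W x 3 RGB uint8 array should work; skimage is used here.
from skimage.io import imread
from inference import DemoModel

ckpt_path = 'bdd100k_rgb_to_iphone_raw.pth'
rgb_path = 'resources/bdd100k.jpg'

model = DemoModel(ckpt_path)
# Scale to [0, 1] and reshape H x W x C -> 1 x C x H x W.
rgb = imread(rgb_path).astype(np.float32) / 255
rgb = torch.from_numpy(rgb).permute(2, 0, 1)[None]

model = model.cuda()
rgb = rgb.cuda()

# Convert the RGB input to a simulated RAW image.
raw = model(rgb, mosaic=False)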
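
The `mosaic` argument suggests that the model can return either a three-channel RAW image or a Bayer-mosaiced one, but the README does not document it. To show what mosaicing means in general, here is a minimal NumPy sketch. It is not the repository's implementation: the function name `mosaic_rggb`, the RGGB pattern, and the (3, H, W) channel order are all assumptions.

```python
import numpy as np


def mosaic_rggb(raw):
    """Sample a (3, H, W) R/G/B array into a single-channel (H, W) Bayer mosaic.

    Assumption: RGGB layout, i.e. each 2x2 tile is
        R G
        G B
    """
    c, h, w = raw.shape
    if c != 3 or h % 2 or w % 2:
        raise ValueError("expected shape (3, H, W) with even H and W")
    r, g, b = raw
    bayer = np.empty((h, w), dtype=raw.dtype)
    bayer[0::2, 0::2] = r[0::2, 0::2]  # red at even rows, even columns
    bayer[0::2, 1::2] = g[0::2, 1::2]  # green at even rows, odd columns
    bayer[1::2, 0::2] = g[1::2, 0::2]  # green at odd rows, even columns
    bayer[1::2, 1::2] = b[1::2, 1::2]  # blue at odd rows, odd columns
    return bayer
```

With a tensor from the snippet above, one would first drop the batch dimension and move it to the CPU, for example `mosaic_rggb(raw[0].cpu().numpy())`. Check the actual sensor pattern of the target camera before relying on RGGB.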

Citation

If you find our dataset and work helpful for your research, please cite our paper.

@article{li2022efficient,
    title={Efficient Visual Computing with Camera RAW Snapshots},
    author={Zhihao Li and Ming Lu and Xu Zhang and Xin Feng and M. Salman Asif and Zhan Ma},
    journal={arXiv preprint arXiv:2212.07778},
    url={https://arxiv.org/pdf/2212.07778.pdf},
    year={2022},
}