XPixelGroup / DepictQA

DepictQA: Depicted Image Quality Assessment with Vision Language Models
Apache License 2.0

🌏 Project Page • 🤗 Demo (coming soon) • 📀 Datasets (Hugging Face / ModelScope)

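Official PyTorch implementation of the papers:

DepictQA-Wild (DepictQA-v2): Descriptive Image Quality Assessment in the Wild (arXiv:2405.18842)

DepictQA-v1: Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models (arXiv:2312.08962)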

Update

📆 [Coming soon] Online demo.

📆 [2024.07] DepictQA datasets were released on Hugging Face / ModelScope.

📆 [2024.07] DepictQA-v1 was accepted to ECCV 2024.

📆 [2024.05] We released DepictQA-Wild (DepictQA-v2): a multi-functional in-the-wild descriptive image quality assessment model.

📆 [2023.12] We released DepictQA-v1, a multi-modal image quality assessment model based on vision language models.

Installation

Models

| Training Data | Tune | Hugging Face | Description |
| --- | --- | --- | --- |
| DQ-495K + Q-Instruct | LoRA | download | Trained on the DQ-495K and Q-Instruct (see paper) datasets. Handles multiple-choice, yes-or-no, what, and how questions, but performs worse on assessment and comparison tasks. |
| DQ-495K + Q-Pathway | LoRA | download | Trained on the DQ-495K and Q-Pathway (see paper) datasets. Performs well on real images, but performs worse on comparison tasks. |
| DQ-495K | LoRA | download | Trained on the DQ-495K dataset. Used in our paper. |

Demos

Online Demo

An online demo deployed on Hugging Face Spaces is coming soon.

Gradio Demo

We provide a Gradio demo for local testing.

You can edit the server configuration in serve.yaml. The demo is served at http://{serve.gradio.host}:{serve.gradio.port}, which is http://0.0.0.0:12345 if serve.yaml is left unchanged.
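As a minimal sketch of how that URL is composed, the snippet below builds it from an already-parsed configuration. The nested layout (`serve` → `gradio` → `host` / `port`) is an assumption inferred from the placeholder names above, and `gradio_url` is a hypothetical helper, not part of this repository.

```python
def gradio_url(config: dict) -> str:
    """Build the demo URL from a parsed serve.yaml-style mapping.

    Assumes the mapping is nested as serve -> gradio -> host/port,
    which is inferred from the README placeholders, not from the code.
    """
    gradio = config["serve"]["gradio"]
    return f"http://{gradio['host']}:{gradio['port']}"


# Hypothetical parsed config mirroring the default values quoted above.
default_config = {"serve": {"gradio": {"host": "0.0.0.0", "port": 12345}}}

print(gradio_url(default_config))  # prints http://0.0.0.0:12345
```

If the demo is not reachable at the printed address, comparing it against the host and port in your own serve.yaml is a reasonable first check.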

Note that multiple workers can be launched simultaneously. For each worker, serve.worker.host, serve.worker.port, serve.worker.worker_url, and serve.worker.model_name should be unique.
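Before launching several workers, it can help to check their settings for collisions. The sketch below is one way to do that, assuming each worker's settings have been collected into a plain dict keyed by the four field names above. `find_shared_values` is a hypothetical helper, and it reads "unique" strictly (no two workers share a value for a field); pass a narrower `fields` tuple if your setup intentionally shares, for example, a host.

```python
from collections import Counter

# Field names taken from the serve.worker.* keys listed above.
WORKER_FIELDS = ("host", "port", "worker_url", "model_name")


def find_shared_values(workers, fields=WORKER_FIELDS):
    """Return {field: values used by more than one worker}.

    An empty dict means no two workers share a value for any checked field.
    """
    shared = {}
    for field in fields:
        counts = Counter(worker[field] for worker in workers)
        duplicates = sorted((v for v, n in counts.items() if n > 1), key=str)
        if duplicates:
            shared[field] = duplicates
    return shared


# Hypothetical worker settings; the second worker reuses a model name.
workers = [
    {"host": "10.0.0.1", "port": 40000,
     "worker_url": "http://10.0.0.1:40000", "model_name": "depictqa-a"},
    {"host": "10.0.0.2", "port": 40001,
     "worker_url": "http://10.0.0.2:40001", "model_name": "depictqa-a"},
]

print(find_shared_values(workers))  # prints {'model_name': ['depictqa-a']}
```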

Datasets

Training

Inference

Inference on Our Benchmark

Inference on Custom Dataset

Evaluation

Acknowledgement

This repository is based on LAMM. Thanks for this awesome work.

BibTeX

If you find our work useful for your research and applications, please cite it using the following BibTeX:

@article{depictqa_v2,
    title={Descriptive Image Quality Assessment in the Wild},
    author={You, Zhiyuan and Gu, Jinjin and Li, Zheyuan and Cai, Xin and Zhu, Kaiwen and Dong, Chao and Xue, Tianfan},
    journal={arXiv preprint arXiv:2405.18842},
    year={2024}
}

@article{depictqa_v1,
    title={Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models},
    author={You, Zhiyuan and Li, Zheyuan and Gu, Jinjin and Yin, Zhenfei and Xue, Tianfan and Dong, Chao},
    journal={arXiv preprint arXiv:2312.08962},
    year={2023}
}