Omkar Thawakar, Abdelrahman Shaker, Sahal Shaji Mullappilly*, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, and Fahad Shahbaz Khan.
*Equal Contribution
Mohamed bin Zayed University of Artificial Intelligence, UAE
You can try our demo using the provided examples or by uploading your own X-ray here: Link-1 | Link-2 | Link-3.
1. Prepare the code and the environment
Clone the repository and create an Anaconda environment:
git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda env create -f env.yml
conda activate xraygpt
OR
git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda create -n xraygpt python=3.9
conda activate xraygpt
pip install -r xraygpt_requirements.txt
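After either install route, a quick sanity check can confirm that the active environment has the packages the later steps rely on. The following is a minimal sketch, not part of the XrayGPT repository: the helper name `missing_modules` is hypothetical, and the module list (`torch` for `torchrun`, `gradio` for the demo) is an assumption that should be adjusted to match `xraygpt_requirements.txt`.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    # Assumed module list; adjust it to match xraygpt_requirements.txt.
    missing = missing_modules(["torch", "gradio"])
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `xraygpt` environment; any module it reports as missing suggests the install step did not complete.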
2. Prepare the datasets for training
Refer to dataset_creation for more details.
Download the preprocessed annotations: mimic & openi. The respective image folders contain the images from each dataset.
The final dataset folder structure will be as follows:
dataset
├── mimic
| ├── image
| | ├──abea5eb9-b7c32823-3a14c5ca-77868030-69c83139.jpg
| | ├──427446c1-881f5cce-85191ce1-91a58ba9-0a57d3f5.jpg
| | .....
| ├──filter_cap.json
├── openi
| ├── image
| | ├──1.jpg
| | ├──2.jpg
| | .....
| ├──filter_cap.json
...
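Before launching training, it can help to verify that the folders match the layout above. This is a minimal sketch under stated assumptions: the function name `check_dataset` is hypothetical, the dataset root is assumed to be a local folder named `dataset`, and images are assumed to use the `.jpg` extension as in the tree. It checks only that files exist, not the contents of `filter_cap.json`.

```python
from pathlib import Path


def check_dataset(root):
    """Return a list of problems found under the dataset root (empty if none)."""
    problems = []
    for name in ("mimic", "openi"):
        split = Path(root) / name
        # Each dataset needs its annotation file next to the image folder.
        if not (split / "filter_cap.json").is_file():
            problems.append(f"{name}: filter_cap.json is missing")
        image_dir = split / "image"
        if not image_dir.is_dir():
            problems.append(f"{name}: image/ folder is missing")
        elif not any(image_dir.glob("*.jpg")):
            problems.append(f"{name}: image/ contains no .jpg files")
    return problems


if __name__ == "__main__":
    # "dataset" is an assumed location; pass your own root if it differs.
    for line in check_dataset("dataset") or ["dataset layout looks complete"]:
        print(line)
```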
3. Prepare the pretrained Vicuna weights
We built XrayGPT on the v1 version of Vicuna-7B and finetuned Vicuna using curated radiology report samples. Download the Vicuna weights from vicuna_weights. The final weights should be in a single folder with a structure similar to the following:
vicuna_weights
├── config.json
├── generation_config.json
├── pytorch_model.bin.index.json
├── pytorch_model-00001-of-00003.bin
...
Then, set the path to the Vicuna weights folder in the model config file "xraygpt/configs/models/xraygpt.yaml" at Line 16.
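An incomplete download is easy to miss when the weights are split across several shards. The sketch below lists expected files that are absent from the weights folder. It assumes the standard Hugging Face sharded checkpoint layout, in which `pytorch_model.bin.index.json` holds a `weight_map` from tensor names to shard filenames. The function name `missing_weight_files` and the default folder name are hypothetical.

```python
import json
from pathlib import Path


def missing_weight_files(weights_dir):
    """List files expected in the Vicuna weights folder that are not present."""
    folder = Path(weights_dir)
    expected = {
        "config.json",
        "generation_config.json",
        "pytorch_model.bin.index.json",
    }
    index_path = folder / "pytorch_model.bin.index.json"
    if index_path.is_file():
        # The index maps every tensor name to the shard file that stores it.
        weight_map = json.loads(index_path.read_text()).get("weight_map", {})
        expected.update(weight_map.values())
    return sorted(name for name in expected if not (folder / name).is_file())


if __name__ == "__main__":
    # "vicuna_weights" is an assumed location; use the path set in the config.
    missing = missing_weight_files("vicuna_weights")
    print("Missing files:", ", ".join(missing) if missing else "none")
```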
To finetune Vicuna on radiology samples, please download our curated radiology and medical_healthcare conversational samples and refer to the original Vicuna repo for finetuning instructions: Vicuna_Finetune.
4. Download the pretrained MiniGPT-4 checkpoint
Download the pretrained MiniGPT-4 checkpoint: ckpt
1. First mimic pretraining stage
In the first pretraining stage, the model is trained using image-text pairs from the preprocessed mimic dataset.
To launch the first stage training, run the following command. In our experiments, we use 4 AMD MI250X GPUs.
torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_mimic_pretrain.yaml
2. Second openi finetuning stage
In the second stage, we use a small, high-quality set of image-text pairs from the openi dataset, which we preprocessed ourselves.
Run the following command. In our experiments, we use AMD MI250X GPU.
torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_openi_finetune.yaml
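The two stages use the same launcher and differ only in the config file, so the commands can be generated from one place. The following sketch builds the `torchrun` argument list for each stage using only the flags and config paths shown above. The names `STAGE_CONFIGS` and `torchrun_command` are hypothetical, and the GPU count of 4 in the usage section is only an example value.

```python
import shlex

# Config paths as given in the two training commands, in the order they run.
STAGE_CONFIGS = {
    "mimic_pretrain": "train_configs/xraygpt_mimic_pretrain.yaml",
    "openi_finetune": "train_configs/xraygpt_openi_finetune.yaml",
}


def torchrun_command(stage, num_gpus):
    """Build the torchrun argument list for one training stage."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "torchrun",
        "--nproc-per-node", str(num_gpus),
        "train.py",
        "--cfg-path", STAGE_CONFIGS[stage],
    ]


if __name__ == "__main__":
    # Print the commands for review; pass a list to subprocess.run to launch it.
    for stage in STAGE_CONFIGS:
        print(shlex.join(torchrun_command(stage, 4)))
```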
Download the pretrained XrayGPT checkpoints: link
Set the path to this checkpoint in "eval_configs/xraygpt_eval.yaml".
Then try the Gradio demo (demo.py) on your local machine with the following command:
python demo.py --cfg-path eval_configs/xraygpt_eval.yaml --gpu-id 0
If you're using XrayGPT in your research or applications, please cite using this BibTeX:
@article{Omkar2023XrayGPT,
title={XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models},
author={Omkar Thawakar and Abdelrahman Shaker and Sahal Shaji Mullappilly and Hisham Cholakkal and Rao Muhammad Anwer and Salman Khan and Jorma Laaksonen and Fahad Shahbaz Khan},
journal={arXiv: 2306.07971},
year={2023}
}
This repository is licensed under CC BY-NC-SA. Please refer to the license terms here.