GaryGuTC / LaPA_model

[CVPRW 2024] LaPA: Latent Prompt Assist Model For Medical Visual Question Answering

[CVPR2024 Workshop (oral)] LaPA: Latent Prompt Assist Model For Medical Visual Question Answering

This repository contains the implementation of LaPA: Latent Prompt Assist Model For Medical Visual Question Answering, presented at a CVPR 2024 Workshop.




Install the dependencies with either of the following commands:

conda env create -f environment.yaml  # method 1: conda
pip install -r requirements.txt       # method 2: pip


The repository should be organized as follows:

├── checkpoints
├── data
│   ├── vqa_medvqa_2019_test.arrow
│   ├── ......
├── download
│   ├── checkpoints
│   ├── external_data
│   ├── pretrained
│   │   ├── m3ae.ckpt
│   ├── roberta-base
├── m3ae
├── prepro
├── run_scripts


Prepare the datasets by following the linked instructions, and use only the SLAKE, VQA-RAD, and MedVQA 2019 datasets.

External data

Download the external_data and put it in download/external_data.


Download the m3ae pretrained weights and put them in download/pretrained.


Download roberta-base and put it in download/roberta-base.


Download the checkpoints we trained and put them in download/checkpoints.
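
Before training or testing, it can help to confirm that the downloads landed where the directory tree above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` and the `EXPECTED_PATHS` list are hypothetical, and the paths are simply copied from the tree shown in this README. It assumes it is run from the repository root.

```python
from pathlib import Path

# Paths copied from the directory tree in this README, relative to the repo root.
# This list is an assumption about what must exist; adjust it to your setup.
EXPECTED_PATHS = [
    "checkpoints",
    "data",
    "download/checkpoints",
    "download/external_data",
    "download/pretrained/m3ae.ckpt",
    "download/roberta-base",
    "m3ae",
    "prepro",
    "run_scripts",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected paths are present.")
```

The check only tests for existence; it does not validate the contents of any file or folder.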

Train & Test

# run from the repository root
bash run_scripts/
# run from the repository root
bash run_scripts/


Method      Venue      VQA-RAD                    SLAKE                      VQA-2019
                       Open     Closed   Overall  Open     Closed   Overall  Overall
BAN         NeurIPS18  37.40    72.10    58.30    74.60    79.10    76.30    -
CPRD-BAN    MICCAI21   52.50    77.90    67.80    79.50    83.40    80.10    -
MMBERT      ISBI21     63.10    77.90    72.00    -        -        -        67.20
M3AE        MICCAI22   64.80    82.72    75.61    79.22    85.10    81.53    78.40
M2I2        ISBI22     61.80    81.60    73.70    74.70    91.10    81.20    -
ARL         MM22       65.10    85.96    77.55    79.70    89.30    84.10    79.80
PubMedCLIP  EACL23     60.10    80.00    72.10    78.40    82.50    80.10    -
CPCR        TMI23      60.50    80.40    72.50    80.50    84.10    81.90    -
LaPA        Ours       68.72    86.40    79.38    82.17    88.70    84.73    81.60
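
To read the table as per-column differences, the short sketch below subtracts the ARL row from the LaPA row. The scores are copied from the table above; the function name `gains` and the column labels are illustrative choices, not identifiers from the repository.

```python
# Scores copied from the results table above, in column order.
COLUMNS = [
    "VQA-RAD open", "VQA-RAD closed", "VQA-RAD overall",
    "SLAKE open", "SLAKE closed", "SLAKE overall",
    "VQA-2019 overall",
]
ARL = [65.10, 85.96, 77.55, 79.70, 89.30, 84.10, 79.80]
LAPA = [68.72, 86.40, 79.38, 82.17, 88.70, 84.73, 81.60]


def gains(ours, baseline):
    """Per-column difference (ours - baseline), rounded to two decimals."""
    return [round(a - b, 2) for a, b in zip(ours, baseline)]


for name, gain in zip(COLUMNS, gains(LAPA, ARL)):
    print(f"{name:<18}{gain:+.2f}")
```

On these numbers LaPA is ahead of ARL by 1.83, 0.63, and 1.80 points on the VQA-RAD, SLAKE, and VQA-2019 overall scores, and trails it only on the SLAKE closed column (by 0.60).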


If you find this project useful in your research, please cite the following paper:

    author    = {Tiancheng Gu and Kaicheng Yang and Dongnan Liu and Weidong Cai},
    title     = {LaPA: Latent Prompt Assist Model For Medical Visual Question Answering},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    year      = {2024}


Our project builds on code from the following repositories. We thank the authors for their work and for sharing it.