richard-peng-xia / awesome-multimodal-in-medical-imaging

A collection of resources on applications of multi-modal learning in medical imaging.
MIT License

Awesome-Multimodal-Applications-In-Medical-Imaging

This repository collects resources on applications of multi-modal learning in medical imaging, including papers related to large language models (LLMs). Papers involving LLMs are shown in bold.

Contributing

Feel free to send a pull request or an email to add links or to discuss this area. Use the following Markdown format for entries:

- [**Name of Conference or Journal + Year**] Paper Name. [[pdf]](link) [[code]](link)
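
As a convenience, the entry format above can be generated with a small helper. This is only a sketch: the function `format_entry` and its parameters are hypothetical and not part of this repository, and bolding the title of LLM-related papers is an assumption about how the "bold" convention is applied.

```python
from typing import Optional


def format_entry(venue: str, year: int, title: str, pdf_url: str,
                 code_url: Optional[str] = None,
                 involves_llm: bool = False) -> str:
    """Build one list entry in the Markdown format used by this list."""
    # Assumption: LLM-related papers get their title wrapped in bold markers.
    name = f"**{title}**" if involves_llm else title
    entry = f"- [**{venue} {year}**] {name}. [[pdf]]({pdf_url})"
    # The code link is treated as optional here, since not every paper
    # comes with a public implementation.
    if code_url:
        entry += f" [[code]]({code_url})"
    return entry


if __name__ == "__main__":
    print(format_entry(
        "arXiv", 2024,
        "RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models",
        "https://arxiv.org/abs/2407.05131",
        involves_llm=True,
    ))
```

Paste the printed line under the matching year heading of the relevant section when opening a pull request.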

News

Citation

@article{xia2024cares,
  title={CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models},
  author={Xia, Peng and Chen, Ze and Tian, Juanxi and Gong, Yangrui and Hou, Ruibo and Xu, Yue and Wu, Zhenbang and Fan, Zhiyuan and Zhou, Yiyang and Zhu, Kangyu and others},
  journal={arXiv preprint arXiv:2406.06007},
  year={2024}
}

@article{xia2024rule,
  title={RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models},
  author={Xia, Peng and Zhu, Kangyu and Li, Haoran and Zhu, Hongtu and Li, Yun and Li, Gang and Zhang, Linjun and Yao, Huaxiu},
  journal={arXiv preprint arXiv:2407.05131},
  year={2024}
}

Overview


Data Source

Image-Caption Datasets

| Dataset | Domain | Images | Texts | Source | Language |
|---|---|---|---|---|---|
| ROCO | multiple | 87K | 87K | research papers | En |
| MedICaT | multiple | 217K | 217K | research papers | En |
| PMC-OA | multiple | 1.6M | 1.6M | research papers | En |
| ChiMed-VL | multiple | 580K | 580K | research papers | En/zh |
| FFA-IR | fundus | 1M | 10K | medical reports | En/zh |
| PadChest | cxr | 160K | 109K | medical reports | Sp |
| MIMIC-CXR | cxr | 377K | 227K | medical reports | En |
| OpenPath | histology | 208K | 208K | social media | En |
| Quilt-1M | histology | 1M | 1M | research papers, social media | En |
| Harvard-FairVLMed | fundus | 10K | 10K | medical reports | En |

Visual Question Answering Datasets

| Dataset | Domain | Images | QA Items | Language |
|---|---|---|---|---|
| VQA-RAD | radiology | 315 | 3k | En |
| SLAKE | radiology | 642 | 14k | En/zh |
| Path-VQA | histology | 5k | 32k | En |
| VQA-Med | radiology | 4.5k | 5.5k | En |
| PMC-VQA | multiple | 149k | 227k | En |
| OmniMedVQA | multiple | 118k | 128k | En |
| ProbMed | radiology | 6k | 57k | En |

Survey


Medical Report Generation

2018

2024


Medical Visual Question Answering

2020

2024


Medical Vision-Language Model

2022


🎉 Contribution

Contributing to this paper list

⭐ Join us in improving this repository! If you know of any important works we've missed, please contribute. Your efforts are highly valued!

Contributors