Token-Mixer: Bind Image and Text in One Embedding Space for Medical Image Reporting (submitted to TMI-2023)

The PyTorch implementation of Token-Mixer.

Introduction

This project was developed and tested on Ubuntu 16.04.5 with Python 3.7, PyTorch 1.8.1, and four NVIDIA RTX 2080Ti GPUs.
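To compare a local setup against the versions listed above, a small check like the following may help. It is only a sketch and not part of the repository: the function name `describe_environment` is hypothetical, and the listed versions are treated as a reference rather than a hard requirement.

```python
import sys

# Versions reported in this README, used only for comparison in the printout.
README_PYTHON = "3.7"
README_TORCH = "1.8.1"
README_GPUS = 4


def describe_environment():
    """Return the Python version and, if PyTorch imports, its version and GPU count."""
    info = {"python": sys.version_info[:2], "torch": None, "gpu_count": 0}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; leave the defaults in place.
        return info
    info["torch"] = str(torch.__version__)
    if torch.cuda.is_available():
        info["gpu_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    env = describe_environment()
    py = ".".join(str(part) for part in env["python"])
    print(f"Python: {py} (README uses {README_PYTHON})")
    print(f"PyTorch: {env['torch']} (README uses {README_TORCH})")
    print(f"Visible GPUs: {env['gpu_count']} (README uses {README_GPUS})")
```

A mismatch does not necessarily mean the code will fail; it only shows where the local setup differs from the one the authors report.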

The medical image report generation datasets are available at the following links:

To train the model, first prepare the training dataset, for example the MIMIC-CXR-JPG data.

Check the dataset path in train.py, and then run:

python train.py

To evaluate, check the model and data paths in test.py, and then run:

python test.py

Note: paraphrase-en.gz is too large to upload to this repository, so you need to place it in .\pycocoevalcap\meteor\data yourself.
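Because this file has to be added by hand, it can be worth confirming it is in place before starting an evaluation run. The sketch below is not part of the repository: the helper names are hypothetical, and it assumes the script is run from the repository root unless another root is passed in.

```python
from pathlib import Path

# Location taken from the note above, written in a platform-independent way.
METEOR_DATA_DIR = Path("pycocoevalcap") / "meteor" / "data"
PARAPHRASE_FILE = "paraphrase-en.gz"


def paraphrase_file_path(repo_root="."):
    """Build the expected path of paraphrase-en.gz under the given repository root."""
    return Path(repo_root) / METEOR_DATA_DIR / PARAPHRASE_FILE


def has_paraphrase_file(repo_root="."):
    """Return True if paraphrase-en.gz exists at the expected location."""
    return paraphrase_file_path(repo_root).is_file()


if __name__ == "__main__":
    target = paraphrase_file_path()
    if has_paraphrase_file():
        print(f"Found {target}")
    else:
        print(f"Missing {target}; copy paraphrase-en.gz there before running test.py")
```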