RyanWangZf / MedCLIP

EMNLP'22 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts

The version of CheXpert #11

Closed SZUHvern closed 1 year ago

SZUHvern commented 1 year ago

Thank you for sharing the code!

Since CheXpert v1.0 has two released versions, [CheXpert-v1.0 Original (~439 GB)] and [CheXpert-v1.0 Downsampled (~11 GB)], I would like to know which one you chose. Also, have you compared their performance, or do you have any suggestions about them?

Looking forward to your kind reply. Many thanks.

RyanWangZf commented 1 year ago

Hi, we used the Kaggle downsampled (~12 GB) version. We didn't try the full-resolution original.

icannistraci commented 1 year ago

Hi @RyanWangZf, I was wondering how you were able to generate the chexpert-5x200 dataset using only the downsampled version. As far as I understand, the validation split is too small to produce a dataset of 1k images, so it can only be created from the training split or the complete CheXpert dataset (~439 GB). Thank you in advance for your help!

RyanWangZf commented 1 year ago

Hi, we follow the setting described in Sec. 4.1 of [1], where a subset of the training data is held out and used as the test set.

[1] Huang, S.-C., Shen, L., Lungren, M. P., & Yeung, S. (2021). GLoRIA: A multimodal global-local representation learning framework for label-efficient medical image recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3942-3951).
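
For anyone reproducing this from the downsampled release, here is a minimal sketch of the held-out construction described above. It assumes the Kaggle `train.csv` layout and the five CheXpert competition findings used in GLoRIA; the function name, path, and exact filtering rules are illustrative, not the authors' exact script.

```python
import pandas as pd

# The five CheXpert competition findings used for the 5x200 subset in GLoRIA.
CLASSES = ["Atelectasis", "Cardiomegaly", "Consolidation", "Edema", "Pleural Effusion"]

def build_5x200(train_csv_path: str, n_per_class: int = 200, seed: int = 42):
    """Hold out 5 x 200 single-label images from the CheXpert training split."""
    df = pd.read_csv(train_csv_path)
    parts = []
    for cls in CLASSES:
        others = [c for c in CLASSES if c != cls]
        # Keep studies positive for exactly this finding and not positive for
        # the other four, so each sampled image has one unambiguous label.
        mask = (df[cls] == 1) & (df[others].fillna(0) != 1).all(axis=1)
        parts.append(df[mask].sample(n=n_per_class, random_state=seed).assign(label=cls))
    test_df = pd.concat(parts)               # 5 x 200 = 1000 held-out images
    train_df = df.drop(index=test_df.index)  # the remainder stays for training
    return train_df, test_df

# Usage (path is illustrative):
# train_df, chexpert_5x200 = build_5x200("CheXpert-v1.0-small/train.csv")
```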