The performance of zero-shot classification on COVID-19 and RSNA didn't achieve the desired results

RyanWangZf / MedCLIP

EMNLP'22 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts

The performance of zero-shot classification on COVID-19 and RSNA didn't achieve the desired results #19

Closed wyh196646 closed 1 year ago

wyh196646 commented 1 year ago

I just reproduced MedCLIP with MIMIC and CheXpert (the training datasets). On CheXpert-5x200, the results are as expected. But when I do zero-shot classification on COVID-19 and RSNA, the accuracy on the two datasets is only 0.45 and 0.43. I think there is something wrong with my data split or data processing. Could you share the data processing script and the data split? Thanks a lot!

RyanWangZf commented 1 year ago

Hi, can you provide your setup details for loading these two datasets? I suspect the input image processing affects the results a lot.

wyh196646 commented 1 year ago

Thanks a lot for your reply. For COVID-19, I downloaded the dataset from Kaggle (https://www.kaggle.com/datasets/cf77495622971312010dd5934ee91f07ccbcfdea8e2f7778977ea8485c1914df) and used its test set, keeping only the normal and COVID-19 images and removing the non-COVID pneumonia images. I went straight to zero-shot classification without any preprocessing.

For RSNA, I downloaded the data from https://www.kaggle.com/datasets/sovitrath/rsna-pneumonia-detection-2018 and sampled 2000 pneumonia and normal cases at a 1:1 ratio for the test.
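For reference, a 1:1 sampling step like the one described could be sketched as below. This is only an illustration, not the script actually used: the `balanced_sample` helper, the `(path, label)` record layout, the label names, the fixed seed, and the reading of "2000 cases" as 2000 in total (1000 per class) are all assumptions.

```python
import random


def balanced_sample(records, n_total=2000, seed=0):
    """Draw n_total records with the same number of records per label.

    records: list of (image_path, label) pairs (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group the records by label.
    by_label = {}
    for path, label in records:
        by_label.setdefault(label, []).append((path, label))

    per_class = n_total // len(by_label)
    sample = []
    for label in sorted(by_label):
        items = by_label[label]
        if len(items) < per_class:
            raise ValueError(f"not enough records for label {label!r}")
        # Sample without replacement within each class.
        sample.extend(rng.sample(items, per_class))

    rng.shuffle(sample)
    return sample


# Toy records standing in for the real file list (3000 pneumonia, 6000 normal).
records = [
    (f"img_{i}.png", "pneumonia" if i % 3 == 0 else "normal")
    for i in range(9000)
]
subset = balanced_sample(records)
```

Fixing the seed and saving the sampled file list makes it easier to compare numbers with other people's runs.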

RyanWangZf commented 1 year ago

Hi, one possible cause is that you used the wrong prompts:

    cls_prompts = generate_class_prompts(df_sent, ['No Finding'], n=10)
    covid_prompts = generate_covid_class_prompts(n=10)
    cls_prompts.update(covid_prompts)

Can you try different prompts and then retry?
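To illustrate why the prompts matter, here is a minimal sketch of prompt-ensemble zero-shot classification in general: each class has several prompt embeddings, the image embedding is compared with each of them, and the class with the highest mean cosine similarity is predicted. This is not MedCLIP's own implementation; the `cosine` and `classify` helpers and the 2-dimensional toy embeddings are hypothetical and stand in for real encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def classify(image_emb, prompt_embs):
    """Return (predicted_class, scores).

    prompt_embs maps a class name to a list of prompt embeddings;
    a class score is the mean similarity over that class's prompts.
    """
    scores = {
        cls: sum(cosine(image_emb, p) for p in embs) / len(embs)
        for cls, embs in prompt_embs.items()
    }
    return max(scores, key=scores.get), scores


# Toy embeddings (assumed values, not real model outputs).
prompt_embs = {
    "No Finding": [[1.0, 0.1], [0.9, 0.2]],
    "COVID": [[0.1, 1.0], [0.2, 0.9]],
}
label, scores = classify([0.2, 0.8], prompt_embs)
print(label)  # COVID
```

Since the prediction depends only on which class's prompts score highest, prompts that do not match the target classes can push accuracy down to chance level or below.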

xhjy2020 commented 8 months ago

@wyh196646 Hello, in the COVID dataset from the link you provided, the test set is COVID:NORMAL = 2395:2140. The paper says the test set has 3000 images. Isn't this the dataset used in the paper?
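A quick check of those counts, as a sketch; the variable names are only for illustration:

```python
covid, normal = 2395, 2140  # test-set counts quoted above

total = covid + normal
# Accuracy of always predicting the larger class.
majority_baseline = max(covid, normal) / total

print(total, round(majority_baseline, 3))  # 4535 0.528
```

So this test set has 4535 images rather than 3000, and always predicting the majority class would already score about 0.528 on it.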

deepankarvarma commented 7 months ago

Can you please share the zero-shot classification code?