Open stbnps opened 4 years ago
Images that appear completely white or black still contain data. Normalize the pixel values to [0, 1], multiply by 255, and then plot or save the image.
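As a rough illustration of the normalization suggested above, here is a minimal sketch using NumPy. The helper name `to_uint8` and the sample array are hypothetical and not taken from either dataset. Note that, as discussed further down in this thread, this only helps when the stored pixel values occupy a narrow part of the range; it will not recover an image that was already scaled.

```python
import numpy as np

def to_uint8(image: np.ndarray) -> np.ndarray:
    """Min-max normalize to [0, 1], then scale to 8-bit [0, 255]."""
    pixels = image.astype(np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:
        # Constant image: there is no contrast to stretch, so return zeros.
        return np.zeros(image.shape, dtype=np.uint8)
    normalized = (pixels - lo) / (hi - lo)
    return np.round(normalized * 255).astype(np.uint8)

# Hypothetical 16-bit image whose values sit in a narrow band,
# so it would look almost uniformly dark in a viewer.
raw = np.array([[1000, 1100], [1200, 1400]], dtype=np.uint16)
scaled = to_uint8(raw)
```

The result can then be plotted or written to disk with whatever imaging library is already in use.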
This comment is the answer to Q1 in: BIMCV-COVID19+/FAQ.md
@rahools That's not true. Take a look at image 216840111366964013590140476722013038132133659_02-059-019.png:
You can see a white line. That white line means that the image is already scaled.
@samils7 That FAQ is for BIMCV-COVID19+, not for padchest-covid
My bad. I successfully applied normalization on BIMCV-COVID19+, so I assumed it would carry over to the padchest dataset too. Thanks for the insight @stbnps
I performed the following experiment
Achieving the following results
Specificity:
Sensitivity:
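The figures for these two metrics did not survive in this copy of the thread. For reference, the sketch below shows how they are conventionally computed for a binary classifier: sensitivity is the true positive rate and specificity is the true negative rate. The function name and the label lists are made up for illustration and are not the results of the experiment.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    # Guard against a class being absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Illustrative labels only (hypothetical, not from any dataset above).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```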
The issue
The network seems to perform very well on dataset [3], where each image was manually reviewed by radiologists [4]. However, it performs significantly worse on dataset [1], where most labels were extracted using NLP and the images were not reviewed (which even led to the inclusion of completely white or completely black images [5]).
Do you think the quality of the images and annotations may be a limiting factor for the performance of the network?
References
[1] http://ceib.bioinfo.cipf.es/covid19/resized_padchest_neumo.tar.gz
[2] https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
[3] https://www.kaggle.com/c/rsna-pneumonia-detection-challenge
[4] https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/overview/acknowledgements
[5] https://github.com/BIMCV-CSUSP/BIMCV-COVID-19/tree/master/padchest-covid#iti---proposal-for-datasets