Inconsistency between generated test data and test labels

lindawangg / COVID-Net

COVID-Net Open Source Initiative

Inconsistency between generated test data and test labels #125

Closed. chododom closed this issue 3 years ago

chododom commented 3 years ago

Running the script that generates the binary COVID-19 classification data produces a test set of 1599 images. However, the file 'test_COVIDx7B.txt', which contains the labels for these images, has only 200 lines.

Steps taken to reproduce:
1) Downloaded all 5 datasets from the given repositories and links.
2) Ran the script create_COVIDx_binary.ipynb with altered paths.
3) The local data/test directory contains 1599 images, whilst the remote file labels/test_COVIDx7B.txt contains 200 image labels.
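The mismatch described in the steps above can be checked with a short script. This is only a minimal sketch: the function name `compare_images_to_labels` is hypothetical, and it assumes each line of the label file is whitespace-separated with the image filename in the second field (and that filenames contain no spaces). Adjust `filename_column` if the actual layout differs.

```python
from pathlib import Path


def compare_images_to_labels(image_dir, label_file, filename_column=1):
    """Compare the files in image_dir with the filenames listed in label_file.

    Assumption: each label line is whitespace-separated and the image
    filename sits at index `filename_column` (hypothetical default: 1).
    """
    # Names of all regular files present in the image directory.
    images = {p.name for p in Path(image_dir).iterdir() if p.is_file()}

    # Filenames referenced by the label file; short or blank lines are skipped.
    labelled = set()
    for line in Path(label_file).read_text().splitlines():
        fields = line.split()
        if len(fields) > filename_column:
            labelled.add(fields[filename_column])

    return {
        "images": len(images),
        "labelled": len(labelled),
        "unlabelled_images": sorted(images - labelled),
        "missing_images": sorted(labelled - images),
    }
```

Calling `compare_images_to_labels("data/test", "labels/test_COVIDx7B.txt")` would then report both counts along with the images that have no label entry.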

chododom commented 3 years ago

I have realized that the generated data is probably also meant for the non-binary classification task, and that the text file with labels is meant to be used to filter out the images relevant to the specific task at hand. I apologize.
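Using the label file as a filter, as described here, might look roughly like the following. It is a sketch under assumptions, not the repository's own loader: `select_labelled_images` is a hypothetical helper, and the column positions (filename in the second whitespace-separated field, class label in the third) are guesses about the label file layout that should be checked against the actual file.

```python
from pathlib import Path


def select_labelled_images(image_dir, label_file, filename_column=1, label_column=2):
    """Return (path, label) pairs for images that are listed in label_file.

    Assumption: label lines are whitespace-separated, with the filename at
    `filename_column` and the class label at `label_column` (both hypothetical).
    """
    image_dir = Path(image_dir)
    selected = []
    for line in Path(label_file).read_text().splitlines():
        fields = line.split()
        # Skip blank or malformed lines that lack the expected columns.
        if len(fields) <= max(filename_column, label_column):
            continue
        path = image_dir / fields[filename_column]
        # Keep only entries whose image actually exists on disk.
        if path.is_file():
            selected.append((path, fields[label_column]))
    return selected
```

Images present in the directory but absent from the label file are simply ignored, which is the filtering behaviour the comment above describes.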