lindawangg / COVID-Net

COVID-Net Open Source Initiative

Duplicate Images #70

Closed rsk2327 closed 4 years ago

rsk2327 commented 4 years ago

Description

The Kaggle dataset that has been included draws on multiple sources; SIRM is just one of them. Another of its sources is the ieee8023 dataset, which I see has also been included again as a separate dataset of its own.

Have you ensured that these samples have not been duplicated in the final dataset? I wasn't able to find any code that checks for this duplication.
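For illustration, one way such a check could look is a content-hash comparison across the source folders. This is only a minimal sketch, not code from the repository: the function names `file_digest` and `find_duplicates` and the extension list are assumptions made for the example. It only catches byte-identical files, so a copy that was resized or re-encoded on its way into the Kaggle dataset would not be flagged and would need a perceptual comparison instead.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large images are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directories, extensions=(".png", ".jpg", ".jpeg")):
    """Group image files with identical content across the given directories.

    `directories` and `extensions` are hypothetical parameters for this sketch.
    Returns a list of groups, each a list of paths sharing the same bytes.
    """
    by_digest = defaultdict(list)
    for directory in directories:
        for path in sorted(Path(directory).rglob("*")):
            if path.is_file() and path.suffix.lower() in extensions:
                by_digest[file_digest(path)].append(path)
    # Keep only digests that were seen more than once.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates` with the folders holding the Kaggle images and the ieee8023 images (whatever their local paths are) would return the groups of files that appear in both, which could then be dropped from one source before the train/test split is built.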