lindawangg / COVID-Net

COVID-Net Open Source Initiative

Datasets #109

Open baranaldemir opened 3 years ago

baranaldemir commented 3 years ago

I have 3 questions if you don't mind.

  1. The dataset at https://www.kaggle.com/tawsifurrahman/covid19-radiography-database doesn't currently include a metadata file. Could you please provide it?
  2. As far as I understand, the RSNA dataset has some pneumonia duplicates. I might be wrong, but I think you didn't notice that the stage_2_train_labels.csv file has duplicate patient IDs. Am I right?
  3. Are there any CSV files for the RSNA test set too?
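
For question 2, a minimal sketch of how the repeated IDs could be counted, using only the Python standard library. The column name `patientId` and the function name are assumptions for illustration, not something taken from the COVID-Net scripts. As far as I know, the RSNA labels file has one row per bounding box, so a repeated ID may simply mean several boxes on one image rather than a duplicated image.

```python
import csv
from collections import Counter


def duplicate_patient_ids(csv_file, id_column="patientId"):
    """Return {patient_id: row_count} for IDs that appear on more than one row.

    csv_file is any open text file or file-like object with a header row.
    id_column is an assumed column name; adjust it to match the actual header.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(row[id_column] for row in reader)
    return {pid: n for pid, n in counts.items() if n > 1}


# Example usage (assumes the labels file is in the working directory):
#
#     with open("stage_2_train_labels.csv", newline="") as f:
#         dupes = duplicate_patient_ids(f)
#     print(len(dupes), "patient IDs appear on more than one row")
```

If the count is non-zero, keeping one row per patient ID before splitting would avoid the same image landing in the split more than once.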
ZaraNaSha commented 3 years ago

Dear all, I have the same problem as mentioned above and could not split the data with the Jupyter notebook file.

VedantWani commented 3 years ago

I see that https://www.kaggle.com/tawsifurrahman/covid19-radiography-database has been updated to version 3. The images have also changed with the new version, and there will be duplicates if the script for COVIDx5 is used.

I am not able to download version 1 of the Kaggle dataset, which is the version COVIDx uses, so the dataset scripts should be updated.

For those who are using version 2 or version 3 of the covid19-radiography-database, your results may vary.

haydengunraj commented 3 years ago

Hi everyone, we've updated the scripts to address the changes in the data sources. As a result, previous versions of the dataset notebooks may not work correctly with the current versions of the various data sources. To fix this, you can modify the old notebooks to accommodate the changes, or use previous versions of the source datasets to ensure compatibility.

VedantWani commented 3 years ago

Hi, I have two questions.

  1. The currently available version of https://www.kaggle.com/tawsifurrahman/covid19-radiography-database has different filenames and different images, sourced from various references. For example, COVID (1).png in the current version is a different image from COVID-19 (1).png in version 1 of the same dataset. This means COVIDx7 is not compatible with earlier versions, right?

  2. The metadata of the above-mentioned dataset contains slightly different URLs compared to version 1. Also, the script that creates the dataset fails to catch duplicate images between the Cohen dataset and the above Kaggle dataset (because the URLs do not match). Are there duplicate images?

PS: I have downloaded version 1 of https://www.kaggle.com/tawsifurrahman/covid19-radiography-database (I had to manually download one image at a time) and also created the complete dataset with the earlier script.
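
For question 2, one way to look for duplicates without relying on matching URLs is to compare the image bytes directly. Below is a rough sketch under stated assumptions: the function names and the directory layout are hypothetical and not part of the COVID-Net scripts. It only catches byte-identical files, so an image that was resized or re-encoded between sources would not be flagged.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_images(*directories, suffixes=(".png", ".jpg", ".jpeg")):
    """Group image files with identical bytes across the given directories.

    Returns a list of groups; each group is a list of paths whose contents
    hash to the same digest. Files with unique contents are not returned.
    """
    groups = defaultdict(list)
    for directory in directories:
        for path in sorted(Path(directory).rglob("*")):
            if path.is_file() and path.suffix.lower() in suffixes:
                groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


# Example usage (directory names are placeholders):
#
#     for group in find_duplicate_images("cohen_images", "kaggle_images"):
#         print([str(p) for p in group])
```

An empty result here would not prove there are no duplicates, only that no two files are exact copies.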

GliozzoJ commented 3 years ago

Hi all,

I have a question. How can I download version 1 of the COVID-19 Radiography Database? It seems to be impossible through the Kaggle API, and I cannot download it from the webpage:

https://www.kaggle.com/tawsifurrahman/covid19-radiography-database/version/1

@VedantWani How did you download version 1 of the dataset? I couldn't download even a single image, since every time I get the message "404 We can't find that page".

VedantWani commented 3 years ago

@GliozzoJ The only way I found to download the required COVID-19 images is through the data explorer: open the COVID-19 directory, click on an image, and once it opens, right-click on it and choose "Save image as". Repeat this for each image.

GliozzoJ commented 3 years ago

@VedantWani Thank you for your reply. Do you also know how to download the file COVID-19.metadata.xlsx?

VedantWani commented 3 years ago

@GliozzoJ COVID-19.metadata.xlsx