lindawangg / COVID-Net

COVID-Net Open Source Initiative

Covidx6 script #110

Closed NaomiTerhljan closed 3 years ago

NaomiTerhljan commented 3 years ago

Pull Request Template

Description

Creates a covidx6 notebook (.ipynb) that generates the train and test datasets for binary classification. I've also updated the README documentation on the main page and the dataset page.

Alex's Slack message with the requirements for the script is as follows:

Create a new v6 data curation script that generates a new training dataset such that:
1) we now include the ActuaMed covid-negative data as well as the Cohen covid-negative data (i.e., anything that isn't COVID-positive) alongside the RSNA data under COVID-19 negative;
2) there are just two labels (covid-negative and covid-positive);
3) the test set contains the same 100 covid-positive cases, while the negative test set contains a random selection of 10 from the existing normal cases in the current test set, 70 from the existing pneumonia cases in the current test set, and a random selection of 20 negative cases from ActuaMed.
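
As a rough illustration of the negative test set described in the message above, here is a minimal sketch of the sampling step. It is not the notebook's actual code: the function name `sample_negative_test_set`, the `candidates` dictionary layout, the `"negative"` label string, and the fixed seed are all assumptions made for this example. It also assumes the 70 pneumonia cases are drawn at random, which the message does not state explicitly.

```python
import random

# Per-source counts for the negative test set, taken from the requirements above.
NEGATIVE_TEST_COUNTS = {"normal": 10, "pneumonia": 70, "actuamed": 20}


def sample_negative_test_set(candidates, counts=NEGATIVE_TEST_COUNTS, seed=0):
    """Draw the requested number of cases per source, without replacement.

    `candidates` maps a source name to a list of case identifiers
    (a hypothetical structure, not the repository's real data format).
    Returns a list of (case_id, "negative") pairs.
    """
    rng = random.Random(seed)  # local RNG so the selection is reproducible
    selected = []
    for source, n in counts.items():
        pool = sorted(candidates[source])  # fixed order before sampling
        if len(pool) < n:
            raise ValueError(f"need {n} cases from {source!r}, found {len(pool)}")
        selected.extend((case_id, "negative") for case_id in rng.sample(pool, n))
    return selected


# Synthetic identifiers, for illustration only.
candidates = {
    "normal": [f"normal_{i}" for i in range(50)],
    "pneumonia": [f"pneumonia_{i}" for i in range(200)],
    "actuamed": [f"actuamed_{i}" for i in range(40)],
}
negatives = sample_negative_test_set(candidates, seed=42)
print(len(negatives))  # 100
```

Keeping the per-source counts in one dictionary makes the 10/70/20 split easy to audit against the requirements, and seeding a local `random.Random` means the same test set is produced on every run.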