Nlin2 closed this issue 3 years ago
Thanks for catching this issue!
To replicate the experiments in our paper, I think the easiest solution is to use the exact version of the Github-COVID dataset that was available when we were running experiments. I've updated the README to note that, after cloning the Github-COVID repo, you can run
git checkout 9b9c2d5
to get the version of this dataset in which patients with COVID are still labeled as either 'COVID-19' or 'COVID-19, ARDS'.
Error Replication
Running
python train_models.py --dataset 1
gives the following error:

Problem Identification
Looking at the metadata, we see that values in the finding column may have been updated to new values. The Github-COVID feature engineering in datasets/githubcovid.py needs to be updated. The current code evaluates to False for every data point because of line 71:
covid_set = ['COVID-19','COVID-19, ARDS']
Solution
Patients with COVID now have the string 'Pneumonia/Viral/COVID-19' in the finding column instead of 'COVID-19' or 'COVID-19, ARDS'. The pneumonia and healthy sets must also be updated to match the new labels.
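
As a rough illustration of the mismatch, here is a minimal sketch that is not the actual code in datasets/githubcovid.py. The helper label_covid and the sample values are hypothetical. Only the three COVID label strings come from this issue. It shows why membership in the old set is False for the new label, and how a set covering both label styles would handle either dataset version.

```python
# Labels used by the older dataset version (the list on line 71).
old_covid_set = ['COVID-19', 'COVID-19, ARDS']
# Label reported in this issue for the updated metadata.
new_covid_set = ['Pneumonia/Viral/COVID-19']


def label_covid(findings, covid_set):
    """Hypothetical helper: one boolean per finding, True if it is in covid_set."""
    return [finding in covid_set for finding in findings]


# Hypothetical sample of 'finding' values; the second entry stands in
# for any non-COVID label.
sample_findings = ['Pneumonia/Viral/COVID-19', 'placeholder non-COVID finding']

# With the old set, the new COVID string never matches.
old_labels = label_covid(sample_findings, old_covid_set)

# Accepting both old and new strings works for either dataset version.
new_labels = label_covid(sample_findings, old_covid_set + new_covid_set)

print(old_labels)
print(new_labels)
```

The pneumonia and healthy sets would need the same kind of update. Their new label strings are not listed in this issue, so they are left out of the sketch.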