Closed andyzzhao closed 3 years ago
@andyzzhao this looks good, but we might hold off on merging until we're ready to release this along with the BIMCV data. We'll discuss at tomorrow's meeting.
@mayaliliya in the v8 label files, have we fixed the issues related to inconsistent labels that we had in the v7 dataset (e.g., #126)?
I fixed it in the data.py script with the brute-force pop method. I suggest we merge this as is; next week I will open a separate PR that addresses this bug and deals with the duplicate issue more thoroughly (i.e., seeing whether we can salvage more images rather than removing every image that shares a URL base).
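For illustration, the "keep more images" idea could look roughly like the sketch below. It is not the code in data.py: the `(filename, url)` record layout, the helper names, and the definition of "URL base" as the URL with its last path component removed are all assumptions made for this example.

```python
import posixpath
from urllib.parse import urlsplit


def url_base(url):
    """Hypothetical definition: the URL with its final path component removed."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{posixpath.dirname(parts.path)}"


def keep_one_per_base(records):
    """Keep the first (filename, url) record seen for each URL base.

    This retains one image per group instead of dropping the whole group.
    """
    kept = {}
    for filename, url in records:
        # setdefault only stores the record if this base has not been seen yet
        kept.setdefault(url_base(url), (filename, url))
    return list(kept.values())


# Toy records with made-up URLs
records = [
    ("a.png", "https://example.org/case1/a.png"),
    ("b.png", "https://example.org/case1/b.png"),
    ("c.png", "https://example.org/case2/c.png"),
]
deduplicated = keep_one_per_base(records)
```

Whether the first image in each group is the right one to keep would still need to be checked against the actual data.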
Pull Request Template
Description
New workflow for creating COVIDx dataset:
Context of change
Please add options that are relevant and mark any boxes that apply.
Type of change
Please mark any boxes that apply.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
create_COVIDx.ipynb and create_COVIDx_binary.ipynb were run to confirm that 200 RICORD images were added to the test set and the rest were added to the training set.
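A check like the one described above could be scripted roughly as follows. This is a minimal sketch, not part of the notebooks: it assumes each image can be represented as a `(split, source)` pair with the source tagged `"ricord"`, which may not match the actual label file layout, and the function name is hypothetical.

```python
from collections import Counter


def count_splits(rows, source="ricord"):
    """Count how many images from `source` landed in each split.

    rows: iterable of (split, source) pairs (an assumed layout).
    """
    return Counter(split for split, src in rows if src == source)


# Toy data: 200 RICORD test images, 5 RICORD train images, 3 images from another source
rows = (
    [("test", "ricord")] * 200
    + [("train", "ricord")] * 5
    + [("train", "other")] * 3
)
counts = count_splits(rows)

# Fail loudly if the number of RICORD test images is not the expected 200
assert counts["test"] == 200, f"expected 200 RICORD test images, got {counts['test']}"
```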
Checklist:
Please mark any boxes that have been completed.