qxiaobu / FLANNEL


get_covid_data_dict.py - Incorrect Classification Logic #3

Open beyerch opened 3 years ago

beyerch commented 3 years ago

This code creates a series of lists whose elements are later compared to the finding column from the metadata.csv file.

image

Unfortunately the logic does not work, which I noticed when it reported 0 COVID images.

image

The issue is that the finding value in the metadata.csv file is not a single item that neatly matches a member of the list; rather, it is a hierarchy from least to most specific. For example, COVID-19 isn't recorded as 'COVID-19', it is recorded as 'Pneumonia/Viral/COVID-19'. Because of this, the comparison never matches and the logic fails.

The "easy" fix is to just split the string by / and use the right-most entry.
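A minimal sketch of that idea, for anyone reading without the screenshots. The names `COVID_LABELS` and `leaf_finding` are hypothetical and are not taken from get_covid_data_dict.py; only the 'Pneumonia/Viral/COVID-19' value comes from the description above.

```python
# Hypothetical label list; the real lists live in get_covid_data_dict.py.
COVID_LABELS = ["COVID-19"]


def leaf_finding(finding: str) -> str:
    """Return the most specific (right-most) part of a hierarchical finding."""
    return finding.split("/")[-1].strip()


finding = "Pneumonia/Viral/COVID-19"

# Comparing the full hierarchical string against the list never matches.
exact_match = finding in COVID_LABELS

# Comparing only the right-most entry does match.
leaf_match = leaf_finding(finding) in COVID_LABELS
```

A value with no `/` in it is returned unchanged by `split("/")[-1]`, so the same check should also work on a metadata.csv that stores plain labels.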

image

After this update, x0 - x4 appear to give meaningful results.

image

qxiaobu commented 3 years ago

Please refer to the metadata.csv in FLANNEL/original data/

beyerch commented 3 years ago

Definitely looks like the format has changed. I will submit a fix with updated versioning for the more recent dataset.

image