vcheplygina / crowdskin

Skin lesion annotations via crowdsourcing

Create data-clean from data-raw #1

Open vcheplygina opened 4 years ago

vcheplygina commented 4 years ago

Create a data-clean folder containing CSV (instead of XLSX) files with the annotations and, if applicable, separate files documenting the types of features used by each group.

mphjoosten commented 4 years ago

All Excel files were converted to CSV format. Furthermore, every folder contains a file called data_types that gives the data/feature type for each column. I also wrote some code to load the data into a single dataframe with the columns: ['ID', 'group_number', 'year', 'annotator', 'orig_column', 'data_type', 'data'].

This is a list of all data types (these can be changed in the data_type file): ['Asymmetry', 'Border', 'Color', 'Color_Categorised', 'Dermo', 'Blood', 'Blue', 'Color_Fade', 'Compactness', 'Erythema', 'Red', 'Black', 'Skin_Color', 'Irregular_pigmentation', 'Flaking', 'Rough_Surface', 'Irregularity', 'White', 'Diameter', 'Yellow', 'Brown']
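For readers who want to see what such a long-format load could look like, here is a minimal sketch using only the Python standard library. It is not the loading code mentioned above. The function name `to_long_records`, the layout of the data_types file (fields `column`, `annotator`, `data_type`), and the sample values are all assumptions made for illustration.

```python
import csv
import io

# Target columns, as listed in the comment above.
COLUMNS = ["ID", "group_number", "year", "annotator",
           "orig_column", "data_type", "data"]


def to_long_records(annotation_csv, data_types_csv, group_number, year):
    """Reshape one wide annotation table into long-format records.

    Assumption: the data_types file has one row per annotation column
    with the fields 'column', 'annotator' and 'data_type'. The real
    files in the repository may be laid out differently.
    """
    types = {row["column"]: row
             for row in csv.DictReader(io.StringIO(data_types_csv))}
    records = []
    for row in csv.DictReader(io.StringIO(annotation_csv)):
        for column, value in row.items():
            # Skip the image ID and any column without a documented type.
            if column == "ID" or column not in types:
                continue
            records.append({
                "ID": row["ID"],
                "group_number": group_number,
                "year": year,
                "annotator": types[column]["annotator"],
                "orig_column": column,
                "data_type": types[column]["data_type"],
                "data": value,
            })
    return records


# Made-up sample data: two images, two annotated features, one annotator.
annotations = "ID,asym_1,border_1\nimg_001,2,1\nimg_002,0,3\n"
data_types = ("column,annotator,data_type\n"
              "asym_1,1,Asymmetry\n"
              "border_1,1,Border\n")

records = to_long_records(annotations, data_types, group_number=1, year=2018)
```

Each record holds one annotation value, so records from different groups and years can simply be concatenated. If pandas is available, `pandas.DataFrame(records, columns=COLUMNS)` turns the list into a dataframe.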

vcheplygina commented 4 years ago

Great! If you have time (less urgent), please create a short description of each feature based on your understanding. I can also supply the students' reports for this. The text does not need to be neat; it is just so that we can find the information again later.