Excel to csv - Githubissues

thenineteen / Semiology-Visualisation-Tool

Data driven 3D brain visualisation of semiology. Semiology to anatomy translator based on over 4600 patients from 309 peer-reviewed articles.

MIT License


Excel to csv #194

Open thenineteen opened 4 years ago

thenineteen commented 4 years ago

convert data into human readable format:

Semio2Brain database SemioDict mappings


TEST MAPPINGS:
import pandas as pd

# Write each mapping DataFrame to its own CSV file.
for k, df in dummy_map_df_dict.items():
    df.to_csv(r'D:\\' + k + r'.csv')

# Read the CSV files back, dropping the 'Unnamed: N' columns.
dummy_map_df_dict = {}
for gif_sheet in gif_sheet_names:
    df = pd.read_csv(r'D:\\' + str(gif_sheet) + r'.csv')
    dummy_map_df_dict[str(gif_sheet)] = df.loc[:, [col for col in df if 'Unnamed' not in col]]

#(((())))
#NB dummy_map_df_dict['GIF TL'].dropna(how='all', axis=1).drop(columns=['Unnamed: 0', 'Unnamed: 1']).equals(bobo['GIF TL'])
#True
#(((())))

MEGA ANALYSIS
df_read = pd.read_csv(excel_data)
df = df_read.loc[:, [col for col in df_read if 'Unnamed' not in col]]
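
One likely source of the 'Unnamed: 0' columns filtered out above is that `DataFrame.to_csv` writes the index as a nameless first column by default, and `read_csv` then labels it 'Unnamed: 0'. Other 'Unnamed' columns may come from the original Excel sheets, so this is only a partial explanation. A minimal sketch, using a made-up two-row frame rather than the real mappings:

```python
import io

import pandas as pd

# Hypothetical stand-in for one mapping sheet (not real data).
df = pd.DataFrame({"Semiology": ["Aphasia", "Aura"], "105": [1, 0]})

# Default to_csv: the index is written as a first column with no header.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reread = pd.read_csv(with_index)
print(list(reread.columns))  # the nameless column comes back as 'Unnamed: 0'

# index=False: no extra column is written, so nothing needs filtering.
no_index = io.StringIO()
df.to_csv(no_index, index=False)
no_index.seek(0)
clean = pd.read_csv(no_index)
print(list(clean.columns))
```

Writing with `index=False` would also keep the CSV files closer to what a person would type by hand, which fits the goal of a human-readable format.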

thenineteen commented 4 years ago

It's currently failing because the read_csv of the dummy data for Aphasia is picking up one of the two semiologies that map to FL. We know this because new_All_combined_gifs in the test is given a value of 1 for GIFs such as 105, which have no other datapoints in the dummy data.

The postictal aphasia doesn't have a prior label and so, for two reasons, is probably excluded. The "1 FL aphasia" dummy semio is probably being picked up despite not having a prior label?

thenineteen commented 4 years ago

the point was to make this human readable and easier to follow changes to database and mappings

but this is a minor change in dummy database:

fepegar commented 4 years ago

> the point was to make this human readable and easier to follow changes to database and mappings

Also the repo size is getting quite large because of all the binary diffs of the Excel files, I think.

> but this is a minor change in dummy database:

What is the change? It looks like all lines have been edited.

thenineteen commented 4 years ago

That's the issue: it was a single small change, but instead looks like the entire csv file was edited, which makes it redundant?

I wasn't aware repo size is large

fepegar commented 4 years ago

> That's the issue: it was a single small change, but instead looks like the entire csv file was edited, which makes it redundant?

It looks like it wasn't that small: you changed all the lines.

> I wasn't aware repo size is large

30 MB without vs ~160 MB with the .git folder, where diffs are stored.
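
A comparison like this can be reproduced with a short script. This is only a sketch: `dir_size_bytes` is a hypothetical helper, and it sums file sizes rather than disk blocks, so the numbers will differ a little from what a file manager reports.

```python
import os


def dir_size_bytes(root, skip_git=False):
    """Sum the sizes of all regular files under root, optionally ignoring .git."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if skip_git and ".git" in dirnames:
            # Removing the entry in place stops os.walk from descending into it.
            dirnames.remove(".git")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


# Run from the root of a clone to compare the two figures.
repo = "."
print("without .git:", dir_size_bytes(repo, skip_git=True), "bytes")
print("with .git:   ", dir_size_bytes(repo), "bytes")
```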

thenineteen commented 4 years ago

I only changed at most a few "cells" in the csv
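
One way to check this is to compare the two versions cell by cell instead of line by line. A possible explanation, not confirmed anywhere in this thread, is that the file was re-saved with different line endings or quoting, which makes every line differ in a text diff even when only a few values changed. The sketch below uses made-up CSV content and a hypothetical helper, `changed_cells`:

```python
import csv
import io


def changed_cells(old_text, new_text):
    """Return (row, column, old, new) for every cell that differs.

    Rows and columns are compared pairwise; extra rows or columns in
    either file are ignored in this sketch.
    """
    old_rows = list(csv.reader(io.StringIO(old_text, newline="")))
    new_rows = list(csv.reader(io.StringIO(new_text, newline="")))
    changes = []
    for r, (old_row, new_row) in enumerate(zip(old_rows, new_rows)):
        for c, (old, new) in enumerate(zip(old_row, new_row)):
            if old != new:
                changes.append((r, c, old, new))
    return changes


# Made-up data: same table, one edited cell, but LF vs CRLF line endings.
old_csv = "Semiology,105,106\nAphasia,1,0\nAura,0,2\n"
new_csv = "Semiology,105,106\r\nAphasia,1,0\r\nAura,0,3\r\n"

# A raw line comparison flags every line because the line endings differ.
old_lines = old_csv.splitlines(keepends=True)
new_lines = new_csv.splitlines(keepends=True)
raw_changed = sum(a != b for a, b in zip(old_lines, new_lines))
print(raw_changed)  # 3

# A cell comparison finds only the one real edit.
print(changed_cells(old_csv, new_csv))  # [(2, 2, '2', '3')]
```

If the cell comparison reports only a handful of changes while the diff shows the whole file, the difference is in how the file was written out, not in the data.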