NickThorne123 / pydemo

Demonstration Python Project (FMRI)

Extract Transform Load Matlab data #9

Open NickThorne123 opened 1 year ago

NickThorne123 commented 1 year ago

The data files in Participant Data are tricky to work with in Python. Accessing a single value requires opaque, deeply nested indexing, e.g. age = data['data'][0,0]['individual'][0,0]['age'][0][0]
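For reference, here is a minimal sketch of where that indexing comes from and one way to soften it. It assumes the files are in a pre-v7.3 MATLAB format that scipy.io.loadmat can read, and it writes a tiny stand-in file first so it can run on its own. The file name participant_demo.mat, the helper load_participant and the age value are made up for illustration; only the data/individual/age nesting comes from the expression above.

```python
import os
import tempfile

from scipy.io import loadmat, savemat


def load_participant(path):
    """Hypothetical helper: load one .mat file as plain nested dicts."""
    # simplify_cells=True (SciPy >= 1.5) converts MATLAB structs to dicts
    # and squeezes 1x1 arrays, so no [0,0] indexing is needed afterwards.
    mat = loadmat(path, simplify_cells=True)
    return mat["data"]


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in file with the same nesting as the real data (value is made up).
    path = os.path.join(tmp, "participant_demo.mat")
    savemat(path, {"data": {"individual": {"age": 34}}})

    # Default loadmat: every struct level is wrapped in a 1x1 array.
    raw = loadmat(path)
    age_raw = raw["data"][0, 0]["individual"][0, 0]["age"][0][0]

    # Simplified load: ordinary dictionary access.
    data = load_participant(path)
    age = data["individual"]["age"]

print(int(age_raw), int(age))
```

If the files turn out to be MATLAB v7.3 (HDF5-based), loadmat cannot read them and this approach would not apply as written.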

To resolve this, we have created a Mat_data Python class in mat_data.py. (This issue was written retrospectively.)

There is also a class_open_mat.py script that opens all the data files and aggregates them into an array of Mat_data objects, which is then saved to a pickle file called 'agg_data.pkl'. All our data is now in a much easier-to-access form (about 250 KB on disk).
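As a rough illustration of the aggregate-and-pickle idea (not the actual contents of mat_data.py or class_open_mat.py), a sketch could look like this. The fields participant_id and age, the helper names and the sample records are all assumptions; the real class presumably carries more fields.

```python
import pickle
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Mat_data:
    """Flat view of one participant file (field names are assumptions)."""
    participant_id: str
    age: int


def aggregate(raw_records):
    """Turn (participant_id, nested_dict) pairs into a list of Mat_data."""
    return [
        Mat_data(participant_id=pid, age=int(nested["individual"]["age"]))
        for pid, nested in raw_records
    ]


def save_aggregate(records, path):
    """Write the whole list to a single pickle file."""
    with open(path, "wb") as fh:
        pickle.dump(records, fh)


def load_aggregate(path):
    """Read the list of Mat_data objects back from the pickle file."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Made-up stand-ins for the already-loaded participant files.
raw_records = [
    ("p01", {"individual": {"age": 23}}),
    ("p02", {"individual": {"age": 41}}),
]

with tempfile.TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "agg_data.pkl"
    save_aggregate(aggregate(raw_records), pkl_path)
    restored = load_aggregate(pkl_path)

ages = [record.age for record in restored]
print(ages)
```

After loading, a value is just record.age rather than a chain of index lookups. Note that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.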

The script process_agg.py then demonstrates using this data: it opens the aggregated data and generates the age histogram.
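A sketch of the histogram step, assuming the ages have already been pulled out of the aggregated records. This is a text-only stand-in using the standard library rather than whatever plotting process_agg.py actually does; the function age_histogram, the 10-year bin width and the sample ages are made up for illustration.

```python
from collections import Counter


def age_histogram(ages, bin_width=10):
    """Count ages per bin, keyed by the lower edge of each bin."""
    counts = Counter((age // bin_width) * bin_width for age in ages)
    return dict(sorted(counts.items()))


# Made-up ages standing in for [record.age for record in records].
ages = [23, 27, 31, 38, 41, 45, 45, 62]
hist = age_histogram(ages)

# Print one row per non-empty bin, e.g. the 20-29 bin with one '#' per person.
for start, count in hist.items():
    print(f"{start:>3}-{start + 9:<3} {'#' * count}")
```

Empty bins are simply omitted here; a plotting library would normally show them as gaps.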

I dug into HDF5 files (the format that h5py reads) a bit more, and they're designed for multi-terabyte data. At that scale we wouldn't (couldn't) aggregate everything into a single pickle file like this, for performance reasons. In this particular case, though, the data is small enough that it makes sense.

NickThorne123 commented 1 year ago

ToDo

charbiso commented 1 year ago

Just got around to looking at this script! It looks great and I'll give it a go at some stage. For now, though, I just converted all the data I needed from .mat to CSV within MATLAB rather than messing around with the different file types.