abuzarmahmood / blech_clust

GNU General Public License v3.0

Issues with EMG data (or lack thereof) in blech_make_arrays.py #206

Open Mraymon5 opened 3 weeks ago

Mraymon5 commented 3 weeks ago

Getting some mildly weird behavior from blech_make_arrays.py:

=== No trials were cutoff ===
Using durations ::: [2000, 5000]
Sorted units found ==> Making spike trains
Creating spike-trains for dig_in_13
Creating spike-trains for dig_in_9
Creating emg-trials for dig_in_13
Traceback (most recent call last):
  File "blech_make_arrays.py", line 460, in <module>
    emg_nodes,
NameError: name 'emg_nodes' is not defined
Closing remaining open files:/home/ramartin/Documents/MAR_Data/MR03/MR03_BAT_Tastes_Day6_240526_131433/MR03_BAT_Tastes_Day6_240526_131433_repacked.h5...done

The problem is that I have no EMG data, but for some reason I still have an empty group emg_nodes in my HDF5 file. So make_arrays sees that it's there and tries to open the nodes, but there aren't any, which causes an error.

As far as I can tell, this doesn't actually do anything too bad; the arrays for the other parts of the HDF5 file seem to be made without issue, I think.

I have a "fix" for this: In line 437, there's a check for whether the HDF5 file contains /raw_emg. If that check clears, I added an additional check to see if there are any nodes in /raw_emg, and to produce a specific message if not.
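For reference, the extra check could look something like the sketch below. This is not the exact patch: `has_emg_data` is a hypothetical helper name, and it assumes `hf5` is an already-open PyTables file handle (only the `in` test and `list_nodes` are used).

```python
def has_emg_data(hf5, emg_path="/raw_emg"):
    """Return True only if emg_path exists in the open HDF5 file AND
    contains at least one child node.

    `hf5` is assumed to be an open PyTables File (e.g. from tables.open_file).
    The helper name and the default path are illustrative assumptions.
    """
    # Group missing entirely: no EMG data.
    if emg_path not in hf5:
        return False
    # Group present but empty (the case described above): also no EMG data.
    return len(hf5.list_nodes(emg_path)) > 0


# Hypothetical usage inside a processing script:
#
#   import tables
#   with tables.open_file(hdf5_name, "r") as hf5:
#       if has_emg_data(hf5):
#           ...  # build the EMG trial arrays
#       else:
#           print("No EMG nodes found in /raw_emg, skipping EMG arrays")
```

With a check like this, an empty /raw_emg group is treated the same as a missing one, so the EMG branch is skipped instead of failing partway through.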

I feel like that's just a band-aid, though, because the real issue here is that I shouldn't have /raw_emg in my HDF5 file at all, if I'm reading the logic of the scripts correctly.

Mraymon5 commented 3 weeks ago

For my own curiosity, I deleted the HDF5 file for the data I'd been processing and ran everything again starting from blech_clust.py. I got the same behavior, though my patch handled it.

abuzarmahmood commented 3 weeks ago

Haha, clearly we need an ephys-only dataset for testing (made issue #209). You are right that your fix is currently a band-aid: the rest of the logic for EMG processing in blech_make_arrays is messed up, and your new if-logic branch just sidesteps that problem. I'll create a fix first thing tomorrow.

Mraymon5 commented 1 week ago

I think I'm starting to find the roots of some of these issues:

blech_make_arrays.py checks for EMG data by searching the hdf5 file for a group called /raw_emg. It does this a couple of times: line 301 and line 444.

I think the assumption here is that datasets that don't have EMG data shouldn't have this group in the hdf5. However, in blech_clust.py, when the hdf5 file is being created [lines 59-74], /raw_emg is created by default, and just isn't populated for non-EMG data.

I think this suggests two possible solutions. When generating the hdf5 file, blech_clust.py could check for EMG data FIRST, and only create that group if there actually is EMG data. I think the .info file created by blech_exp_info.py may already have that information.

Alternatively, it may be fine to create it and leave it empty, as long as that's an understood behavior that's accounted for in later checks. Later scripts could either test the /raw_emg group's length, à la my band-aid fix, or maybe look at the .info data file instead. There may be a similar situation in #217.
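A rough sketch of the .info-based check, in case it helps the discussion. Everything about the file layout here is an assumption on my part: that the .info file is plain JSON, that it has an "emg" entry, and that this entry holds an "electrodes" list. The helper names `load_info` and `info_has_emg` are hypothetical too.

```python
import json


def load_info(info_path):
    """Read the .info file, ASSUMING it is plain JSON (not confirmed here)."""
    with open(info_path, "r") as f:
        return json.load(f)


def info_has_emg(info_dict, emg_key="emg", electrodes_key="electrodes"):
    """Return True if the info dict lists at least one EMG electrode.

    The key names are assumptions about the .info layout; adjust them to
    whatever blech_exp_info.py actually writes.
    """
    # A missing or null "emg" entry is treated as "no EMG".
    emg_info = info_dict.get(emg_key) or {}
    # A missing, null, or empty electrode list is also "no EMG".
    return len(emg_info.get(electrodes_key) or []) > 0


# Hypothetical usage when the HDF5 file is first created:
#
#   info = load_info(info_path)
#   if info_has_emg(info):
#       hf5.create_group("/", "raw_emg")   # only create the group when needed
```

The same `info_has_emg` test could also be used by later scripts instead of inspecting the HDF5 file, which would keep the "is there EMG?" decision in one place.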