swharden / pyABF

pyABF is a Python package for reading electrophysiology data from Axon Binary Format (ABF) files
https://swharden.com/pyabf
MIT License

ABF with variable length sweeps: array size error #140

Closed swharden closed 1 month ago

swharden commented 1 month ago

Reported in an email from Verjinia:

I use pyabf quite a lot in my analysis, but I recently encountered a problem: I am unable to load some of my files, and I am not sure why. There seems to be a bug, since I am able to load the same file with Neo, for example.

Attached is the error message that I get and the file with which it occurs. Could you solve that? Why does it happen? Thank you.

I am using Python 3.11.9 and pyabf 2.3.8.

    "name": "ValueError",
    "message": "cannot reshape array of size 29997952 into shape (5999590,5)",
    "stack": "---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[9], line 1
----> 1 funcs_plotting_raw_traces.plt_trace_select_swps_cut_parts(fn = fn, chan = chan, swps_keep = swps_keep, \\
      2                                                         start_point_cut = 0, end_point_cut = first_point_keep)

File ~/laptop_D_17.01.2022/Schmitz_lab/code/Human-slice-scripts/src/ephys_analysis/funcs_plotting_raw_traces.py:766, in plt_trace_select_swps_cut_parts(fn, chan, swps_keep, start_point_cut, end_point_cut)
    759 def plt_trace_select_swps_cut_parts (fn, chan, swps_keep = 'all', start_point_cut = False, end_point_cut = False):
    760     ''' 
    761     arguemnts : fn - filename, chan - channel number from 1 to 8
    762     visualization of recordings with selection of channels and parts to cut
    763     accepts int for channel
    764     start_point_cut, end_point_cut - int, what aprt to cut from the trace
    765     '''
--> 766     trace = pyabf.ABF(fn)
    767     # fixing the naming
    768     if '_Ipatch' in trace.adcNames:

File ~/opt/anaconda3/envs/miniML_clean/lib/python3.11/site-packages/pyabf/abf.py:127, in ABF.__init__(self, abfFilePath, loadData, cacheStimulusFiles, stimulusFileFolder)
    125 # optionally load data from disk
    126 if self._preLoadData:
--> 127     self._loadAndScaleData(fb)
    128     self.setSweep(0)

File ~/opt/anaconda3/envs/miniML_clean/lib/python3.11/site-packages/pyabf/abf.py:474, in ABF._loadAndScaleData(self, fb)
    472 nRows = self.channelCount
    473 nCols = int(self.dataPointCount/self.channelCount)
--> 474 raw = np.reshape(raw, (nCols, nRows))
    475 raw = np.transpose(raw)
    477 # if data is int, scale it to float32 so we can scale it

File ~/opt/anaconda3/envs/miniML_clean/lib/python3.11/site-packages/numpy/_core/fromnumeric.py:328, in reshape(a, shape, order, newshape, copy)
    326 if copy is not None:
    327     return _wrapfunc(a, 'reshape', shape, order=order, copy=copy)
--> 328 return _wrapfunc(a, 'reshape', shape, order=order)

File ~/opt/anaconda3/envs/miniML_clean/lib/python3.11/site-packages/numpy/_core/fromnumeric.py:57, in _wrapfunc(obj, method, *args, **kwds)
     54     return _wrapit(obj, method, *args, **kwds)
     56 try:
---> 57     return bound(*args, **kwds)
     58 except TypeError:
     59     # A TypeError occurs if the object does have such a method in its
     60     # class, but its signature is not identical to that of NumPy's. This
   (...)
     64     # Call _wrapit from within the except clause to ensure a potential
     65     # exception has a traceback chain.
     66     return _wrapit(obj, method, *args, **kwds)

ValueError: cannot reshape array of size 29997952 into shape (5999590,5)"

swharden commented 1 month ago

This seems to be some type of off-by-one error.
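
The numbers in the error message can be checked directly. The following is a minimal sketch in plain Python that uses only the two values from the traceback (29997952 data points, 5 channels). The variable names are illustrative, and the division mirrors the `nCols` line shown in the traceback:

```python
# Values taken from the ValueError message in the traceback.
data_point_count = 29997952   # size of the raw array read from disk
channel_count = 5             # second dimension of the requested shape

# Same arithmetic as the nCols line in the traceback:
# int(dataPointCount / channelCount)
n_cols = int(data_point_count / channel_count)

# Number of elements the target shape (n_cols, channel_count) can hold
reshaped_size = n_cols * channel_count

# Points left over that do not fit into whole rows of 5 channels
leftover = data_point_count - reshaped_size

print(n_cols, reshaped_size, leftover)  # 5999590 29997950 2
```

The leftover is 2, so the data section holds two more samples than fit evenly into 5 channels, which is why the reshape fails.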

Quick fix

Load the ABF in ClampFit, save it as a new file with "save as type" set to "floating point", then load the new file with pyabf.

Additional information

This may be helpful for implementing a true fix

header variable              broken ABF     saved ABF
nDataFormat                  0 (int)        1 (float)
lEpisodesPerRun              30             29
lNumSamplesPerEpisode        1000000        999995
nWaveformSource              all 1s         all 0s
SynchArraySection.lLength    all 1000000    all 999995

swharden commented 1 month ago

It looks like the number of points per sweep is sourced here

https://github.com/swharden/pyABF/blob/2a0c1b0221428e21c961664b796d953eaae7746c/src/pyabf/abf.py#L344

For the file in question, the value returned differs from that of the file re-saved in ClampFit:

print(abf1._dataSection._entryCount) # 29997952 - file that won't load
print(abf2._dataSection._entryCount) # 28999855 - file that will load

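Loading works when lNumSamplesPerEpisode multiplied by the sweep count equals dataPointCount. For the file that won't load, no whole number of 1000000-sample sweeps adds up to 29997952.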
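
This condition can be checked against the header values reported earlier in this thread. The following is a rough sketch in plain Python, not pyabf code: `header_matches_data` is a hypothetical helper, and using lEpisodesPerRun as the sweep count is an assumption made here for illustration.

```python
# Header values copied from the comparison table earlier in this thread.
# Using lEpisodesPerRun as the sweep count is an assumption of this sketch.
files = {
    "broken ABF": {"samples_per_episode": 1000000, "episodes": 30, "entry_count": 29997952},
    "saved ABF": {"samples_per_episode": 999995, "episodes": 29, "entry_count": 28999855},
}


def header_matches_data(samples_per_episode, episodes, entry_count):
    """Return True when the header-implied point count equals the data section size."""
    return samples_per_episode * episodes == entry_count


results = {name: header_matches_data(**values) for name, values in files.items()}
print(results)  # {'broken ABF': False, 'saved ABF': True}
```

For the saved file, 999995 * 29 gives exactly 28999855. For the broken file, 1000000 * 30 gives 30000000, which is 2048 more than the 29997952 entries in the data section.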

swharden commented 1 month ago

Following up, this works:

abf = pyabf.ABF(R"24215056.abf", loadData=False)
abf.dataPointCount = abf._protocolSection.lNumSamplesPerEpisode * abf._headerV2.lActualEpisodes
for i in range(abf.sweepCount):
    abf.setSweep(i, channel=1)
    print(abf.sweepY)
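
The workaround above could be wrapped in a small helper. This is only a sketch: `patch_point_count` is a hypothetical name, not part of pyabf, and it relies on the same private attributes (`_protocolSection`, `_headerV2`) used in the snippet above, which may change between versions. The function accepts any object exposing those attributes, so it is exercised here with a stand-in object and made-up numbers rather than a real ABF file:

```python
from types import SimpleNamespace


def patch_point_count(abf):
    """Overwrite abf.dataPointCount with samples-per-episode times actual episodes.

    Hypothetical helper (not part of pyabf). Returns the previous value so the
    caller can see how far the header and the data section disagree.
    """
    previous = abf.dataPointCount
    abf.dataPointCount = (
        abf._protocolSection.lNumSamplesPerEpisode * abf._headerV2.lActualEpisodes
    )
    return previous


# Stand-in object with made-up numbers, used only to show the behaviour.
fake_abf = SimpleNamespace(
    dataPointCount=103,
    _protocolSection=SimpleNamespace(lNumSamplesPerEpisode=10),
    _headerV2=SimpleNamespace(lActualEpisodes=10),
)
previous = patch_point_count(fake_abf)
print(previous, fake_abf.dataPointCount)  # 103 100
```

With a real file, the helper would be called on an object created with `pyabf.ABF(path, loadData=False)` before the first `setSweep` call, as in the snippet above.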
swharden commented 1 month ago

I think this issue affects the original file because it uses variable-length sweeps, so pyabf may need to pull sweep lengths from the SynchArraySection (or not?)
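
If sweep lengths do have to come from a per-sweep table, the slicing could look roughly like the following. This is a generic sketch in plain Python, not pyabf's implementation: `split_sweeps` and the toy data are hypothetical, and the assumption that a list of per-sweep lengths (such as the SynchArraySection lLength values) describes consecutive runs in the data section has not been verified against the ABF format.

```python
from itertools import accumulate


def split_sweeps(samples, lengths):
    """Cut a flat sample sequence into consecutive sweeps of the given lengths.

    A sweep whose stated length runs past the end of the data is truncated
    rather than raising, so a short final sweep is still returned.
    """
    ends = list(accumulate(lengths))
    starts = [0] + ends[:-1]
    return [samples[start:end] for start, end in zip(starts, ends)]


samples = list(range(10))
lengths = [4, 4, 4]  # stated lengths add up to 12, but only 10 samples exist
sweeps = split_sweeps(samples, lengths)
print([len(s) for s in sweeps])  # [4, 4, 2]
```

Slicing by cumulative offsets avoids reshaping into a fixed (sweeps, points) grid, so a data section that is not an exact multiple of the sweep length no longer raises an error.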

verjiniaM commented 1 month ago

Hi Scott,

Thanks a lot for following up on this. Is the problem due to variable-length sweeps? I thought pClamp doesn’t save files with variable-length sweeps anyway. I’m confused about how this happened in the first place.