flatironinstitute / dendro-old

Analyze neuroscience data in the cloud
https://flatironinstitute.github.io/dendro-docs/
Apache License 2.0

Error when running MS5 with embargoed data #32

Closed: luiztauffer closed this issue 1 year ago

luiztauffer commented 1 year ago

I'm not 100% sure this is a protocaas error, but it might be related to the signed URL expiring, since this is embargoed data. The sorter process seems to have completed fine, and the error happened when writing to NWB.

Error traceback (cropped):

Starting mountainsort5 processor
Opening remote input file
:::::::::::::::::::: PROCESSOR ELAPSED TIME: 1.869 s
Creating input recording
:::::::::::::::::::: PROCESSOR ELAPSED TIME: 3.021 s
Filtering on
Whitening on
:::::::::::::::::::: PROCESSOR ELAPSED TIME: 99.340 s
Creating binary recording

write_binary_recording: 100%|##########| 301/301 [09:37<00:00,  1.92s/it]
/usr/local/lib/python3.11/site-packages/spikeinterface/core/binaryrecordingextractor.py:77: UserWarning: `num_chan` is to be deprecated in version 0.100, please use `num_channels` instead
  warnings.warn("`num_chan` is to be deprecated in version 0.100, please use `num_channels` instead")
:::::::::::::::::::: PROCESSOR ELAPSED TIME: 676.636 s
Setting up sorting parameters
Sorting scheme 1
Number of channels: 384
Number of timepoints: 9000043
Sampling frequency: 30000.145251 Hz
Channel 0: [16.  0.]
Channel 1: [48.  0.]
...
...
Channel 383: [  32. 3820.]
Loading traces
*** MS5 Elapsed time for load_traces: 0.000 seconds ***
Detecting spikes

Adjacency for detect spikes with channel radius 150
[[0, 1, 2, 3,... ..., 380, 381, 382, 383]]

m = 0 (nbhd size: 16)
m = 1 (nbhd size: 16)
m = 2 (nbhd size: 18)
...
...
m = 382 (nbhd size: 16)
m = 383 (nbhd size: 16)
*** MS5 Elapsed time for detect_spikes: 24.436 seconds ***
Removing duplicate times
*** MS5 Elapsed time for remove_duplicate_times: 0.004 seconds ***
Extracting 311186 snippets
*** MS5 Elapsed time for extract_snippets: 8.579 seconds ***
Computing PCA features with npca=1152
*** MS5 Elapsed time for compute_pca_features: 252.492 seconds ***
Isosplit6 clustering with npca_per_subdivision=10
*** MS5 Elapsed time for isosplit6_subdivision_method: 1129.154 seconds ***
Computing templates
*** MS5 Elapsed time for compute_templates: 109.424 seconds ***
Determining optimal alignment of templates
Align templates offsets:  [-1  0  1 31  0  0  2 15 36  2  1  1  0  3  4 -1  1  2  0 -1  1  0  0  4
  0  0  0 37  2  0  2  1  2  0 32  0  1  1  2  1  1  1  2  3  0  0  1  1
  0  3 -4  0  1  0  2  2  0 -1 -2 -2  3  3  1  0  2  0  0 -1  0  0  0  0
  0  0  0 -2 34 -1  0  0  0  0  0  0  0  0  0  0  1 -1  2  0 34  2  1  1
  1 -1  0  0 -2  0  0 -2  0 -1  0 -1  0  1  4 12  0 -1  1  3  2  1  1  1
  1  1  4 -2  1 -1 -5  0  0  0  0 -1  1  0  0 -2 -1 38  1  0  2  1  7  1
  2  2  0  2  0 39 -4 -1  0  0  0  2  2  0 55  0  0  0  0  1  0  0  1  2
  1  1 -1  0  0  0  1  0  0 -1 -1  0 -2  1  0  0  0  0  1  0  0  2  0  0
  0  1  5  3  0  0  0  3  1  1  2  1  1  4  0  0 -1 -1 -1 -1  0  0  0  2
  1  0 -2  0  0  0 -1  0 -3 10  0 -2 -2 -1 -2 -4  1  0  0  0  0 -1  1 -1
 -1 -1 -2  0 -2 -1 -2  0  0  0  1  0 -1 -1 -1  0  0  0  0 -1  0]
*** MS5 Elapsed time for align_templates: 64.713 seconds ***
Aligning snippets
*** MS5 Elapsed time for align_snippets: 28.420 seconds ***
Clustering aligned snippets
Computing PCA features with npca=1152
*** MS5 Elapsed time for compute_pca_features: 257.386 seconds ***
Isosplit6 clustering with npca_per_subdivision=10
*** MS5 Elapsed time for isosplit6_subdivision_method: 1081.633 seconds ***
Found 202 clusters
Computing templates
*** MS5 Elapsed time for compute_templates: 131.798 seconds ***
Offsetting times to peak
Offsets to peak: [  0   0   2   3   0   0   2 -20  -1   1   1   2  -1   0  -1   1   0   0
   0  -6  -6   0   0   0   1   0   0   4   2   1   1   1   1   1   0   0
   0   0   0  -2   0  -1   0   0   0   0   0  -1   0   1   0  12   4   2
   0  -2  10   0   0   1  -1   0   0   0   1   0   0   0   0   0   0   5
  -1   1   1   0   1   0  -2   1  -1  -5   1   1   2   2   1   0   2   0
   0   0   1   2   2  -1   2   0   2  -1  -4   0  -2  -1  -1   0  -1   0
  -1  -1   0   1   0   1   0   0   0   0  15   0   2   0   0   0   1   0
   0   0   0   3   3   1   1   0   0   0   0   0   1   0   0  -1   1   2
   1   2  -1   0  -1   0   0   0  -1   0   0   0   0   0   1   1   0   0
   2  -1  -2  -2  -2   0   0   0   4   0 -11   0  -1  15   0  -2  -1  -1
   0   1   0   0   0   0  -1  -1   0  -2  -1   0  -1  -1   0   1   0   0
  -1   0   0   0]
*** MS5 Elapsed time for determine_offsets_to_peak: 0.049 seconds ***
Sorting times
*** MS5 Elapsed time for sorting times: 0.006 seconds ***
Removing out of bounds times
*** MS5 Elapsed time for removing out of bounds times: 0.003 seconds ***
Reordering units
*** MS5 Elapsed time for reordering units: 0.034 seconds ***
Creating sorting object
*** MS5 Elapsed time for creating sorting object: 0.057 seconds ***
:::::::::::::::::::: PROCESSOR ELAPSED TIME: 3765.651 s
Writing output NWB file
Traceback (most recent call last):
  File "/app/main.py", line 255, in <module>
    app.run()
  File "/src/protocaas/python/protocaas/sdk/App.py", line 77, in run
    return self._run_job(job_id=JOB_ID, job_private_key=JOB_PRIVATE_KEY)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/protocaas/python/protocaas/sdk/App.py", line 190, in _run_job
    processor_class.run(context)
  File "/app/main.py", line 249, in run
    Mountainsort5Processor.run(context0)
  File "/app/main.py", line 177, in run
    with pynwb.NWBHDF5IO(file=f, mode='r', load_namespaces=True) as io:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/hdmf/utils.py", line 664, in func_call
    return func(args[0], **pargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pynwb/__init__.py", line 236, in __init__
    super().load_namespaces(tm, path, file=file_obj, driver=driver)
  File "/usr/local/lib/python3.11/site-packages/hdmf/utils.py", line 664, in func_call
    return func(args[0], **pargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/hdmf/backends/hdf5/h5tools.py", line 173, in load_namespaces
    return cls.__load_namespaces(namespace_catalog, namespaces, open_file_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/hdmf/backends/hdf5/h5tools.py", line 182, in __load_namespaces
    namespace_versions = cls.__get_namespaces(file_obj)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/hdmf/backends/hdf5/h5tools.py", line 250, in __get_namespaces
    spec_group = file_obj[file_obj.attrs[SPEC_LOC_ATTR]]
                 ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/usr/local/lib/python3.11/site-packages/h5py/_hl/group.py", line 353, in __getitem__
    oid = h5r.dereference(name, self.id)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5r.pyx", line 83, in h5py.h5r.dereference
KeyError: 'Unable to open object by token (bad object header version number)'
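For what it's worth, one way to sanity-check the expiry hypothesis without network access is to decode the lifetime baked into the signed URL itself. The sketch below assumes a standard AWS SigV4 pre-signed URL (query parameters `X-Amz-Date` and `X-Amz-Expires`); whether the URLs protocaas hands to the job carry these exact parameters is an assumption, and the URL shown is a made-up example.

```python
# Hypothetical diagnostic: compute when an AWS SigV4 pre-signed URL expires
# from its X-Amz-Date (signing time) and X-Amz-Expires (lifetime in seconds).
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

def presigned_url_expiry(url: str) -> datetime:
    """Return the expiry time (UTC) encoded in an S3 pre-signed URL."""
    params = parse_qs(urlparse(url).query)
    signed_at = datetime.strptime(
        params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(params["X-Amz-Expires"][0]))
    return signed_at + lifetime

# Made-up example URL, not the actual embargoed asset
url = ("https://example-bucket.s3.amazonaws.com/file.nwb"
       "?X-Amz-Date=20230915T120000Z&X-Amz-Expires=3600&X-Amz-Signature=abc")
expiry = presigned_url_expiry(url)
print(expiry.isoformat())  # 2023-09-15T13:00:00+00:00
```

If the computed expiry falls inside the ~63-minute window the processor ran for (the elapsed-time lines above total about 3766 s before the failure), that would support the expired-URL explanation.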
magland commented 1 year ago

@luiztauffer It seems that this is a problem in loading the namespaces. To troubleshoot, we need the URL of the file. I realize this is an embargoed dataset. Would you be able to add me as a collaborator so I can take a look?
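For context, the line that raised the `KeyError` is hdmf dereferencing the `.specloc` attribute (`file_obj[file_obj.attrs[SPEC_LOC_ATTR]]`), which holds an HDF5 object reference to the cached spec group. A minimal sketch of that access pattern against a small locally built file, to show what a healthy dereference looks like (the file name and group layout here are illustrative, not taken from the failing NWB file):

```python
# Sketch of the access pattern failing in hdmf's __get_namespaces:
# '.specloc' stores an HDF5 object reference; dereferencing it with
# file[ref] should resolve to the specifications group. With a stale
# or truncated remote read, this is where the KeyError surfaces.
import h5py

with h5py.File("demo.h5", "w") as f:
    spec_group = f.create_group("specifications")  # where NWB caches its schema
    f.attrs[".specloc"] = spec_group.ref           # object reference, as hdmf writes it

with h5py.File("demo.h5", "r") as f:
    resolved = f[f.attrs[".specloc"]]              # the dereference that raised above
    print(resolved.name)  # /specifications
```

Since the same dereference works on an intact file, the "bad object header version number" message suggests the bytes returned for that region of the remote file were wrong or stale, which is consistent with a signed URL expiring mid-job.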

magland commented 1 year ago

... or else upload the nwb file to a public location?