flatironinstitute / CaImAn

Computational toolbox for large scale Calcium Imaging Analysis, including movie handling, motion correction, source extraction, spike deconvolution and result visualization.
https://caiman.readthedocs.io
GNU General Public License v2.0

problem running nwb tutorial #962

Closed bendichter closed 2 years ago

bendichter commented 2 years ago

For better support, please use the template below to submit your issue. When your issue gets resolved please remember to close it.

Sometimes errors while running CNMF occur during parallel processing, which prevents the log from providing a meaningful error message. Please reproduce your error with dview=None set.

If you need to upgrade CaImAn follow the instructions given in the documentation.

(You can get the CaImAn version by creating a params object and then typing params.data['caiman_version']. If the field doesn't exist, type N/A and consider upgrading.)
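
A minimal sketch of that version check (using the import path that appears later in this thread; falling back to 'N/A' via dict.get is my addition):

from caiman.source_extraction.cnmf import params

# Create a default params object and read the recorded CaImAn version;
# print 'N/A' if the field doesn't exist (older versions).
opts = params.CNMFParams()
print(opts.data.get('caiman_version', 'N/A'))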

I get an error when running the notebook:

opts_dict = {'fnames': fnames,
            'fr': fr,
            'decay_time': decay_time,
            'dxy': dxy,
            'strides': strides,
            'overlaps': overlaps,
            'max_shifts': max_shifts,
            'max_deviation_rigid': max_deviation_rigid,
            'pw_rigid': pw_rigid,
            'border_nan': 'copy',
            'var_name_hdf5': 'acquisition/mov',    
            'p': p,
            'nb': gnb,
            'rf': rf,
            'K': K, 
            'stride': stride_cnmf,
            'method_init': method_init,
            'rolling_sum': True,
            'only_init': True,
            'ssub': ssub,
            'tsub': tsub,
            'gSig': gSig,
            'merge_thr': merge_thr, 
            'min_SNR': min_SNR,
            'rval_thr': rval_thr,
            'use_cnn': True,
            'min_cnn_thr': cnn_thr,
            'cnn_lowest': cnn_lowest}

opts = params.CNMFParams(params_dict=opts_dict)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Input In [6], in <cell line: 30>()
      1 opts_dict = {'fnames': fnames,
      2             'fr': fr,
      3             'decay_time': decay_time,
   (...)
     27             'min_cnn_thr': cnn_thr,
     28             'cnn_lowest': cnn_lowest}
---> 30 opts = params.CNMFParams(params_dict=opts_dict)

File ~/dev/CaImAn/caiman/source_extraction/cnmf/params.py:883, in CNMFParams.__init__(self, fnames, dims, dxy, border_pix, del_duplicates, low_rank_background, memory_fact, n_processes, nb_patch, p_ssub, p_tsub, remove_very_bad_comps, rf, stride, check_nan, n_pixels_per_process, k, alpha_snmf, center_psf, gSig, gSiz, init_iter, method_init, min_corr, min_pnr, gnb, normalize_init, options_local_NMF, ring_size_factor, rolling_length, rolling_sum, ssub, ssub_B, tsub, block_size_spat, num_blocks_per_run_spat, block_size_temp, num_blocks_per_run_temp, update_background_components, method_deconvolution, p, s_min, do_merge, merge_thresh, decay_time, fr, min_SNR, rval_thr, N_samples_exceptionality, batch_update_suff_stat, expected_comps, iters_shape, max_comp_update_shape, max_num_added, min_num_trial, minibatch_shape, minibatch_suff_stat, n_refit, num_times_comp_updated, simultaneously, sniper_mode, test_both, thresh_CNN_noisy, thresh_fitness_delta, thresh_fitness_raw, thresh_overlap, update_freq, update_num_comps, use_dense, use_peak_max, only_init_patch, var_name_hdf5, max_merge_area, use_corr_img, params_dict)
    844 self.motion = {
    845     'border_nan': 'copy',               # flag for allowing NaN in the boundaries
    846     'gSig_filt': None,                  # size of kernel for high pass spatial filtering in 1p data
   (...)
    864     'indices': (slice(None), slice(None))  # part of FOV to be corrected
    865 }
    867 self.ring_CNN = {
    868     'n_channels' : 2,                   # number of "ring" kernels   
    869     'use_bias' : False,                 # use bias in the convolutions
   (...)
    880     'reuse_model': False                # reuse an already trained model
    881 }
--> 883 self.change_params(params_dict)

File ~/dev/CaImAn/caiman/source_extraction/cnmf/params.py:1065, in CNMFParams.change_params(self, params_dict, verbose)
   1063     if flag:
   1064         logging.warning('No parameter {0} found!'.format(k))
-> 1065 self.check_consistency()
   1066 return self

File ~/dev/CaImAn/caiman/source_extraction/cnmf/params.py:892, in CNMFParams.check_consistency(self)
    890 self.data['last_commit'] = '-'.join(caiman.utils.utils.get_caiman_version())
    891 if self.data['dims'] is None and self.data['fnames'] is not None:
--> 892     self.data['dims'] = get_file_size(self.data['fnames'], var_name_hdf5=self.data['var_name_hdf5'])[0]
    893 if self.data['fnames'] is not None:
    894     if isinstance(self.data['fnames'], str):

File ~/dev/CaImAn/caiman/source_extraction/cnmf/utilities.py:1085, in get_file_size(file_name, var_name_hdf5)
   1083 elif isinstance(file_name, list):
   1084     if len(file_name) == 1:
-> 1085         dims, T = get_file_size(file_name[0], var_name_hdf5=var_name_hdf5)
   1086     else:
   1087         dims, T = zip(*[get_file_size(fn, var_name_hdf5=var_name_hdf5)
   1088             for fn in file_name])

File ~/dev/CaImAn/caiman/source_extraction/cnmf/utilities.py:1030, in get_file_size(file_name, var_name_hdf5)
   1027         else:
   1028             logging.error('The file does not contain a variable' +
   1029                           'named {0}'.format(var_name_hdf5))
-> 1030             raise Exception('Variable not found. Use one of the above')
   1031     T, dims = siz[0], siz[1:]
   1032 elif extension in ('.n5', '.zarr'):

Exception: Variable not found. Use one of the above

I can't figure out what this exception means.

pgunn commented 2 years ago

What are the names and types of the objects in your hdf5 file? Normally this error means that the file contains more than one object, and that you set var_name_hdf5 to something that is not present in the file.

(sorry if the wording of the error is not clear; I will think about ways to better explain this in the message)
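
As a generic aside (plain h5py, nothing CaImAn-specific; the filename is a placeholder), you can print every object in the file along with its type like this:

import h5py

# Walk the whole HDF5/NWB hierarchy and print each object's path and
# whether it is a Group or a Dataset.
with h5py.File('your_file.nwb', 'r') as f:
    f.visititems(lambda name, obj: print(name, type(obj).__name__))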

bendichter commented 2 years ago

Thanks for the help, @pgunn

The Dataset that contains the 2p data is at /acquisition/TwoPhotonSeries/data. I tried changing this dict with

            'var_name_hdf5': 'acquisition/TwoPhotonSeries/data',    

but now I get a new error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [25], in <cell line: 30>()
      1 opts_dict = {'fnames': fnames,
      2             'fr': fr,
      3             'decay_time': decay_time,
   (...)
     27             'min_cnn_thr': cnn_thr,
     28             'cnn_lowest': cnn_lowest}
---> 30 opts = params.CNMFParams(params_dict=opts_dict)

File ~/dev/CaImAn/caiman/source_extraction/cnmf/params.py:883, in CNMFParams.__init__(self, fnames, dims, dxy, border_pix, del_duplicates, low_rank_background, memory_fact, n_processes, nb_patch, p_ssub, p_tsub, remove_very_bad_comps, rf, stride, check_nan, n_pixels_per_process, k, alpha_snmf, center_psf, gSig, gSiz, init_iter, method_init, min_corr, min_pnr, gnb, normalize_init, options_local_NMF, ring_size_factor, rolling_length, rolling_sum, ssub, ssub_B, tsub, block_size_spat, num_blocks_per_run_spat, block_size_temp, num_blocks_per_run_temp, update_background_components, method_deconvolution, p, s_min, do_merge, merge_thresh, decay_time, fr, min_SNR, rval_thr, N_samples_exceptionality, batch_update_suff_stat, expected_comps, iters_shape, max_comp_update_shape, max_num_added, min_num_trial, minibatch_shape, minibatch_suff_stat, n_refit, num_times_comp_updated, simultaneously, sniper_mode, test_both, thresh_CNN_noisy, thresh_fitness_delta, thresh_fitness_raw, thresh_overlap, update_freq, update_num_comps, use_dense, use_peak_max, only_init_patch, var_name_hdf5, max_merge_area, use_corr_img, params_dict)
    844 self.motion = {
    845     'border_nan': 'copy',               # flag for allowing NaN in the boundaries
    846     'gSig_filt': None,                  # size of kernel for high pass spatial filtering in 1p data
   (...)
    864     'indices': (slice(None), slice(None))  # part of FOV to be corrected
    865 }
    867 self.ring_CNN = {
    868     'n_channels' : 2,                   # number of "ring" kernels   
    869     'use_bias' : False,                 # use bias in the convolutions
   (...)
    880     'reuse_model': False                # reuse an already trained model
    881 }
--> 883 self.change_params(params_dict)

File ~/dev/CaImAn/caiman/source_extraction/cnmf/params.py:1065, in CNMFParams.change_params(self, params_dict, verbose)
   1063     if flag:
   1064         logging.warning('No parameter {0} found!'.format(k))
-> 1065 self.check_consistency()
   1066 return self

File ~/dev/CaImAn/caiman/source_extraction/cnmf/params.py:892, in CNMFParams.check_consistency(self)
    890 self.data['last_commit'] = '-'.join(caiman.utils.utils.get_caiman_version())
    891 if self.data['dims'] is None and self.data['fnames'] is not None:
--> 892     self.data['dims'] = get_file_size(self.data['fnames'], var_name_hdf5=self.data['var_name_hdf5'])[0]
    893 if self.data['fnames'] is not None:
    894     if isinstance(self.data['fnames'], str):

File ~/dev/CaImAn/caiman/source_extraction/cnmf/utilities.py:1085, in get_file_size(file_name, var_name_hdf5)
   1083 elif isinstance(file_name, list):
   1084     if len(file_name) == 1:
-> 1085         dims, T = get_file_size(file_name[0], var_name_hdf5=var_name_hdf5)
   1086     else:
   1087         dims, T = zip(*[get_file_size(fn, var_name_hdf5=var_name_hdf5)
   1088             for fn in file_name])

File ~/dev/CaImAn/caiman/source_extraction/cnmf/utilities.py:1022, in get_file_size(file_name, var_name_hdf5)
   1020 elif var_name_hdf5 in f:
   1021     if extension == '.nwb':
-> 1022         siz = f[var_name_hdf5]['data'].shape
   1023     else:
   1024         siz = f[var_name_hdf5].shape

File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()

File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()

File ~/opt/miniconda3/envs/caiman/lib/python3.9/site-packages/h5py/_hl/dataset.py:506, in Dataset.__getitem__(self, args)
    502     new_dtype = readtime_dtype(new_dtype, names)
    503 else:
    504     # This is necessary because in the case of array types, NumPy
    505     # discards the array information at the top level.
--> 506     new_dtype = readtime_dtype(self.id.dtype, names)
    507 mtype = h5t.py_create(new_dtype)
    509 # === Special-case region references ====

File ~/opt/miniconda3/envs/caiman/lib/python3.9/site-packages/h5py/_hl/dataset.py:48, in readtime_dtype(basetype, names)
     45     return basetype
     47 if basetype.names is None:  # Names provided, but not compound
---> 48     raise ValueError("Field names only allowed for compound types")
     50 for name in names:  # Check all names are legal
     51     if not name in basetype.names:

ValueError: Field names only allowed for compound types
bendichter commented 2 years ago
[image: screenshot attachment]
pgunn commented 2 years ago

I haven't worked with files with a deep hierarchy for a while; do you have a small file like this that you could share with me?

(alternatively, it would be pretty easy to write a small bit of custom code that would pull just the data you intend to process out and drop it into an hdf5 file without any of the rest of the structure)

bendichter commented 2 years ago

Once we fix this, would it be possible to incorporate this tutorial into the CI?

https://drive.google.com/file/d/1GOczZsy9ITzr5hX41kLdkhmcvckM6n73/view?usp=sharing

Uploading now; it should be finished in 5 minutes.

pgunn commented 2 years ago

Try this value for var_name_hdf5:

'var_name_hdf5': 'acquisition/TwoPhotonSeries'

We can better document this once we've figured it out.
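
For context on why this should help: in the traceback above, get_file_size takes the .nwb branch and does f[var_name_hdf5]['data'].shape, so var_name_hdf5 should name the TwoPhotonSeries group and CaImAn appends the data Dataset itself. With the full path you used, f[var_name_hdf5] is already a Dataset, and indexing a Dataset with a string is interpreted by h5py as compound-field access, hence "Field names only allowed for compound types". A minimal sketch of both lookups (placeholder filename):

import h5py

# 'your_file.nwb' is a placeholder for the NWB file in question.
with h5py.File('your_file.nwb', 'r') as f:
    # What get_file_size does for .nwb (per the traceback): group, then Dataset.
    siz = f['acquisition/TwoPhotonSeries']['data'].shape
    T, dims = siz[0], siz[1:]
    print(T, dims)
    # With the full path, f[...] is already a Dataset, so ['data'] becomes
    # compound-field access and raises the ValueError seen above:
    # f['acquisition/TwoPhotonSeries/data']['data']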

bendichter commented 2 years ago

That worked. Now:

%%capture
#%% Run piecewise-rigid motion correction using NoRMCorre
mc.motion_correct(save_movie=True)
m_els = cm.load(mc.fname_tot_els)
border_to_0 = 0 if mc.border_nan is 'copy' else mc.border_to_0 
    # maximum shift to be used for trimming against NaNs
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [32], in <cell line: 2>()
      1 #%% Run piecewise-rigid motion correction using NoRMCorre
----> 2 mc.motion_correct(save_movie=True)
      3 m_els = cm.load(mc.fname_tot_els)
      4 border_to_0 = 0 if mc.border_nan is 'copy' else mc.border_to_0

File ~/dev/CaImAn/caiman/motion_correction.py:247, in MotionCorrect.motion_correct(self, template, save_movie)
    245 for _ in range(400):
    246     try:
--> 247         mi = min(mi, next(iterator).min()[()])
    248     except StopIteration:
    249         break

File ~/dev/CaImAn/caiman/base/movies.py:2263, in load_iter(file_name, subindices, var_name_hdf5, outtype)
   2260 Y = f.get('acquisition/' + var_name_hdf5 + '/data'
   2261            if extension == '.nwb' else var_name_hdf5)
   2262 if subindices is None:
-> 2263     for y in Y:
   2264         yield y.astype(outtype)
   2265 else:

TypeError: 'NoneType' object is not iterable
pgunn commented 2 years ago

Strange that that code expects a different internal structure.
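
Reading the traceback: for .nwb files, load_iter builds the path as 'acquisition/' + var_name_hdf5 + '/data', so with var_name_hdf5 set to 'acquisition/TwoPhotonSeries' it looks up 'acquisition/acquisition/TwoPhotonSeries/data', which doesn't exist; h5py's Group.get() returns None rather than raising, which is where "'NoneType' object is not iterable" comes from. A quick sketch of the mismatch (placeholder filename):

import h5py

with h5py.File('your_file.nwb', 'r') as f:
    # load_iter's lookup, per the traceback above:
    bad = f.get('acquisition/' + 'acquisition/TwoPhotonSeries' + '/data')
    print(bad)          # None -> 'for y in Y' raises TypeError
    good = f.get('acquisition/' + 'TwoPhotonSeries' + '/data')
    print(good.shape)   # the movie Dataset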

Do you mind giving this a go?

#!/usr/bin/env python

import argparse
import code # for code.interact(local=dict(globals(), **locals()) ) debugging
import os
import h5py

######################
# nwb_to_flat
#
# Pulls a single dataset out of an nwb file and writes it into a flat
# hdf5 file, dropping the rest of the structure.

def main():
    cfg = handle_args()
    if os.path.isfile(cfg.dest):
        raise Exception("Destination file already exists")
    with h5py.File(cfg.src, 'r') as src:
        if cfg.path not in src:
            raise Exception('Could not find path within the source file')
        with h5py.File(cfg.dest, 'w') as dest:
            # Note: this reads the whole dataset into memory before writing.
            dest.create_dataset('data', data=src[cfg.path])

def handle_args():
    parser = argparse.ArgumentParser(description="This pulls h5py data out of a nwb file and puts it into a flat hdf5 file")
    parser.add_argument("src",  help="Source nwb file")
    parser.add_argument("path", help="Data path in source file")
    parser.add_argument("dest", help="Dest h5py file")
    ret = parser.parse_args()
    return ret

#####
main()

Invoke as something like ./nwb_to_flat Downloads/Sue_2x_3000_40_-46.nwb acquisition/TwoPhotonSeries/data new.h5

I'll need to figure out what data structure the code should look for (I might adjust that path logic in caiman/base/movies.py to not stick 'acquisition/' at the front of the path like that), but the script above sidesteps all of that and may be what you need.

bendichter commented 2 years ago

In NWB, the source data should be in /acquisition/TwoPhotonSeries/data. The code you gave me worked and spit out a new h5 file with just the data, but I don't think this would be a great solution for real 2p data, since it tends to be quite large.
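
One generic h5py alternative that might avoid the copy (an aside on my part, not something this thread confirms CaImAn supports): write a tiny wrapper file whose 'data' entry is an external link pointing into the original NWB file, so no pixels are duplicated:

import h5py

# 'wrapper.h5' and 'your_file.nwb' are placeholder filenames.
with h5py.File('wrapper.h5', 'w') as dest:
    dest['data'] = h5py.ExternalLink('your_file.nwb',
                                     'acquisition/TwoPhotonSeries/data')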

bendichter commented 2 years ago

Changing to

            'var_name_hdf5': 'TwoPhotonSeries',

appears to have worked (which matches the 'acquisition/' + var_name_hdf5 + '/data' lookup in the load_iter traceback above).

pgunn commented 2 years ago

Tomorrow I'll add a readme for working with nwb files, to make it clearer how people using Caiman can do this; the feature was added some time back and never got great docs, so it's not surprising that people don't know how to use it out of the box.

bendichter commented 2 years ago

OK, but what we really need is for this notebook to be fixed.

pgunn commented 2 years ago

What changes do you suggest to the notebook? Maybe I've missed something but I don't see how the notebook is a problem.

bendichter commented 2 years ago

The only line that needed to be changed for me was

'var_name_hdf5': 'acquisition/mov',

Maybe I had old example data. I'll look into that.

pgunn commented 2 years ago

Oh, sorry! I understand what you're talking about now; this isn't a demo in a notebook, it's general/demo_pipeline_NWB.py. My bad; I'll dig into this tomorrow and make the needed adjustments.

pgunn commented 2 years ago

I just did a fresh checkout of the sources; it worked out-of-the-box for me (the value of var_name_hdf5 is TwoPhotonSeries, in the version of the demo in the source tree). That hasn't changed since it was initially committed.

My guess is that you may have adjusted it to tell it to look at your data?

bendichter commented 2 years ago

I'm looking at this one: https://github.com/flatironinstitute/CaImAn/blob/master/use_cases/NWB/demo_pipeline_nwb.ipynb

pgunn commented 2 years ago

Ah. I haven't been looking in that directory at all (or maintaining it); it's kind of an attic of neglected code. I will adjust that file (but know that anything you find in use_cases may have suffered from neglect).

pgunn commented 2 years ago

The issue is now fixed in the dev branch.

bendichter commented 2 years ago

thanks!