flatironinstitute / CaImAn

Computational toolbox for large scale Calcium Imaging Analysis, including movie handling, motion correction, source extraction, spike deconvolution and result visualization.
https://caiman.readthedocs.io
GNU General Public License v2.0
630 stars 367 forks

Error saving hdf5 after cnmfe #947

Closed jesusdpa1 closed 2 years ago

jesusdpa1 commented 2 years ago


Cores: 16
Memory: 128
GPU: 2080ti

Recording type: 1p
Methods: Inscopix
Data transformation: Inscopix Python IsxtoTiff
Motion Correction: Caiman
Cudatoolkit: Conda installation

Video Size: ~44,000 frames

Motion correction worked smoothly and CNMF-E seems to run without any problems, but an error occurs while saving the extracted components.

USING MODEL:/home/jesus.penalozaa/caiman_data/model/cnn_model.json
3/3 [==============================] - 50s 77ms/step
     1647560 [utilities.py:        detrend_df_f():349][36442] Background components not present. Results should not be interpreted as DF/F normalized but only as detrended.
Traceback (most recent call last):
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 183, in <module>
    main()
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 172, in main
    cnm.save(path_to_save)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/source_extraction/cnmf/cnmf.py", line 669, in save
    save_dict_to_hdf5(self.__dict__, filename)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 431, in save_dict_to_hdf5
    recursively_save_dict_contents_to_group(h5file, subdir, dic)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 526, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item.__dict__)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 511, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 498, in recursively_save_dict_contents_to_group
    raise ValueError('Error while saving {}.'.format(key))
ValueError: Error while saving var_name_hdf5.

Full log file:

cnfme_15051210.log

[image: PNR image] [image]

jesusdpa1 commented 2 years ago

Hi, following up on this error. It seems that the CNMF-E step is where the issue might be. I recorded again just to make sure that the neurons were dynamic, and ran the pipeline, getting the same error:

USING MODEL:/home/jesus.penalozaa/caiman_data/model/cnn_model.json
1/1 [==============================] - 27s 27s/step
      187778 [utilities.py:        detrend_df_f():349][72482] Background components not present. Results should not be interpreted as DF/F normalized but only as detrended.
Traceback (most recent call last):
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 183, in <module>
    main()
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 172, in main
    cnm.save(path_to_save)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/source_extraction/cnmf/cnmf.py", line 669, in save
    save_dict_to_hdf5(self.__dict__, filename)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 431, in save_dict_to_hdf5
    recursively_save_dict_contents_to_group(h5file, subdir, dic)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 526, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item.__dict__)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 511, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 498, in recursively_save_dict_contents_to_group
    raise ValueError('Error while saving {}.'.format(key))
ValueError: Error while saving var_name_hdf5.

Then I tested with a previously recorded and processed file, for which an older version had detected 166 components, and I got 0 components detected using the newer version:

Caiman 1.9.4

Set parameter alpha to: original_alpha * np.sqrt(n_samples).
  warnings.warn(
/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/sklearn/linear_model/_base.py:133: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:

from sklearn.pipeline import make_pipeline

model = make_pipeline(StandardScaler(with_mean=False), LassoLars())

If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:

kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)

Set parameter alpha to: original_alpha * np.sqrt(n_samples).
  warnings.warn(
/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/scipy/sparse/dia.py:342: RuntimeWarning: divide by zero encountered in remainder
  c = np.arange(num_rows, dtype=np.intc) - (offsets % max_dim)[:, None]
/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/numpy/core/fromnumeric.py:43: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  result = getattr(asarray(obj), method)(*args, **kwds)
     3058697 [components_evaluation.py:classify_components_ep():243][22236] Component 26 is only active jointly with neighboring components. Space correlation calculation might be unreliable.
     3059411 [components_evaluation.py:classify_components_ep():243][22236] Component 36 is only active jointly with neighboring components. Space correlation calculation might be unreliable.
2021-11-24 21:16:31.811852: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-11-24 21:16:31.941912: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-24 21:16:32.100122: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2021-11-24 21:16:36.044982: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-11-24 21:16:36.369105: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2245615000 Hz
USING MODEL:/home/jesus.penalozaa/caiman_data/model/cnn_model.json
7/7 [==============================] - 5s 15ms/step
     3071411 [utilities.py:        detrend_df_f():349][20381] Background components not present. Results should not be interpreted as DF/F normalized but only as detrended.
     3072922 [stats.py:       df_percentile():219][20381] Percentile computation failed. Duplicating and trying again.
Oops!
 *****
Number of total components:  166
Traceback (most recent call last):
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 177, in <module>
    main()
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 170, in main
    print('Number of accepted components: ', len(cnm.estimates.idx_components))
TypeError: object of type 'NoneType' has no len()
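
The TypeError above suggests that `cnm.estimates.idx_components` was still `None` when the script tried to take its length. A minimal sketch of a guard for the print statement follows, assuming the attribute stays `None` until component evaluation has populated it. The helper name `count_accepted` is hypothetical, and `SimpleNamespace` only stands in for the real estimates object.

```python
from types import SimpleNamespace


def count_accepted(estimates):
    """Return the number of accepted components, or None if not evaluated yet.

    `estimates` is any object with an `idx_components` attribute; this is a
    stand-in for `cnm.estimates`, not CaImAn's own API.
    """
    idx = getattr(estimates, "idx_components", None)
    if idx is None:
        # Distinguish "not evaluated" from "evaluated, zero accepted".
        return None
    return len(idx)


# Stand-in objects used only for illustration.
not_evaluated = SimpleNamespace(idx_components=None)
evaluated = SimpleNamespace(idx_components=[0, 3, 7])

for est in (not_evaluated, evaluated):
    n = count_accepted(est)
    if n is None:
        print("Components have not been evaluated yet")
    else:
        print("Number of accepted components:", n)
```

Returning `None` rather than 0 keeps a run that never evaluated its components from being mistaken for one that rejected all of them.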

Caiman 1.9.7

/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/scipy/sparse/_dia.py:338: RuntimeWarning: divide by zero encountered in remainder
  c = np.arange(num_rows, dtype=np.intc) - (offsets % max_dim)[:, None]
/home/jesus.penalozaa/.local/lib/python3.9/site-packages/numpy/core/fromnumeric.py:43: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  result = getattr(asarray(obj), method)(*args, **kwds)
/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/scipy/sparse/_dia.py:338: RuntimeWarning: divide by zero encountered in remainder
  c = np.arange(num_rows, dtype=np.intc) - (offsets % max_dim)[:, None]
/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/scipy/sparse/_dia.py:338: RuntimeWarning: divide by zero encountered in remainder
  c = np.arange(num_rows, dtype=np.intc) - (offsets % max_dim)[:, None]
/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/scipy/sparse/_dia.py:338: RuntimeWarning: divide by zero encountered in remainder
  c = np.arange(num_rows, dtype=np.intc) - (offsets % max_dim)[:, None]
/home/jesus.penalozaa/.local/lib/python3.9/site-packages/numpy/core/fromnumeric.py:43: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  result = getattr(asarray(obj), method)(*args, **kwds)
/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/scipy/sparse/_dia.py:338: RuntimeWarning: divide by zero encountered in remainder
  c = np.arange(num_rows, dtype=np.intc) - (offsets % max_dim)[:, None]
/home/jesus.penalozaa/.local/lib/python3.9/site-packages/numpy/core/fromnumeric.py:43: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  result = getattr(asarray(obj), method)(*args, **kwds)
     1682492 [components_evaluation.py:classify_components_ep():243][84311] Component 25 is only active jointly with neighboring components. Space correlation calculation might be unreliable.
2022-02-07 22:42:58.214982: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9646 MB memory:  -> device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:db:00.0, compute capability: 7.5
2022-02-07 22:43:08.112345: I tensorflow/stream_executor/cuda/cuda_dnn.cc:366] Loaded cuDNN version 8201
2022-02-07 22:43:39.812630: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Running ptxas --version returned 32512
2022-02-07 22:43:39.925601: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: ptxas exited with non-zero error code 32512, output:
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
USING MODEL:/home/jesus.penalozaa/caiman_data/model/cnn_model.json
6/6 [==============================] - 68s 135ms/step
     1757567 [utilities.py:        detrend_df_f():349][84234] Background components not present. Results should not be interpreted as DF/F normalized but only as detrended.
     1760216 [stats.py:       df_percentile():219][84234] Percentile computation failed. Duplicating and trying again.
Oops!
Traceback (most recent call last):
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 183, in <module>
    main()
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 172, in main
    cnm.save(path_to_save)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/source_extraction/cnmf/cnmf.py", line 669, in save
    save_dict_to_hdf5(self.__dict__, filename)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 431, in save_dict_to_hdf5
    recursively_save_dict_contents_to_group(h5file, subdir, dic)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 526, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item.__dict__)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 511, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item)
  File "/blue/krauseeg/jesus.penalozaa/.conda/envs/caiman/lib/python3.9/site-packages/caiman/utils/utils.py", line 498, in recursively_save_dict_contents_to_group
    raise ValueError('Error while saving {}.'.format(key))
ValueError: Error while saving var_name_hdf5.

I have noticed a difference in the behavior of TensorFlow > 2.4, where it is required to export $CONDA_PREFIX/lib to have access to the cudatoolkit & cudnn installed using conda, so I am not sure if that is the reason why caiman is giving me the error.
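
The comment above does not say which variable receives `$CONDA_PREFIX/lib`. A common arrangement, assumed here, is that the directory is appended to `LD_LIBRARY_PATH`. The sketch below only checks whether that has been done in a given environment mapping; the function name is hypothetical and nothing in it is specific to CaImAn or TensorFlow.

```python
import os


def conda_lib_on_library_path(environ):
    """Return True if $CONDA_PREFIX/lib is listed in LD_LIBRARY_PATH.

    `environ` is a mapping such as os.environ. Using LD_LIBRARY_PATH is an
    assumption; the thread only says that $CONDA_PREFIX/lib is exported.
    """
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return False  # no conda environment is active
    wanted = os.path.normpath(os.path.join(prefix, "lib"))
    entries = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return any(os.path.normpath(entry) == wanted for entry in entries if entry)


print(conda_lib_on_library_path(os.environ))
```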

pgunn commented 2 years ago

You mention Caiman 1.6.X versions; I think those are all pretty old. Is that a typo?

jesusdpa1 commented 2 years ago

Hi, yes, sorry, it's the earlier 1.9.X versions.

pgunn commented 2 years ago

@jesusdpa1 I have improved the logging in utils/utils.py, which should give us a better idea of what's going on with your dataset; would you mind grabbing this from the dev branch and monkey-patching it into your install?

https://raw.githubusercontent.com/flatironinstitute/CaImAn/0d906e5471f0599a6278d2ff150029e4989954c5/caiman/utils/utils.py

jesusdpa1 commented 2 years ago

Sure! @pgunn, just to make sure I understand, you mean copying this file and replacing the one in caiman/utils/ before running cnmfe again?

pgunn commented 2 years ago

Yes.

jesusdpa1 commented 2 years ago

cnfme_17517666.log

@pgunn Here is the output with the changed utils file

pgunn commented 2 years ago

Sorry, could you do that again with this?

https://raw.githubusercontent.com/flatironinstitute/CaImAn/f60112b83070030cd2f213cb66e5224938458e74/caiman/utils/utils.py

jesusdpa1 commented 2 years ago

cnfme_17522924.log

With the updated utils

pgunn commented 2 years ago

Thanks! I see what's going on, and tomorrow I'll have a patch for it.

This line: ValueError: Error while saving numeric or string var_name_hdf5: assigned value b'mov' does not match intended value mov

is reminding me that when you serialise a string into hdf5, pulling it back out gets you bytes, not a Python string (which is why they don't compare cleanly). I don't think this is true for all versions of the h5py module.

Do you mind getting me the output, in your environment, of conda list h5py ? I'm curious what version you landed with that shows this problem.
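
The mismatch reported in the error message can be reproduced without HDF5 at all, because in Python 3 a `bytes` value never compares equal to a `str`. The sketch below uses two hypothetical helpers, `as_text` and `same_string`, to show one way a round-trip check could normalise both sides before comparing. It is an illustration, not the patch applied to `utils.py`.

```python
def as_text(value, encoding="utf-8"):
    """Return `value` as str, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def same_string(stored, intended):
    """Compare a value read back from storage with the value that was written."""
    return as_text(stored) == as_text(intended)


stored = b"mov"    # what the error message reports was read back from the file
intended = "mov"   # the Python string that was meant to be saved

print(stored == intended)             # False
print(same_string(stored, intended))  # True
```

When reading with h5py 3.x, `Dataset.asstr()` is another way to get `str` values back instead of `bytes`.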

jesusdpa1 commented 2 years ago

Thanks Pat! and of course,

# Name    Version    Build                     Channel
h5py      2.10.0     nompi_py39h98ba4bc_106    conda-forge

pgunn commented 2 years ago

Can you give this a try? I split out the code to save strings and had it explicitly save it as hdf5's string datatypes, which should hopefully solve the issue. Or at least one of the issues.

https://raw.githubusercontent.com/flatironinstitute/CaImAn/c7b38966ff2e71442592a3c41936cf68c15774fb/caiman/utils/utils.py

jesusdpa1 commented 2 years ago

Seems like the same error?

2022-02-10 00:21:34.700979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9648 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:da:00.0, compute capability: 7.5
2022-02-10 00:21:36.584633: I tensorflow/stream_executor/cuda/cuda_dnn.cc:366] Loaded cuDNN version 8201
2022-02-10 00:21:37.239519: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Running ptxas --version returned 32512
2022-02-10 00:21:37.300877: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: ptxas exited with non-zero error code 32512, output: 
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
USING MODEL:/home/jesus.penalozaa/caiman_data/model/cnn_model.json

1/1 [==============================] - ETA: 0s
1/1 [==============================] - 10s 10s/step
      550594 [utilities.py:        detrend_df_f():349][65454] Background components not present. Results should not be interpreted as DF/F normalized but only as detrended.
Traceback (most recent call last):
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 183, in <module>
    main()
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 172, in main
    cnm.save(path_to_save)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/source_extraction/cnmf/cnmf.py", line 669, in save
    save_dict_to_hdf5(self.__dict__, filename)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 431, in save_dict_to_hdf5
    recursively_save_dict_contents_to_group(h5file, subdir, dic)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 530, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item.__dict__)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 515, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 497, in recursively_save_dict_contents_to_group
    raise ValueError(f'Error while saving string {path + key}: assigned value {h5file[path + key][()]} does not match intended value {item}')
ValueError: Error while saving string /params/data/var_name_hdf5: assigned value b'mov' does not match intended value mov

I am attaching my cnmfe code just in case

caiman_cnmfe.txt

cnfme_17687257.log

pgunn commented 2 years ago

Let's give this a go: https://raw.githubusercontent.com/flatironinstitute/CaImAn/26761e194bc05db6490bef8e3ea89ff2cc866933/caiman/utils/utils.py Sorry for the fuss, we'll get to the bottom of this.

jesusdpa1 commented 2 years ago

No worries, and thanks again for dedicating time to go through this.

Here is the latest log report. This time it didn't create any hdf5 output.

2022-02-10 09:36:41.253278: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9646 MB memory:  -> device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:da:00.0, compute capability: 7.5
2022-02-10 09:36:55.614306: I tensorflow/stream_executor/cuda/cuda_dnn.cc:366] Loaded cuDNN version 8201
2022-02-10 09:37:38.525704: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Running ptxas --version returned 32512
2022-02-10 09:37:38.626641: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: ptxas exited with non-zero error code 32512, output: 
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
USING MODEL:/home/jesus.penalozaa/caiman_data/model/cnn_model.json

1/1 [==============================] - ETA: 0s
1/1 [==============================] - 117s 117s/step
     1011215 [utilities.py:        detrend_df_f():349][98686] Background components not present. Results should not be interpreted as DF/F normalized but only as detrended.
Traceback (most recent call last):
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 183, in <module>
    main()
  File "/home/jesus.penalozaa/GitHub/hpg_remote/scripts-python/1p/caiman_cnmfe.py", line 172, in main
    cnm.save(path_to_save)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/source_extraction/cnmf/cnmf.py", line 669, in save
    save_dict_to_hdf5(self.__dict__, filename)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 431, in save_dict_to_hdf5
    recursively_save_dict_contents_to_group(h5file, subdir, dic)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 530, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item.__dict__)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 515, in recursively_save_dict_contents_to_group
    recursively_save_dict_contents_to_group(h5file, path + key + '/', item)
  File "/blue/krauseeg/pypackages/CaImAn/caiman/utils/utils.py", line 497, in recursively_save_dict_contents_to_group
    raise ValueError(f'Error (v {h5py.__version__}) while saving string {path + key}: assigned value {h5file[path + key][()]} does not match intended value {item}')
ValueError: Error (v 3.6.0) while saving string /params/data/var_name_hdf5: assigned value b'mov' does not match intended value mov

cnfme_17719700.log

pgunn commented 2 years ago

Aha. There's our answer. Even though you have h5py 2.10 installed through conda, somehow you also have 3.6.0 installed through some other means and python's finding that first.

My guess is that you made this environment as intended, and then used pip to install additional packages, with one of the other packages needing a newer version of h5py and having pulled h5py up to that level. Does that sound right? If so, we can talk through our options for trying to fix things.
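
One way to check which copy of a package Python actually resolves, as opposed to what `conda list` reports, is to ask the interpreter directly. The sketch below uses only the standard library. The helper name `describe_package` is hypothetical, and the user-site check is included because the warnings earlier in this thread show numpy being loaded from `~/.local/lib/python3.9/site-packages`; whether h5py lives there too is an assumption.

```python
import importlib.util
import site
from importlib import metadata


def describe_package(name):
    """Report where `name` would be imported from and which version is installed.

    Assumes the import name and the distribution name are the same, which
    holds for h5py but not for every package.
    """
    spec = importlib.util.find_spec(name)  # locates the module without importing it
    origin = spec.origin if spec is not None else None
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    user_site = site.getusersitepackages()
    in_user_site = bool(origin) and origin.startswith(user_site)
    return {"origin": origin, "version": version, "in_user_site": in_user_site}


print(describe_package("h5py"))
```

Run inside the activated environment, this shows whether the resolved copy comes from the conda environment or from somewhere else on `sys.path`.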

jesusdpa1 commented 2 years ago

Yes! Thank you!! I wonder why that happened? I downloaded CaImAn from GitHub and followed these steps:

git clone https://github.com/flatironinstitute/CaImAn
cd CaImAn/
mamba env create -f environment.yml -n caiman
source activate caiman
pip install -e .

To uninstall h5py 3.6.0, I did the following:

ml conda
conda activate caiman
pip uninstall h5py

I ran CaImAn after that and got a result:

[image]

Thanks again for your help

pgunn commented 2 years ago

Happy to help.

In the future, if you like you can avoid the sources entirely and just use the conda-based install. It's a little simpler.