flatironinstitute / CaImAn

Computational toolbox for large scale Calcium Imaging Analysis, including movie handling, motion correction, source extraction, spike deconvolution and result visualization.
https://caiman.readthedocs.io
GNU General Public License v2.0

Input Array Dimension for concatenation size error #1364

Open kg3354 opened 1 month ago

kg3354 commented 1 month ago

Hi,

I am trying to use caiman to get an initialization file so that I can reuse it in the future. I am passing in a tif file generated from a sequence of avi files, and the error message that I get is:

Traceback (most recent call last):
  File "demo_fiola_pipeline.py", line 239, in <module>
    main()
  File "demo_fiola_pipeline.py", line 145, in main
    caiman_file = run_caiman_init(fnames_init, pw_rigid=True,
  File "C:\Users\Research\Desktop\trash\FIOLA\fiola\demo_initialize_calcium.py", line 161, in run_caiman_init
    cnm = cnm.fit(images)
  File "c:\users\research\desktop\trash\caiman\caiman\source_extraction\cnmf\cnmf.py", line 646, in fit
    self.update_temporal(Yr, use_init=False)
  File "c:\users\research\desktop\trash\caiman\caiman\source_extraction\cnmf\cnmf.py", line 896, in update_temporal
    self.estimates.g, self.estimates.YrA, self.estimates.lam = update_temporal_components(
  File "c:\users\research\desktop\trash\caiman\caiman\source_extraction\cnmf\temporal.py", line 203, in update_temporal_components
    Cin = np.vstack((Cin, fin))
  File "<__array_function__ internals>", line 5, in vstack
  File "C:\Users\Research\.conda\envs\fiola-t\lib\site-packages\numpy\core\shape_base.py", line 283, in vstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 5, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 500 and the array at index 1 has size 1
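
The last line can be reproduced in isolation: update_temporal_components apparently stacks the component traces Cin (one row per trace, one column per frame) on top of the background trace fin, and the two disagree on the number of frames. A minimal numpy sketch of that mismatch (the shapes here are illustrative, not taken from the actual run):

import numpy as np

Cin = np.zeros((10, 500))   # e.g. 10 temporal traces over 500 frames
fin = np.zeros((1, 1))      # a background trace with only 1 frame
np.vstack((Cin, fin))       # raises the ValueError quoted above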

I wonder if it is due to the way I construct the tif file. I made sure it is not colored, and here is the script I used to generate the tif file:

import os
import sys

import numpy as np
import tifffile as tiff

def combine_tiffs(input_paths, output_path):
    try:
        # Ensure the output file has a .tif extension
        # (str.rstrip strips a set of characters, not a suffix, so slice instead)
        if output_path.endswith(".tiff"):
            output_path = output_path[:-5] + ".tif"
        elif not output_path.endswith(".tif"):
            output_path += ".tif"

        total_frames = len(input_paths)
        print(f"Combining {total_frames} TIFF files into {output_path}")

        # Preallocate an array for the combined image using memory mapping
        first_image = tiff.imread(input_paths[0])
        image_shape = first_image.shape
        combined_shape = (total_frames, *image_shape)
        temp_file = 'temp_combined_image.dat'
        combined_image = np.memmap(temp_file, dtype=first_image.dtype, mode='w+', shape=combined_shape)
        print(f"Preallocated combined image shape: {combined_image.shape}")

        # Load and combine the images
        for idx, tiff_path in enumerate(input_paths):
            print(f"Adding TIFF: {tiff_path}")
            image = tiff.imread(tiff_path)
            combined_image[idx] = image

        # Flush the memory-mapped array to ensure all data is written
        combined_image.flush()

        # Generate metadata
        metadata = {
            'ImageDescription': f'{{"shape": [{total_frames}, {image_shape[0]}, {image_shape[1]}]}}',
            'Software': 'tifffile.py'
        }

        # Write the combined image to the output path with metadata using imwrite
        with tiff.TiffWriter(output_path, bigtiff=True) as tif:
            for idx in range(total_frames):
                tif.write(
                    combined_image[idx], 
                    photometric='minisblack', 
                    description=metadata['ImageDescription'],
                    software=metadata['Software']
                )

        # Remove the temporary file
        os.remove(temp_file)

        print(f"Combined TIFF image saved at {output_path}")

    except Exception as e:
        print(f"Error combining TIFF files: {e}", file=sys.stderr)
        sys.exit(1)

if __name__ == "__main__":
    output_path = 'result_7.tif'
    input_paths = [f'constructed_image_{i}.tiff' for i in range(700)]

    combine_tiffs(input_paths, output_path)
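
If all the frames fit in RAM, the same file can likely be produced more simply by letting tifffile write the whole stack in one call, avoiding the temporary memmap and the per-frame writes. A sketch, assuming every input frame has the same shape and dtype:

import numpy as np
import tifffile as tiff

# read every frame, stack along a new leading (time) axis, write once
frames = [tiff.imread(p) for p in input_paths]
stack = np.stack(frames, axis=0)   # shape: (frames, height, width)
tiff.imwrite(output_path, stack, bigtiff=True, photometric='minisblack')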

My script that calls caiman is:

        fnames =  '/result_7.tif'
        #fnames = folder + '/output_bw.tif'
        fr = 30                         # sample rate of the movie

        mode = 'calcium'                # 'voltage' or 'calcium' fluorescence indicator
        num_frames_init =   500        # number of frames used for initialization
        num_frames_total =  700        # estimated total number of frames for processing, this is used for generating matrix to store data
        offline_batch = 5               # number of frames for one batch to perform offline motion correction
        batch = 1                       # number of frames processed at the same time using the gpu
        flip = False                    # whether to flip signal to find spikes   
        detrend = False                 # whether to remove the slow trend in the fluorescence data
        dc_param = 0.9995               # DC blocker parameter for removing the slow trend in the fluorescence data. It is usually between
                                        # 0.99 and 1. A higher value removes less of the trend. No detrending is performed if detrend=False.
        do_deconvolve = False            # If True, perform spike detection for voltage imaging or deconvolution for calcium imaging.
        ms = [55, 55]                     # maximum shift in x and y axis respectively. Will not perform motion correction if None.
        center_dims = None              # template dimensions for motion correction. If None, the template will be the shape of the FOV
        hals_movie = 'hp_thresh'        # apply hals on the movie high-pass filtered and thresholded with 0 (hp_thresh); movie only high-pass filtered (hp); 
                                        # original movie (orig); no HALS needed if the input is from CaImAn (when init_method is 'caiman' or 'weighted_masks')
        n_split = 1                     # split neuron spatial footprints into n_split portion before performing matrix multiplication, increase the number when spatial masks are larger than 2GB
        nb = 1                          # number of background components
        trace_with_neg=True             # return trace with negative components (noise) if True; otherwise the trace is cutoff at 0
        lag = 5                         # lag for retrieving the online result.

        options = {
            'fnames': fnames,
            'fr': fr,
            'mode': mode, 
            'num_frames_init': num_frames_init, 
            'num_frames_total':num_frames_total,
            'offline_batch': offline_batch,
            'batch':batch,
            'flip': flip,
            'detrend': detrend,
            'dc_param': dc_param,            
            'do_deconvolve': do_deconvolve,
            'ms': ms,
            'hals_movie': hals_movie,
            'center_dims':center_dims, 
            'n_split': n_split,
            'nb' : nb, 
            'trace_with_neg':trace_with_neg, 
            'lag': lag}

        mov = cm.load(fnames, subindices=range(num_frames_init))
        fnames_init = fnames.split('.')[0] + '_init.tif'
        mov.save(fnames_init)

        print(mov.shape)
        # run caiman initialization. User might need to change the parameters 
        # inside the file to get good initialization result
        caiman_file = run_caiman_init(fnames_init, pw_rigid=False, max_shifts=ms, gnb=nb, rf=15, K=2, gSig=[3, 3])
        print(caiman_file)
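
One small robustness note on the snippet above: fnames.split('.')[0] truncates at the first dot anywhere in the path, so a name like result.v2.tif or a relative path like ./data/result_7.tif would break it. os.path.splitext only strips the final extension:

import os

base, _ = os.path.splitext(fnames)   # removes only the trailing '.tif'
fnames_init = base + '_init.tif'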

and the demo_initialize_calcium.py corresponding to run_caiman_init:

#!/usr/bin/env python

import cv2
import glob
import logging
import matplotlib.pyplot as plt
import numpy as np
import os
from time import time

try:
    cv2.setNumThreads(0)
except:
    pass

try:
    if __IPYTHON__:
        # this is used for debugging purposes only. allows to reload classes
        # when changed
        get_ipython().magic('load_ext autoreload')
        get_ipython().magic('autoreload 2')
except NameError:
    pass

import caiman as cm
from caiman.motion_correction import MotionCorrect
from caiman.source_extraction.cnmf import cnmf as cnmf
from caiman.source_extraction.cnmf import params as params
from caiman.utils.utils import download_demo
from caiman.summary_images import local_correlations_movie_offline
from caiman.source_extraction.cnmf.utilities import get_file_size

logging.basicConfig(format=
                    "%(relativeCreated)12d [%(filename)s:%(funcName)20s():%(lineno)s]"\
                    "[%(process)d] %(message)s",
                    level=logging.INFO)
#%%    
def run_caiman_init(fnames, pw_rigid = True, max_shifts=[6, 6], gnb=2, rf=15, K = 5, gSig = [4, 4]):
    """
    Run caiman for initialization.

    Parameters
    ----------
    fnames : string
        file name
    pw_rigid : bool, 
        flag to select rigid vs pw_rigid motion correction. The default is True.
    max_shifts: list
        maximum shifts allowed for x axis and y axis. The default is [6, 6].
    gnb : int
        number of background components. The default is 2.
    rf: int
        half-size of the patches in pixels. e.g., if rf=25, patches are 50x50. The default value is 15.
    K : int
        number of components per patch. The default is 5.
    gSig : list
        expected half size of neurons in pixels. The default is [4, 4].

    Returns
    -------
    output_file : string
        file with caiman output

    """
    c, dview, n_processes = cm.cluster.setup_cluster(
            backend='local', n_processes=None, single_thread=False)

    timing = {}
    timing['start'] = time()

    # dataset dependent parameters
    display_images = False

    fr = 30             # imaging rate in frames per second
    decay_time = 0.4    # length of a typical transient in seconds
    dxy = (2., 2.)      # spatial resolution in x and y in (um per pixel)
    # note the lower than usual spatial resolution here
    patch_motion_um = (100., 100.)  # patch size for non-rigid correction in um
    # motion correction parameters
    # start a new patch for pw-rigid motion correction every x pixels
    strides = tuple([int(a/b) for a, b in zip(patch_motion_um, dxy)])
    # overlap between patches (size of patch in pixels: strides+overlaps)
    overlaps = (24, 24)
    # maximum deviation allowed for patch with respect to rigid shifts
    max_deviation_rigid = 3

    mc_dict = {
        'fnames': fnames,
        'fr': fr,
        'decay_time': decay_time,
        'dxy': dxy,
        'pw_rigid': pw_rigid,
        'max_shifts': max_shifts,
        'strides': strides,
        'overlaps': overlaps,
        'max_deviation_rigid': max_deviation_rigid,
        'border_nan': 'copy',
    }

    opts = params.CNMFParams(params_dict=mc_dict)

    # Motion correction and memory mapping
    time_init = time()
    mc = MotionCorrect(fnames, dview=dview, **opts.get_group('motion'))
    mc.motion_correct(save_movie=True)
    border_to_0 = 0 if mc.border_nan == 'copy' else mc.border_to_0
    fname_new = cm.save_memmap(mc.mmap_file, base_name='memmap_', order='C',
                               border_to_0=border_to_0)  # exclude borders
    #
    Yr, dims, T = cm.load_memmap(fname_new)
    images = np.reshape(Yr.T, [T] + list(dims), order='F')
    #  restart cluster to clean up memory
    cm.stop_server(dview=dview)
    c, dview, n_processes = cm.cluster.setup_cluster(
        backend='local', n_processes=None, single_thread=False)
    #
    f_F_mmap = mc.mmap_file[0]
    Cns = local_correlations_movie_offline(f_F_mmap,
                                       remove_baseline=True, window=1000, stride=1000,
                                       winSize_baseline=100, quantil_min_baseline=10,
                                       dview=dview)
    Cn = Cns.max(axis=0)
    Cn[np.isnan(Cn)] = 0
    # if display_images: 

    plt.imshow(Cn,vmax=0.5)
    #   parameters for source extraction and deconvolution
    p = 1                    # order of the autoregressive system
    merge_thr = 0.85         # merging threshold, max correlation allowed
    stride_cnmf = 6          # amount of overlap between the patches in pixels
    # initialization method (if analyzing dendritic data using 'sparse_nmf')
    method_init = 'greedy_roi'
    ssub = 2                     # spatial subsampling during initialization
    tsub = 2                     # temporal subsampling during initialization

    # parameters for component evaluation
    opts_dict = {'fnames': fnames,
                 'p': p,
                 'fr': fr,
                 'nb': gnb,
                 'rf': rf,
                 'K': K,
                 'gSig': gSig,
                 'stride': stride_cnmf,
                 'method_init': method_init,
                 'rolling_sum': True,
                 'merge_thr': merge_thr,
                 'n_processes': n_processes,
                 'only_init': True,
                 'ssub': ssub,
                 'tsub': tsub}

    opts.change_params(params_dict=opts_dict);
    #  RUN CNMF ON PATCHES
    cnm = cnmf.CNMF(n_processes, params=opts, dview=dview)
    cnm = cnm.fit(images)
    #  COMPONENT EVALUATION
    min_SNR = 1.0  # signal to noise ratio for accepting a component
    rval_thr = 0.75  # space correlation threshold for accepting a component
    cnn_thr = 0.3  # threshold for CNN based classifier
    cnn_lowest = 0.0 # neurons with cnn probability lower than this value are rejected

    cnm.params.set('quality', {'decay_time': decay_time,
                           'min_SNR': min_SNR,
                           'rval_thr': rval_thr,
                           'use_cnn': False,
                           'min_cnn_thr': cnn_thr,
                           'cnn_lowest': cnn_lowest})
    cnm.estimates.evaluate_components(images, cnm.params, dview=dview)
    print(len(cnm.estimates.idx_components))
    time_patch = time()
    #
    if display_images:
        cnm.estimates.plot_contours(img=Cn, idx=cnm.estimates.idx_components)
    #
    cnm.estimates.select_components(use_object=True)
    # RE-RUN seeded CNMF on accepted patches to refine and perform deconvolution
    cnm2 = cnm.refit(images, dview=dview)
    time_end = time() 
    print(time_end- time_init)
    #  COMPONENT EVALUATION
    min_SNR = 2  # signal to noise ratio for accepting a component
    rval_thr = 0.85  # space correlation threshold for accepting a component
    cnn_thr = 0.15  # threshold for CNN based classifier
    cnn_lowest = 0.0 # neurons with cnn probability lower than this value are rejected

    cnm2.params.set('quality', {'decay_time': decay_time,
                               'min_SNR': min_SNR,
                               'rval_thr': rval_thr,
                               'use_cnn': False,
                               'min_cnn_thr': cnn_thr,
                               'cnn_lowest': cnn_lowest})
    cnm2.estimates.evaluate_components(images, cnm2.params, dview=dview)
    print(len(cnm2.estimates.idx_components))

    #  PLOT COMPONENTS
    cnm2.estimates.plot_contours(img=Cn, idx=cnm2.estimates.idx_components)
    #  VIEW TRACES (accepted and rejected)
    if display_images:
        cnm2.estimates.view_components(images, img=Cn,
                                      idx=cnm2.estimates.idx_components)
        cnm2.estimates.view_components(images, img=Cn,
                                      idx=cnm2.estimates.idx_components_bad)
    # update object with selected components
    cnm2.estimates.select_components(use_object=True)
    # Extract DF/F values
    cnm2.estimates.detrend_df_f(quantileMin=8, frames_window=250)
    # Show final traces
    #cnm2.estimates.view_components(img=Cn)
    #
    cnm2.mmap_F = f_F_mmap 
    cnm2.estimates.Cn = Cn
    cnm2.estimates.template = mc.total_template_rig
    cnm2.estimates.shifts = mc.shifts_rig
    save_name = cnm2.mmap_file[:-5] + '_caiman_init.hdf5'

    timing['end'] = time()
    print(timing)
    cnm2.save(save_name)
    print(save_name)
    output_file = save_name
    # STOP CLUSTER and clean up log files
    cm.stop_server(dview=dview)
    log_files = glob.glob('*_LOG_*')
    for log_file in log_files:
        os.remove(log_file)
    plt.close('all')        
    return output_file

Sorry for throwing out chunks of code. I am new to caiman, and any help would be really appreciated!

Your setup:

  1. Operating System (Linux, MacOS, Windows): Windows
  2. Hardware type (x86, ARM..) and RAM: x64
  3. Python Version (e.g. 3.9): 3.8
  4. Caiman version (e.g. 1.9.12): v1.9.13
  5. How you installed Caiman (pure conda, conda + compile, colab, ..): conda
  6. Details:
pgunn commented 2 weeks ago

Hello, sorry for the late reply; I was out of the country for a conference.

There are several ways you can do this; I believe you probably can just pass in the list of original files instead of merging them, or if the filenames don't actually represent different sessions, you can do a flatter merge that doesn't preserve the distinctiveness of different filenames.

The former is probably better if the data came from multiple sessions (in which case you'll want to look at the demos where we cover how multisession registration works).
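
As a minimal sketch of the first approach (the file names are the hypothetical ones from the script above; cm.load_movie_chain, like the fnames parameter elsewhere in CaImAn, accepts a list of paths):

import caiman as cm

# let caiman concatenate the files along the time axis itself,
# instead of merging them into a single TIFF beforehand
input_paths = [f'constructed_image_{i}.tiff' for i in range(700)]
mov = cm.load_movie_chain(input_paths)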

Cheers, Pat

kg3354 commented 1 week ago

Hi Pat, thanks for reaching out! I solved that issue; now I am facing another one.

I am using tensorflow 2.4.1, CaImAn version 1.9.13, and python 3.8.0, and I am trying to use the GPU during CaImAn motion correction. I made sure to import pycuda.autoinit and used

    motion_opts = opts.get_group('motion')
    motion_opts['use_cuda'] = True

    print(motion_opts)

    logging.info(motion_opts)

    mc = MotionCorrect(fnames, dview=dview, **motion_opts)

The MotionCorrect call in CaImAn raises a cuMemAlloc failed: initialization error. The detailed logs are:

        6104 [motion_correction.py:motion_correction_piecewise():3218][112128] ** Starting parallel motion correction **
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/research/anaconda3/envs/fiola/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/research/anaconda3/envs/fiola/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 3124, in tile_and_correct_wrapper
    mc[count], total_shift, start_step, xy_grid = tile_and_correct(img, template, strides, overlaps, max_shifts,
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 2199, in tile_and_correct
    rigid_shts, sfr_freq, diffphase = register_translation(
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 1700, in register_translation
    image_gpu = gpuarray.to_gpu(np.stack((src_image, target_image)).astype(np.complex128))
  File "/home/research/anaconda3/envs/fiola/lib/python3.8/site-packages/pycuda/gpuarray.py", line 1294, in to_gpu
    result = GPUArray(ary.shape, ary.dtype, allocator, strides=_compact_strides(ary))
  File "/home/research/anaconda3/envs/fiola/lib/python3.8/site-packages/pycuda/gpuarray.py", line 268, in __init__
    self.gpudata = self.allocator(self.size * self.dtype.itemsize)
pycuda._driver.LogicError: cuMemAlloc failed: initialization error
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "simple.py", line 223, in <module>
    main()
  File "simple.py", line 55, in main
    caiman_file = run_caiman_init(fnames_init, pw_rigid=True, max_shifts=ms, gnb=nb, rf=15, K=4, gSig=[3, 3])
  File "simple.py", line 103, in run_caiman_init
    mc.motion_correct(save_movie=True)
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 273, in motion_correct
    self.motion_correct_pwrigid(template=template, save_movie=save_movie)
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 372, in motion_correct_pwrigid
    self.motion_correct_rigid()
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 317, in motion_correct_rigid
    _fname_tot_rig, _total_template_rig, _templates_rig, _shifts_rig = motion_correct_batch_rigid(
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 2904, in motion_correct_batch_rigid
    fname_tot_rig, res_rig = motion_correction_piecewise(fname, splits, strides=None, overlaps=None,
  File "/home/research/research/Corelink_FIOLA/k8s/init/motion_correction.py", line 3220, in motion_correction_piecewise
    res = dview.map(tile_and_correct_wrapper,pars)
  File "/home/research/anaconda3/envs/fiola/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/research/anaconda3/envs/fiola/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
pycuda._driver.LogicError: cuMemAlloc failed: initialization error

What might have caused this error? Is there an additional package I need to include?

Thanks for your time!

Kaiwen Guo

pgunn commented 1 week ago

What kind of GPU are you running it on?

kg3354 commented 1 week ago

I am running it on an Nvidia GeForce RTX 3070.

Thanks

Kaiwen Guo


pgunn commented 1 week ago

Actually, I just spotted something above that surprised me; where and why did you import pycuda.autoinit?

kg3354 commented 1 week ago

pycuda.autoinit is imported in the main script:

import caiman as cm
import logging
import matplotlib.pyplot as plt
import numpy as np
import os
import skcuda.cublas
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ["TF_XLA_FLAGS"] = "--tf_xla_enable_xla_devices"
import pyximport
pyximport.install()
import scipy
from tensorflow.python.client import device_lib
from time import time
import pickle
from fiola.fiolaparams import fiolaparams
from fiola.fiola import FIOLA
from fiola.utilities import download_demo, load, to_2D, movie_iterator

import tensorflow as tf
tf.debugging.set_log_device_placement(True)

physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)
else:
    print("No GPU device found")

logging.basicConfig(format="%(relativeCreated)12d [%(filename)s:%(funcName)20s():%(lineno)s][%(process)d] %(message)s",
                    level=logging.INFO)
logging.info(device_lib.list_local_devices())  # if GPU is not detected, try to reinstall tensorflow with pip install tensorflow==2.5.0
startTime = time()
import cv2
import glob
from time import time
from caiman.motion_correction import MotionCorrect
from caiman.source_extraction.cnmf import cnmf as cnmf
from caiman.source_extraction.cnmf import params as params
from caiman.utils.utils import download_demo
from caiman.summary_images import local_correlations_movie_offline
from caiman.source_extraction.cnmf.utilities import get_file_size
import pycuda.autoinit

Without pycuda.autoinit, I get a cuInit failed: initialization error, as follows:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 1675, in register_translation
    cudactx # type: ignore
NameError: name 'cudactx' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/research/anaconda3/envs/caimangpu/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/research/anaconda3/envs/caimangpu/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 3116, in tile_and_correct_wrapper
    mc[count], total_shift, start_step, xy_grid = tile_and_correct(img, template, strides, overlaps, max_shifts,
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 2188, in tile_and_correct
    rigid_shts, sfr_freq, diffphase = register_translation(
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 1677, in register_translation
    init_cuda_process()
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 1389, in init_cuda_process
    cudadrv.init()
pycuda._driver.LogicError: cuInit failed: initialization error
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "generate_init_result.py", line 324, in <module>
    main()
  File "generate_init_result.py", line 271, in main
    caiman_file = run_caiman_init(fnames_init, pw_rigid=True, max_shifts=ms, gnb=nb, rf=15, K=4, gSig=[3, 3])
  File "generate_init_result.py", line 92, in run_caiman_init
    mc.motion_correct(save_movie=True)
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 259, in motion_correct
    self.motion_correct_pwrigid(template=template, save_movie=save_movie)
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 358, in motion_correct_pwrigid
    self.motion_correct_rigid()
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 303, in motion_correct_rigid
    _fname_tot_rig, _total_template_rig, _templates_rig, _shifts_rig = motion_correct_batch_rigid(
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 2893, in motion_correct_batch_rigid
    fname_tot_rig, res_rig = motion_correction_piecewise(fname, splits, strides=None, overlaps=None,
  File "/home/research/research/CaImAn/caiman/motion_correction.py", line 3212, in motion_correction_piecewise
    res = dview.map(tile_and_correct_wrapper,pars)
  File "/home/research/anaconda3/envs/caimangpu/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/research/anaconda3/envs/caimangpu/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
pycuda._driver.LogicError: cuInit failed: initialization error
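
Both failure modes are consistent with one underlying pattern: CUDA gets initialized in the parent process before the multiprocessing pool forks its workers (pycuda.autoinit does this explicitly, and the tensorflow device queries in the script below also touch the driver at import time), and a CUDA context does not survive a fork, so the children fail as soon as they call into the driver. A minimal sketch of that pattern, independent of CaImAn:

import multiprocessing as mp

import numpy as np
import pycuda.autoinit             # initializes CUDA in the parent process
import pycuda.gpuarray as gpuarray

def worker(_):
    # runs in a forked child: the context inherited from the parent is not
    # valid here, so the allocation raises a LogicError like the ones above
    return gpuarray.to_gpu(np.ones(4)).get()

if __name__ == '__main__':
    with mp.Pool(2) as pool:
        print(pool.map(worker, range(2)))

If that is the cause, keeping all GPU initialization out of the parent until after the cluster is set up, or using a spawn-based start method, would be the directions to test.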

Just for reference, the entire piece of code is:

#!/usr/bin/env python
import caiman as cm
import logging
import matplotlib.pyplot as plt
import numpy as np
import os
import skcuda.cublas
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ["TF_XLA_FLAGS"] = "--tf_xla_enable_xla_devices"
import pyximport
pyximport.install()
import scipy
from tensorflow.python.client import device_lib
from time import time
import pickle
from fiola.fiolaparams import fiolaparams
from fiola.fiola import FIOLA
from fiola.utilities import download_demo, load, to_2D, movie_iterator

import tensorflow as tf
tf.debugging.set_log_device_placement(True)

physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)
else:
    print("No GPU device found")

logging.basicConfig(format="%(relativeCreated)12d [%(filename)s:%(funcName)20s():%(lineno)s][%(process)d] %(message)s",
                    level=logging.INFO)
logging.info(device_lib.list_local_devices())  # if GPU is not detected, try to reinstall tensorflow with pip install tensorflow==2.5.0
startTime = time()
import cv2
import glob
from time import time
from caiman.motion_correction import MotionCorrect
from caiman.source_extraction.cnmf import cnmf as cnmf
from caiman.source_extraction.cnmf import params as params
from caiman.utils.utils import download_demo
from caiman.summary_images import local_correlations_movie_offline
from caiman.source_extraction.cnmf.utilities import get_file_size
#import pycuda.autoinit

#%%
def run_caiman_init(fnames, pw_rigid=True, max_shifts=[6, 6], gnb=2, rf=15, K=5, gSig=[4, 4]):
    c, dview, n_processes = cm.cluster.setup_cluster(
        backend='local', n_processes=4, single_thread=False)  # Adjust n_processes as needed

    timing = {}
    timing['start'] = time()

    # dataset dependent parameters
    display_images = False

    fr = 30  # imaging rate in frames per second
    decay_time = 0.4  # length of a typical transient in seconds
    dxy = (2., 2.)  # spatial resolution in x and y in (um per pixel)
    patch_motion_um = (100., 100.)  # patch size for non-rigid correction in um
    strides = tuple([int(a/b) for a, b in zip(patch_motion_um, dxy)])
    overlaps = (24, 24)
    max_deviation_rigid = 3

    mc_dict = {
        'fnames': fnames,
        'fr': fr,
        'decay_time': decay_time,
        'dxy': dxy,
        'pw_rigid': pw_rigid,
        'max_shifts': max_shifts,
        'strides': strides,
        'overlaps': overlaps,
        'max_deviation_rigid': max_deviation_rigid,
        'border_nan': 'copy',
    }

    opts = params.CNMFParams(params_dict=mc_dict)

    # Motion correction and memory mapping
    time_init = time()
    motion_opts = opts.get_group('motion')
    motion_opts['use_cuda'] = True

    print(motion_opts)

    logging.info(motion_opts)

    mc = MotionCorrect(fnames, dview=dview, **motion_opts)

    mc.motion_correct(save_movie=True)

    logging.info('Finished mc, before cm.save_memmap')
    border_to_0 = 0 if mc.border_nan == 'copy' else mc.border_to_0
    fname_new = cm.save_memmap(mc.mmap_file, base_name='memmap_', order='C', border_to_0=border_to_0)  # exclude borders
    logging.info('Finished exclude borders')

    Yr, dims, T = cm.load_memmap(fname_new)
    images = np.reshape(Yr.T, [T] + list(dims), order='F')
    logging.info('Finished reshaping borders')

    # Restart cluster to clean up memory
    cm.stop_server(dview=dview)
    c, dview, n_processes = cm.cluster.setup_cluster(backend='local', n_processes=4, single_thread=False)  # Adjust n_processes as needed
    logging.info('Finished restarting cluster')

    f_F_mmap = mc.mmap_file[0]
    Cns = local_correlations_movie_offline(f_F_mmap, remove_baseline=True, window=1000, stride=1000, winSize_baseline=100, quantil_min_baseline=10, dview=dview)
    logging.info('Finished local_correlations_movie_offline')
    Cn = Cns.max(axis=0)
    Cn[np.isnan(Cn)] = 0
    plt.imshow(Cn, vmax=0.5)

    # Parameters for source extraction and deconvolution
    p = 1
    merge_thr = 0.85
    stride_cnmf = 6
    method_init = 'greedy_roi'
    ssub = 2
    tsub = 2

    opts_dict = {'fnames': fnames,
                 'p': p,
                 'fr': fr,
                 'nb': gnb,
                 'rf': rf,
                 'K': K,
                 'gSig': gSig,
                 'stride': stride_cnmf,
                 'method_init': method_init,
                 'rolling_sum': True,
                 'merge_thr': merge_thr,
                 'n_processes': n_processes,
                 'only_init': True,
                 'ssub': ssub,
                 'tsub': tsub}

    opts.change_params(params_dict=opts_dict)
    cnm = cnmf.CNMF(n_processes, params=opts, dview=dview)
    cnm = cnm.fit(images)
    logging.info('Finished CNFM on patches')

    # Component evaluation
    min_SNR = 1.0
    rval_thr = 0.75
    cnn_thr = 0.3
    cnn_lowest = 0.0

    cnm.params.set('quality', {'decay_time': decay_time,
                               'min_SNR': min_SNR,
                               'rval_thr': rval_thr,
                               'use_cnn': False,
                               'min_cnn_thr': cnn_thr,
                               'cnn_lowest': cnn_lowest})
    cnm.estimates.evaluate_components(images, cnm.params, dview=dview)
    print(len(cnm.estimates.idx_components))
    time_patch = time()

    if display_images:
        cnm.estimates.plot_contours(img=Cn, idx=cnm.estimates.idx_components)

    cnm.estimates.select_components(use_object=True)
    cnm2 = cnm.refit(images, dview=dview)
    time_end = time()
    print('Total time until cnm refit is: ')
    print(time_end - time_init)

    min_SNR = 2
    rval_thr = 0.85
    cnn_thr = 0.15
    cnn_lowest = 0.0

    cnm2.params.set('quality', {'decay_time': decay_time,
                                'min_SNR': min_SNR,
                                'rval_thr': rval_thr,
                                'use_cnn': False,
                                'min_cnn_thr': cnn_thr,
                                'cnn_lowest': cnn_lowest})
    cnm2.estimates.evaluate_components(images, cnm2.params, dview=dview)
    print(len(cnm2.estimates.idx_components))

    cnm2.estimates.plot_contours(img=Cn, idx=cnm2.estimates.idx_components)

    if display_images:
        cnm2.estimates.view_components(images, img=Cn, idx=cnm2.estimates.idx_components)
        cnm2.estimates.view_components(images, img=Cn, idx=cnm2.estimates.idx_components_bad)

    cnm2.estimates.select_components(use_object=True)
    cnm2.estimates.detrend_df_f(quantileMin=8, frames_window=250)
    cnm2.mmap_F = f_F_mmap
    cnm2.estimates.Cn = Cn
    cnm2.estimates.template = mc.total_template_rig
    cnm2.estimates.shifts = mc.shifts_rig
    save_name = cnm2.mmap_file[:-5] + '_caiman_init.hdf5'

    timing['end'] = time()
    print(timing)
    cnm2.save(save_name)
    print(save_name)
    output_file = save_name

    cm.stop_server(dview=dview)
    log_files = glob.glob('*_LOG_*')
    for log_file in log_files:
        os.remove(log_file)
    plt.close('all')
    return output_file

def main():
    mode = 'calcium'
    if mode == 'calcium':
        folder = os.getenv('MOVIE_FOLDER', '/usr/src/app')
        fnames = folder + '/test_sub.tif'
        fr = 30
        num_frames_init = 2000
        num_frames_total = 30000
        offline_batch = 100
        batch = 100
        flip = False
        detrend = False
        dc_param = 0.9995
        do_deconvolve = True
        ms = [3, 3]
        center_dims = None
        hals_movie = 'hp_thresh'
        n_split = 1
        nb = 1
        trace_with_neg = True
        lag = 5

        options = {
            'fnames': fnames,
            'fr': fr,
            'mode': mode,
            'num_frames_init': num_frames_init,
            'num_frames_total': num_frames_total,
            'offline_batch': offline_batch,
            'batch': batch,
            'flip': flip,
            'detrend': detrend,
            'dc_param': dc_param,
            'do_deconvolve': do_deconvolve,
            'ms': ms,
            'hals_movie': hals_movie,
            'center_dims': center_dims,
            'n_split': n_split,
            'nb': nb,
            'trace_with_neg': trace_with_neg,
            'lag': lag
        }

        mov = cm.load(fnames, subindices=range(num_frames_init))
        fnames_init = fnames.split('.')[0] + '_init.tif'
        mov.save(fnames_init)

        print(tf.test.gpu_device_name())

        caiman_file = run_caiman_init(fnames_init, pw_rigid=True, max_shifts=ms, gnb=nb, rf=15, K=4, gSig=[3, 3])

if __name__ == "__main__":
    main()

Thank you for your time

Kaiwen Guo

pgunn commented 1 week ago

I'm not that familiar with fiola; I've heard of it but never used it.

Can you start debugging by trying to run the notebooks included with caiman? Knowing if those work will help us figure out where the issue starts.

kg3354 commented 1 week ago

Hi Pat,

I tried debugging using the caiman notebooks, and different errors arise.

The commands I followed are

mamba create -n gpu5 -c conda-forge caiman
conda activate gpu5
caimanmanager install

I was able to pass all the tests in CPU mode. Then I used

conda update opencv 
conda install -c conda-forge pycuda

and set use_cuda to True in demo_pipeline.py. The new error message I get is:

ModuleNotFoundError: No module named 'skcuda'
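
That error is itself informative: it shows CaImAn's use_cuda code path imports scikit-cuda on top of pycuda, so a quick check that a GPU environment is complete is that both imports resolve:

# both of these must import before use_cuda motion correction can run
import pycuda.driver  # noqa: F401
import skcuda         # noqa: F401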

My fix then was to get the newest skcuda package from its GitHub repo:

git clone https://github.com/lebedov/scikit-cuda.git
cd scikit-cuda
pip install .

And when I run demo_pipeline.py again, it says:

pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/research/caiman_data/demos/general/demo_pipeline.py", line 202, in <module>
    main()
  File "/home/research/caiman_data/demos/general/demo_pipeline.py", line 85, in main
    mc.motion_correct(save_movie=True)
  File "/home/research/miniconda3/envs/gpu5/lib/python3.11/site-packages/caiman/motion_correction.py", line 244, in motion_correct
    self.motion_correct_pwrigid(template=template, save_movie=save_movie)
  File "/home/research/miniconda3/envs/gpu5/lib/python3.11/site-packages/caiman/motion_correction.py", line 343, in motion_correct_pwrigid
    self.motion_correct_rigid()
  File "/home/research/miniconda3/envs/gpu5/lib/python3.11/site-packages/caiman/motion_correction.py", line 288, in motion_correct_rigid
    _fname_tot_rig, _total_template_rig, _templates_rig, _shifts_rig = motion_correct_batch_rigid(
                                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/research/miniconda3/envs/gpu5/lib/python3.11/site-packages/caiman/motion_correction.py", line 2844, in motion_correct_batch_rigid
    fname_tot_rig, res_rig = motion_correction_piecewise(fname, splits, strides=None, overlaps=None,
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/research/miniconda3/envs/gpu5/lib/python3.11/site-packages/caiman/motion_correction.py", line 3149, in motion_correction_piecewise
    res = dview.map(tile_and_correct_wrapper,pars)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/research/miniconda3/envs/gpu5/lib/python3.11/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/research/miniconda3/envs/gpu5/lib/python3.11/multiprocessing/pool.py", line 774, in get
    raise self._value

I also tried to install CUDA first, then use pip to install pycuda, but it gives me the same error.

pgunn commented 1 week ago

Hello, Caiman itself can use the GPU without needing additional software installs; at most you might need to sidegrade the tensorflow version using conda.

But I see that fiola uses pycuda.

One thing to try is to use conda versions of all these packages when possible - try very hard to avoid pip - and to use GPU/CUDA-enabled builds of all of them. That will prevent version incompatibilities.
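
A quick sanity check after building such an environment is to confirm that the conda-installed tensorflow actually sees the device, using the same call the script above already makes:

import tensorflow as tf

# a working GPU install prints at least one PhysicalDevice of type 'GPU'
print(tf.config.list_physical_devices('GPU'))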

kushalkolar commented 5 days ago
pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle

My guess is this is deprecated.

I'm a bit confused: what is your use case? scikit-cuda is very old and not maintained AFAIK; the last release was ~10 years ago IIRC. I would be surprised if tensorflow v2 works with scikit-cuda.

If you want to use FIOLA for online CNMF(E), it does not use scikit-cuda either.

kg3354 commented 5 days ago

Hi Kushal,

I am currently only trying to use the GPU for the motion correction part of CaImAn's demo_pipeline.py. The scikit-cuda error pops up when I install pycuda via conda and set use_cuda to True. The FIOLA part works well for now; it's just that motion correction on the CPU takes a lot of time, and CaImAn seems to support the GPU for that section.

I'm also at NYU btw :)

Thanks, Kaiwen Guo

kushalkolar commented 5 days ago

If you just want GPU motion correction, I'd strongly recommend jnormcorre; it's under active development: https://github.com/apasarkar/jnormcorre

The GPU-accelerated motion correction in caiman is very old.