"Bad file descriptor" error during processing

programerkotik commented 1 month ago

Your setup:

Operating System: Ubuntu 20.04.6 LTS
Hardware type and RAM: Intel Xeon Gold CPU 2 processors, 36 cores, 72 threads + 4 NVIDIA cards
Python Version: 3.11.8
Caiman version: 1.10.4
Which demo exhibits the problem (if applicable):
How you installed Caiman: conda pure
Details:

I have been running the CNMFe in a loop for all the data I have. I'm not sure if this is the best approach for batch processing, but it has worked for me for a while:

    fname_new = cm.save_memmap([ci_movie_path], base_name=f'memmap_{save_name}',
                                order='C', border_to_0=0, dview=cluster)

    # load memory mappable file
    Yr, dims, T = cm.load_memmap(fname_new)
    ci_movie = Yr.T.reshape((T,) + dims, order='F')

    # get parameters for cnmfe
    cnmfe_params = get_params_cnmfe(params_mc, gSig=np.array([4, 4]), gSiz= 2*np.array([4, 4])+1,  rf=24, stride_cnmf=8,  min_pnr=7, min_SNR=3.0, SNR_lowest=0.5)

    # run cnmfe
    cnmfe_model = cnmf.CNMF(n_processes=n_processes, 
                        dview=cluster, 
                        params=cnmfe_params, center_psf=True, method_init='corr_pnr', only_init_patch=True)

    cnmfe_model.fit(ci_movie)

However, I started getting a 'Bad file descriptor' error for the memmap files. The file is saved and I can find it in the folder, but loading it fails with "Bad file descriptor". Are there any limitations on how these memmap files are saved that could cause this error?

pgunn commented 1 month ago

Can you get me more context for the error? Perhaps a screenshot or a cut'n'paste of that and the surrounding text?

programerkotik commented 1 month ago

I run this inside the loop:

try:
    cnmfe_model, ci_movie, fname_new, correlation_image, peak_to_noise_ratio, dims, total_components, num_accepted, num_rejected = analyze_data(path, save_dir_docs, fps=FPS, motion_correct=True, plot=True)

except Exception as e:
    logging.error(f'Error processing {session} {mouse}: {str(e)}')

And the analyze_data is here:

def analyze_data(ci_movie_path, save_dir_docs, fps=30, motion_correct=False, plot=False):
    """
    Analyze calcium imaging data using CNMF-E algorithm.

    Parameters:
        ci_movie_path (str): Path to the calcium imaging movie file.
        save_dir_docs (str): Directory to save the technical documentation files.
        motion_correct (bool): Flag indicating whether to perform motion correction. Default is False.
        plot (bool): Flag indicating whether to plot the rigid shifts. Default is False.

    Returns:
        cnmfe_model: The CNMF-E model object.
        ci_movie: The calcium imaging movie.
        fname_new: The filename of the memory-mappable file.
        correlation_image: The correlation image.
        peak_to_noise_ratio: The peak to noise ratio movie.
        dims: The dimensions of the calcium imaging movie.
        total_components: The total number of components.
        num_accepted: The number of accepted components.
        num_rejected: The number of rejected components.
    """
    logging.info('Analyzing data...')
    # Get the name of the file
    save_name = os.path.basename(ci_movie_path).split('.')[0]

    # Perform setup for caiman
    cluster, n_processes = general_setup()

    # Get parameters for motion correction
    params_mc = get_params_mot_corr(ci_movie_path, frate=fps)

    # Perform motion correction if True
    if motion_correct:

        # do motion correction rigid
        motion_corrector = MotionCorrect(ci_movie_path, dview=cluster, **params_mc.get_group('motion'))
        motion_corrector.motion_correct(save_movie=True)

        # Save motion corrected movie as memory-mappable file
        pw_rigid = params_mc.get('motion', 'pw_rigid')  # Check if motion correction is rigid or piecewise rigid
        fname_mc = motion_corrector.fname_tot_rig  # Get the filename of the motion corrected movie

        if pw_rigid:
            # If motion correction is piecewise rigid, calculate the maximum absolute shifts in x and y directions
            border_px = np.ceil(np.maximum(np.max(np.abs(motion_corrector.x_shifts_els)),
                                           np.max(np.abs(motion_corrector.y_shifts_els)))).astype(int)
        else:
            # If motion correction is rigid, calculate the maximum absolute shifts
            border_px = np.ceil(np.max(np.abs(motion_corrector.shifts_rig))).astype(int)

        border_nan = params_mc.get('motion', 'border_nan')  # Get the border_nan parameter

        border_px = 0 if border_nan == 'copy' else border_px  # Set border_px to 0 if border_nan is 'copy'

        fname_new = cm.save_memmap(fname_mc, base_name=f'memmap_{save_name}', order='C', border_to_0=border_px)  # Save memory-mappable file

    else:  # if no motion correction, just memory map the file
        # save memory file to save_path

        fname_new = cm.save_memmap([ci_movie_path], base_name=f'memmap_{save_name}',
                                order='C', border_to_0=0, dview=cluster)

    # load memory mappable file
    Yr, dims, T = cm.load_memmap(fname_new)
    ci_movie = Yr.T.reshape((T,) + dims, order='F')

    # get parameters for cnmfe
    cnmfe_params = get_params_cnmfe(params_mc, gSig=np.array([4, 4]), gSiz= 2*np.array([4, 4])+1,  rf=24, stride_cnmf=8,  min_pnr=7, min_SNR=3.0, SNR_lowest=0.5)

    # run cnmfe
    cnmfe_model = cnmf.CNMF(n_processes=n_processes, 
                        dview=cluster, 
                        params=cnmfe_params, center_psf=True, method_init='corr_pnr', only_init_patch=True)

    cnmfe_model.fit(ci_movie)

    if plot:
        plot_rigid_shifts(motion_corrector, save_dir_docs, save_name)

    logging.info('CNMF-E analysis completed.')

    quality_params = {'use_cnn' : False,
                    'rval_thr' : 0.85,
                    'SNR_lowest' : 2.0,
                    'rval_lowest' : 0.3,
                    'min_SNR' : 3.0,
                    }

    cnmfe_model.params.change_params(params_dict=quality_params)

    cnmfe_model.estimates.evaluate_components(ci_movie, cnmfe_model.params)

    total_components = len(cnmfe_model.estimates.C)
    num_accepted = len(cnmfe_model.estimates.idx_components)
    num_rejected = len(cnmfe_model.estimates.idx_components_bad)

    # Compute some summary images (correlation and peak to noise)
    gsig_tmp = (3,3)
    correlation_image, peak_to_noise_ratio = cm.summary_images.correlation_pnr(ci_movie[::max(T//1000, 1)], # subsample if needed
                                                                            gSig=gsig_tmp[0], # used for filter
                                                                            swap_dim=False) # change swap dim if output looks weird, it is a problem with tiffile

    cm.stop_server(dview=cluster)

    return cnmfe_model, ci_movie, fname_new, correlation_image, peak_to_noise_ratio, dims, total_components, num_accepted, num_rejected

I won't share the complete logs because the file is too large, but the last lines before the failure are:

2024-08-14 17:49:07,053 - INFO - Memory mapping
2024-08-14 17:49:07,054 - INFO - Updating Spatial Components using lasso lars
2024-08-14 17:49:10,838 - INFO - thresholding components
2024-08-14 17:49:10,846 - INFO - Computing residuals
2024-08-14 17:49:10,848 - INFO - Updating done in 3s
2024-08-14 17:49:10,849 - INFO - Removing created tempfiles
2024-08-14 17:49:10,851 - INFO - Updating temporal components
2024-08-14 17:49:10,854 - INFO - Generating residuals
2024-08-14 17:49:11,231 - INFO - entering the deconvolution 
2024-08-14 17:49:11,252 - INFO - 5 out of total 9 temporal components updated
2024-08-14 17:49:11,261 - INFO - 7 out of total 9 temporal components updated
2024-08-14 17:49:11,267 - INFO - 8 out of total 9 temporal components updated
2024-08-14 17:49:11,272 - INFO - 9 out of total 9 temporal components updated
2024-08-14 17:49:11,291 - INFO - 5 out of total 9 temporal components updated
2024-08-14 17:49:11,301 - INFO - 7 out of total 9 temporal components updated
2024-08-14 17:49:11,307 - INFO - 8 out of total 9 temporal components updated
2024-08-14 17:49:11,312 - INFO - 9 out of total 9 temporal components updated
2024-08-14 17:49:11,314 - INFO - Returning background as b0 and W
2024-08-14 17:49:11,356 - ERROR - Error processing HBTD2 0848: [Errno 9] Bad file descriptor: '~/Calcium-Imaging-Analysis/data/Oct_2021/HBTD2/miniscope/memmap_2021-10-24-14-08-10_video_d1_400_d2_640_d3_1_order_C_frames_31407.mmap'

I reran it yesterday without making any changes and at some point it worked. So, I am wondering if there could be a server failure related to how the .mmap files are created, saved, and loaded.

I would also be grateful if you have any recommendations on how to use them.

pgunn commented 3 weeks ago

Hello, sorry for the delay in getting back to you. I think the best route forward is probably to add more logging (and maybe try/except blocks) to your analyze_data function to identify where in the function it's failing. There are a lot of ways to do this, from printf-style logging to returning structured exceptions. In this case the exceptions may be less useful than a backtrace because we don't know the full path to where things went wrong.
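
As a rough illustration of the kind of logging meant here, the sketch below wraps each stage in a small context manager that logs when the stage starts and finishes, and records the full backtrace if it fails. It uses only the Python standard library. The names `log_step`, `failing_step` and `run_demo` are hypothetical helpers made up for this example, not part of Caiman, and the raised `OSError` is only a stand-in for whatever call actually fails in analyze_data.

```python
import logging
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s - %(levelname)s - %(message)s')


@contextmanager
def log_step(name):
    """Log the start and end of a named step; on failure, log the backtrace and re-raise."""
    logging.info('START %s', name)
    try:
        yield
    except Exception:
        # logging.exception logs at ERROR level and appends the current traceback
        logging.exception('FAILED in step: %s', name)
        raise
    else:
        logging.info('DONE %s', name)


def failing_step():
    # Hypothetical stand-in for a pipeline stage; simulates errno 9 (EBADF)
    raise OSError(9, 'Bad file descriptor')


def run_demo():
    """Run two labelled steps and report which ones completed and the errno seen."""
    completed = []
    try:
        with log_step('save_memmap'):
            completed.append('save_memmap')
        with log_step('load_memmap'):
            failing_step()
            completed.append('load_memmap')
    except OSError as err:
        return completed, err.errno
    return completed, None
```

In analyze_data, the same pattern would mean wrapping the motion correction, save_memmap, load_memmap, fit and evaluate_components calls each in their own `with log_step(...)` block. In the outer loop, replacing `logging.error(f'... {str(e)}')` with `logging.exception(...)` would also keep the backtrace instead of only the message.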

flatironinstitute / CaImAn

"Bad file descriptor" error during processing #1388