flatironinstitute / CaImAn

Computational toolbox for large scale Calcium Imaging Analysis, including movie handling, motion correction, source extraction, spike deconvolution and result visualization.
https://caiman.readthedocs.io
GNU General Public License v2.0

Using CUDA but no improvement #393

Closed HunterTom94 closed 5 years ago

HunterTom94 commented 5 years ago

In pipeline_demo.ipynb, I added the line 'use_cuda': True, when creating the parameter object.

I then timed the motion correction step. The run time is the same whether or not I add the use_cuda line. I am using the default sample video, and the result is always 2m16s.

I noticed that you said "The CUDA codepaths will only be active if the needed libraries are installed on your system." in https://github.com/flatironinstitute/CaImAn/blob/master/README-cuda.md. However, I do not understand what the needed libraries are in this case.

I am wondering if I can actually use CUDA to boost the speed of motion correction.

Thank you very much!

pgunn commented 5 years ago

Hi, Wanted to get back to you and let you know we saw the issue; some of us are presently at a conference, but we'll be looking into this soon.

HunterTom94 commented 5 years ago

@pgunn Thank you very much!

epnev commented 5 years ago

@HunterTom94 Note that if you run your code and then change a parameter, you will need to restart your local cluster for the change to take effect. You can do that by calling

dview.terminate()

and then starting the cluster again.

HunterTom94 commented 5 years ago

I rebooted the computer and started from the beginning.

When I ran dview.terminate() after the comment line saying "#%% start a cluster for parallel processing (if a cluster already exists it will be closed and a new session will be opened)", I got an error message that dview is not defined. So I guess I am starting a new cluster from scratch.

I also added the line 'use_cuda': True, after the line 'use_cnn': True,

I then timed the cell

%%capture
#%% Run piecewise-rigid motion correction using NoRMCorre
mc.motion_correct(save_movie=True)
m_els = cm.load(mc.fname_tot_els)
border_to_0 = 0 if mc.border_nan is 'copy' else mc.border_to_0
# maximum shift to be used for trimming against NaNs

I still don't see any improvement in run time.

epnev commented 5 years ago

@HunterTom94 Most likely, your installation fails to use the GPU and reverts back to the standard method. It's unclear why at this moment. The usage of CUDA is experimental at the moment so we cannot provide a lot of support.

Is your issue #373 still a problem or can we close it?

HunterTom94 commented 5 years ago

Ok, thank you very much. I just closed #373

epnev commented 5 years ago

@HunterTom94 Thanks. Are you seeing this message when trying to use cuda? This will be an indication that cuda is not being used.

Another thing you can try is typing the command nvidia-smi -l 5 in a terminal window before starting the motion correction. This will show the GPU utilization and you can see whether GPU usage goes up during motion correction.
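For anyone who prefers to log the numbers rather than watch the terminal, here is a minimal sketch that polls nvidia-smi from Python while motion correction runs in another process. It assumes nvidia-smi is on the PATH and uses its --query-gpu=utilization.gpu and --format=csv,noheader,nounits options; the function names are hypothetical helpers and not part of CaImAn.

```python
import subprocess
import time


def parse_gpu_utilization(csv_text):
    """Turn nvidia-smi CSV output (one line per GPU) into a list of percentages.

    Lines that are not plain integers (for example "[N/A]") are skipped.
    """
    values = []
    for line in csv_text.splitlines():
        token = line.strip()
        if token.isdigit():
            values.append(int(token))
    return values


def read_gpu_utilization():
    """Query the current utilization of every visible GPU (requires nvidia-smi)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_utilization(result.stdout)


def monitor(n_samples=12, interval_s=5.0):
    """Collect n_samples readings, interval_s seconds apart, and return them."""
    samples = []
    for _ in range(n_samples):
        samples.append(read_gpu_utilization())
        time.sleep(interval_s)
    return samples
```

Calling monitor() in a separate Python session during the motion correction cell gives a list of readings. If the values stay near idle and no Python process shows up in the nvidia-smi process table, the GPU is most likely not being used.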

HunterTom94 commented 5 years ago

I did not see the message, at least not under the cell that performs motion correction. The only message I see is

108221 [motion_correction.py:motion_correction_piecewise():2544] [3572] MOVIE NOT SAVED BECAUSE num_splits is not None

I also checked the activity with nvidia-smi -l 5. While the motion correction cell is running, volatile GPU util is around 1% most of the time and sometimes goes up to 10% or 20%. However, the processes shown only contain

1325 G /usr/lib/xorg/Xorg
1364 G /usr/bin/gnome-shell
1625 G /usr/lib/xorg/Xorg
1757 G /usr/bin/gnome-shell

which I think has nothing to do with motion correction.

In order to double check that I did set the parameters correctly,

I have an output like this

29633 [params.py: set():778] [3572] Changing key use_cuda in group motion from False to True

Also, when creating the motion correction object, I used the line

mc = MotionCorrect(fnames, dview=dview,**opts.get_group('motion'))

HunterTom94 commented 5 years ago

When starting a cluster, does the 'backend' variable need to be set up as a special value in order to use CUDA? I am currently using

if 'dview' in locals():
    cm.stop_server(dview=dview)
c, dview, n_processes = cm.cluster.setup_cluster(
    backend='local', n_processes=None, single_thread=False)

epnev commented 5 years ago

This is fine. I cannot say what is wrong at the moment. I suggest you use the standard version which is still fairly fast.

HunterTom94 commented 5 years ago

@epnev Hello, the previous discussion was based on running pipeline_demo. However, when I run demo_motion_correction.ipynb I do see the message

1272231 [motion_correction.py: __init__():194] [18387] pycuda is unavailable. Falling back to default FFT.

I then did

conda activate caiman
pip install pycuda

The message still appears.
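One thing worth ruling out here is that the notebook kernel runs in a different environment from the one where pip installed pycuda. Below is a minimal sketch of a check to run inside the notebook itself. The helper names are hypothetical, and pycuda is the only module name taken from the log message above; any other names passed in are assumptions.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` can be found by this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def report(names=("pycuda",)):
    """Map each module name to whether this interpreter can locate it."""
    return {name: module_available(name) for name in names}


# The interpreter path shows which environment the kernel is actually using.
print("Interpreter:", sys.executable)
print("Modules:", report())
```

If the interpreter path does not point into the caiman environment, the kernel is not seeing the package that was just installed. Note that a module being locatable does not guarantee that importing it succeeds, for example if the CUDA toolkit itself is missing, so running import pycuda.driver by hand is a useful follow-up.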

epnev commented 5 years ago

The message means that for some reason the GPU is not used during motion correction. Is there a particular reason you want to use the GPU so badly? The speedup you get is not that substantial, and for short datasets it can actually be considerably slower.

HunterTom94 commented 5 years ago

@epnev Thank you for your reply. The sample data we sent you previously was spatially downsampled so that it could be uploaded to a network drive. Our typical file is 1024 × 1024 (pixels) × 1200 (frames) and is 2.5 GB. A rigid correction takes about 1 min 30 sec per file, but a non-rigid correction takes 20 min. So if the GPU can accelerate non-rigid correction, it would be very helpful.
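As a quick sanity check on those figures, the arithmetic can be sketched as follows. The 2 bytes per pixel (16-bit data) is an assumption that is not stated in the thread; it is simply the value that matches the quoted file size.

```python
def movie_size_bytes(height, width, n_frames, bytes_per_pixel):
    """Raw size of an uncompressed movie in bytes."""
    return height * width * n_frames * bytes_per_pixel


# bytes_per_pixel=2 is an assumption (16-bit pixels), not taken from the thread.
size = movie_size_bytes(1024, 1024, 1200, 2)
print(f"{size / 1e9:.2f} GB")  # prints "2.52 GB"
```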

Below are screenshots of the parameters I am using and the result of motion correction. I am not sure how to read the non-rigid motion correction graph, but based on the result, could you give us some advice on whether non-rigid correction is necessary in our case?

https://drive.google.com/file/d/1_Xp1J7EinrsdaUfSM7paaUhefbX8XTPa/view?usp=sharing
https://drive.google.com/file/d/1kEYITbii9wGVyH8TKAG64ZX88eRBs-DA/view?usp=sharing
https://drive.google.com/file/d/19oR-b5OO0HDOL6rY8J3LMf9dMdHskn_O/view?usp=sharing

Thank you very much for your help!

epnev commented 5 years ago

@HunterTom94 The plots show that different patches can have different shifts which is indicative that non-rigid motion correction might be necessary. However, I suggest you look at the videos of the motion corrected movies with either rigid or non-rigid correction to judge for yourself.

A parameter you can change is strides, which controls how big each patch is. As demo_pipeline says, the strides parameter is in pixels, and it can be set so that it corresponds to approximately 100 μm or a bit more. So if you're imaging with a spatial resolution of 1 pixel/μm, you can try setting it to, e.g., 128. Larger patches mean fewer patches to register, which will speed up non-rigid motion correction significantly.
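The rule of thumb above can be sketched as a small helper. Rounding up to the next power of two is an assumption chosen only because it reproduces the 128 in the example; the function names are hypothetical and not part of CaImAn, and the patch count is a rough estimate that ignores the overlap between patches.

```python
import math


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def strides_for_patch_size(target_um, pixels_per_um):
    """Suggest a (rows, cols) strides value covering about target_um or a bit more.

    Rounding up to a power of two is an illustrative choice, not a CaImAn rule.
    """
    target_px = math.ceil(target_um * pixels_per_um)
    stride = next_power_of_two(target_px)
    return (stride, stride)


def patches_per_axis(dim_px, stride_px):
    """Rough number of patches along one axis, ignoring overlaps."""
    return math.ceil(dim_px / stride_px)


strides = strides_for_patch_size(100, 1.0)
print(strides, patches_per_axis(1024, strides[0]))
```

For a 1024-pixel field of view at 1 pixel/μm, this suggests strides of (128, 128), or roughly 8 patches per axis. A smaller strides value produces more patches, and the registration cost grows with the number of patches.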

You may also want to consider increasing the max_shifts parameter as 6 pixels might be too small.

I suggest you look at the demo_motion_correction notebook which includes all this information to get an understanding of the different methods and parameters.