nel-lab / mesmerize-core

High level pandas-based API for batch analysis of Calcium Imaging data using CaImAn

progress bar #203

Closed vkonan closed 1 year ago

vkonan commented 1 year ago

I'm fairly new to Python and still getting used to the notebook interface, so I apologize if there is already a solution to this that I'm missing:

Is there a way to tell whether a cnmf or motion correction process is running or just hung? Is there a way to add a progress bar or something that estimates how far along the algorithm is?

Thanks, V

kushalkolar commented 1 year ago

A running mcorr or cnmf item prints output as it's processing. You can also monitor your CPU and RAM usage. If your RAM is overflowing, that's a common reason for it to "hang" (it usually just takes a very long time rather than actually hanging). You can reduce the number of processes to spawn using the MESMERIZE_N_PROCESSES environment variable, which also reduces RAM usage.
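
For reference, a minimal sketch of setting that environment variable from Python before running the batch item (the value 4 is just an example; it needs to be set before the item is run so the cluster is created with fewer processes):

```python
import os

# Limit how many worker processes are spawned for the CaImAn cluster.
# Example value only; pick something that fits your RAM.
os.environ["MESMERIZE_N_PROCESSES"] = "4"
```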

vkonan commented 1 year ago

Thanks! Also, is there a way to use the cluster in mesmerize, akin to cm.cluster.setup_cluster(...)? I tried this but got an error: df.iloc[-1].caiman.cluster.setup_cluster(...)

kushalkolar commented 1 year ago

Nope, you can't directly access that; all functions available under the caiman accessor are listed here: https://mesmerize-core.readthedocs.io/en/latest/api/common.html

What are you trying to set with caiman.cluster?

vkonan commented 1 year ago

I was just hoping to speed up the graph_nmf or sparse_nmf. Currently, greedy_roi takes 76s, while graph/sparse_nmf take 1800s on the same data.

kushalkolar commented 1 year ago

Yes, they are much slower, and I'm not sure if initialization is multithreaded, so that wouldn't help you. I know one of them is also very RAM intensive.

For sparse_nmf, also see this; it should be out in the next release: https://github.com/flatironinstitute/CaImAn/pull/1078

EricThomson commented 1 year ago

Cluster should work for dendritic data: you can run on patches and then run refit() on the initial result over the full image, because dendrites often span very large portions of the FOV. I don't have a ton of practical experience with this, but it is in the demo. The initial run will often have components very spatially restricted to the patches, but running refit (which is fast) will then push beyond those initial estimates. For sparse_nmf, I'd drop the number of iterations way down, even to 20. Not saying it will work, but I'd explore that as just another parameter, because it can sometimes converge very fast (and I've seen it get less sparse with more iterations).

I should stress that the dendritic analysis has not been super stress-tested, and sparse_nmf has not been used a lot. I'm glad people are starting to push on it. :)
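
A hedged sketch of what Eric describes when using CaImAn directly (method_init and max_iter_snmf are my assumptions for the relevant CNMFParams fields; double-check against the CaImAn docs for your version):

```python
from caiman.source_extraction.cnmf.params import CNMFParams

# Sketch only: choose sparse NMF initialization and cap its iteration count,
# as suggested above. Field names are assumptions; verify for your CaImAn version.
params_dict = {
    "method_init": "sparse_nmf",  # instead of the default "greedy_roi"
    "max_iter_snmf": 20,          # drop the iteration count way down
}
opts = CNMFParams(params_dict=params_dict)
```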

kushalkolar commented 1 year ago

Just note that Eric's comments here are for using caiman directly. If you want to run refit with mesmerize you can set "refit" to True in the params dict; it goes outside of "main" (main is for algo params).

params = {"main": {...cnmf params here ...}, "refit": True}
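
For context, a minimal sketch of how that params dict might be passed when adding a batch item (the item name and input movie path are placeholders; add_item is the accessor method from the API docs linked above):

```python
# Sketch: "refit" sits at the top level of the params dict, alongside "main",
# which holds the CNMF params themselves.
params = {
    "main": {
        # ... cnmf params here ...
    },
    "refit": True,
}

df.caiman.add_item(
    algo="cnmf",
    item_name="my_movie",                # placeholder
    input_movie_path=mcorr_output_path,  # placeholder, e.g. the mcorr output
    params=params,
)
```
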
vkonan commented 1 year ago

Thanks to you both for the prompt replies!

Am I correct in assuming that multithreading is not something that will be implemented in mesmerize? If it won't be, I don't see mesmerize being a practical way to process dendritic data.

Would you also recommend the following workflow?

  1. Load data and process mcorr and cnmf (using cluster multithreading) via caiman package.
  2. save estimates.A, .C, .S and .F_dff as separate .csv files (I know you guys are currently trying to find a solution for this in another thread; see the sketch after this list). I might even consider using Arrow files... but I need to look into how they can be structured and reconstructed.
  3. load saved files, reconstruct and manipulate data accordingly.
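
For step 2, a rough sketch of one way to dump those estimates to disk, assuming the standard CaImAn estimates attributes (A is typically a scipy sparse matrix, while C, S and F_dff are dense arrays); file names and formats here are only illustrative:

```python
import numpy as np
import scipy.sparse

# Sketch only: `cnmf_model` is assumed to be a fitted CNMF object from caiman.
est = cnmf_model.estimates

# A (spatial footprints) is usually sparse, so a sparse-aware format beats CSV.
scipy.sparse.save_npz("estimates_A.npz", scipy.sparse.csc_matrix(est.A))

# C, S and F_dff are dense (n_components x n_frames) arrays.
np.savetxt("estimates_C.csv", est.C, delimiter=",")
np.savetxt("estimates_S.csv", est.S, delimiter=",")
np.savetxt("estimates_F_dff.csv", est.F_dff, delimiter=",")
```
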
kushalkolar commented 1 year ago

> Am I correct in assuming that multithreading is not something that will be implemented in mesmerize? If it won't be, I don't see mesmerize being a practical way to process dendritic data.

Multiprocessing is used; you can set MESMERIZE_N_PROCESSES. I'm just not entirely sure whether those initialization options are parallelized, but CNMF itself is. By default it is n_threads - 1.

vkonan commented 1 year ago
  1. Isn't MESMERIZE_N_PROCESSES maxed out by default?
  2. If so, isn't it still slower than using the cluster option in caiman?
  3. I may not be understanding the difference between MESMERIZE_N_PROCESSES and the "cluster" option in caiman. What I do know is that mesmerize's cnmf process takes 1800s and caiman's use of the cluster takes 200s for the same input data.

If the mesmerize method is not parallelized, that might explain the longer processing time.

kushalkolar commented 1 year ago

MESMERIZE_N_PROCESSES is used to set n_processes for setup_cluster(), this is where it is used: https://github.com/nel-lab/mesmerize-core/blob/master/mesmerize_core/algorithms/cnmf.py#L46-L57
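
Roughly, what happens at those lines (a paraphrased sketch, not the actual mesmerize-core code) is that the environment variable is read and passed to CaImAn's setup_cluster:

```python
import os
import psutil
import caiman as cm

# Paraphrased sketch of the linked lines: use MESMERIZE_N_PROCESSES if set,
# otherwise fall back to (number of CPUs - 1).
n_processes = int(os.environ.get("MESMERIZE_N_PROCESSES", psutil.cpu_count() - 1))

c, dview, n_processes = cm.cluster.setup_cluster(
    backend="multiprocessing",
    n_processes=n_processes,
    single_thread=False,
)
```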

Are you setting the same number of processes and params for both? Is RAM usage the same during both? What OS are you on? (It does work on Linux & Windows as far as I know; I haven't looked at this specifically on Mac.) I wonder if MESMERIZE_N_PROCESSES is not being used for some reason in your case, or if it's a params thing.

vkonan commented 1 year ago

Just checked the n_processes in both methods and they're both 7 (I have 8 cores). I have to retract my statement about the 1800s processing time. The discrepancy seems to have been caused by differences in the 'ssub' and 'tsub' params (1 vs 2). They're processing in about the same time. Thanks for catching that, Kushal.

I'm closing this issue because all my questions regarding processing time have been answered. Thanks, all!

kushalkolar commented 1 year ago

Yeah, you want to watch out for the subsampling params; if your spatial resolution is already at the limit you want to set them to 1.
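
To make that concrete, a hedged sketch of where those parameters sit in the params dict (ssub and tsub are the spatial and temporal downsampling factors; 2 downsamples by a factor of 2, 1 disables downsampling):

```python
# Sketch: subsampling factors inside the "main" CNMF params.
params = {
    "main": {
        "ssub": 1,  # keep full spatial resolution if you're already at the limit
        "tsub": 1,  # keep full temporal resolution
        # ... other cnmf params ...
    },
    "refit": True,
}
```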
