Q about differences in results using batch and online methods

flatironinstitute / CaImAn

Computational toolbox for large scale Calcium Imaging Analysis, including movie handling, motion correction, source extraction, spike deconvolution and result visualization.

https://caiman.readthedocs.io

GNU General Public License v2.0


Q about differences in results using batch and online methods #1059

Closed XiaoqianSun0104 closed 1 year ago

XiaoqianSun0104 commented 1 year ago

Hi,

I processed the same TIFF file with both the batch and online methods. For the batch method, I ran motion correction, CNMF, and seeded CNMF. For the online method, I used OnACID with fit_online(). I used nb_view_components to plot the fluorescence traces from the two methods, and I have several questions.

1. The denoised (red) trace in the online result has no negative part; it looks more like a properly denoised trace than the one from the batch result.
2. Why is the y-axis scale so different? For online it ranges from 0 to 0.4, but for batch it ranges from -20 to 100.
3. The correlation image for batch is much clearer, with less background noise. Does that mean more noise was added to the neuron signals?

Can you please take a look at it for me? Thank you.

EricThomson commented 1 year ago

Some good questions; I'm not sure of the answers to most of them. As for the negative values in batch processing, there is a useful discussion here that may help, and my initial guess is that your offline and online runs are using different values of the bas_nonneg parameter: https://github.com/flatironinstitute/CaImAn/issues/987

As for the correlation image looking different, my guess is that the online method uses fewer frames to calculate or display it, so it is noisier. Or maybe the vmin/vmax values used for display differ? I would have to look at it, and it will be a few days before I can.
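The "fewer frames, noisier image" guess can be illustrated with a small simulation. This is a minimal sketch, not CaImAn's correlation-image code: it assumes background pixels behave like independent Gaussian noise, and the helper names `pearson` and `background_spread` are made up for illustration. It measures how much the correlation between two pure-noise traces scatters around zero for a short and a long recording.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def background_spread(n_frames, n_pairs=200, seed=0):
    """Standard deviation of the correlation between independent noise traces.

    Each pair stands in for two neighbouring background pixels (an assumption).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_pairs):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_frames)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_frames)]
        values.append(pearson(x, y))
    mean = sum(values) / n_pairs
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n_pairs)


few = background_spread(n_frames=100)
many = background_spread(n_frames=2000)
print(f"spread with  100 frames: {few:.3f}")
print(f"spread with 2000 frames: {many:.3f}")
```

The spread shrinks roughly as one over the square root of the number of frames, so an image computed from a shorter stretch of the movie would be expected to show a grainier background even if the underlying data are identical.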

As for the scaling, that is more puzzling. I don't run online processing, and it is something I need to start looking into over the next few weeks (@j-friedrich might know). But honestly, I'm encouraged that the traces look pretty much the same apart from these scaling/thresholding issues :)
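One way to check that two traces have "pretty much the same" shape despite different y-axis scales is to z-score both before comparing them. The sketch below does not explain why the scales differ; it only shows that a gain and an offset drop out after z-scoring. The trace values, the gain of 250, and the offset of -20 are invented for illustration and do not come from the data in this thread.

```python
import statistics


def zscore(trace):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = statistics.fmean(trace)
    sigma = statistics.pstdev(trace)
    return [(v - mu) / sigma for v in trace]


# Hypothetical trace on a small scale, similar in range to the online plot.
online_like = [0.02, 0.03, 0.35, 0.20, 0.10, 0.04, 0.02, 0.30]

# The same shape with an arbitrary (made-up) gain and offset applied.
batch_like = [250.0 * v - 20.0 for v in online_like]

z_online = zscore(online_like)
z_batch = zscore(batch_like)

# If the traces differ only by gain and offset, this is zero up to rounding.
max_diff = max(abs(a - b) for a, b in zip(z_online, z_batch))
print(f"largest z-score difference: {max_diff:.2e}")
```

With real traces the difference will not be exactly zero, so a summary such as the Pearson correlation between the two traces may be a more practical comparison.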

XiaoqianSun0104 commented 1 year ago

Thank you for the reply!

Also, I ran some tests today and found that the use_residuals flag doesn't seem to work in estimates.detrend_df_f(): with use_residuals=False, I still get a noisy dff, compared with what utilities.detrend_df_f() returns when YrA is not passed as input.

XiaoqianSun0104 commented 1 year ago

Also, the TIFF I'm working on contains 11190 frames of size (512, 512), and we expect around 50 neurons. Given that, which method do you recommend, batch or online? I greatly appreciate your response.

EricThomson commented 1 year ago

In general I'd hesitate to give an a priori answer, but for smallish movies (where "small" is relative to your compute power) I tend to recommend offline, and for larger movies (again relative to your compute power) I would recommend online. Another option is to break the movie up into multiple movies and then do multisession registration, but I think that is a bit anachronistic now that the online algorithms are available. If your project is borderline, with movies of both kinds, then I'd probably stick with one method, which means online, so you don't have to switch back and forth.

Again, this is brainstorming: we are getting beyond pure coding questions into heuristics and recommendations, so others here may have different ideas based on their practical experiences. My practical experience is mostly with smaller movies.

EricThomson commented 1 year ago

Separating my general comment from the one about your movie: if your movie has a bit depth of 8 bits/pixel, then it should be roughly 3 GB (11190 × 512 × 512 bytes). This should be relatively easy to handle on a good workstation with lots of RAM. But the thing to do is just try (with some optimizations if needed).
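The size estimate can be checked directly from the movie dimensions given earlier in the thread (11190 frames of 512 × 512 pixels). This is a back-of-the-envelope sketch: `movie_size_bytes` is a hypothetical helper, and the 16-bit and 32-bit rows are included only as assumptions about how the data might be stored or held in memory, since the thread does not state the actual bit depth.

```python
def movie_size_bytes(n_frames, height, width, bytes_per_pixel):
    """Uncompressed size of a movie stored as a dense array."""
    return n_frames * height * width * bytes_per_pixel


N_FRAMES, HEIGHT, WIDTH = 11190, 512, 512

# Bytes per pixel for a few common pixel types (which one applies is an assumption).
for label, bpp in [("uint8", 1), ("uint16", 2), ("float32", 4)]:
    size = movie_size_bytes(N_FRAMES, HEIGHT, WIDTH, bpp)
    print(f"{label:>8}: {size / 1e9:.2f} GB")
```

This prints about 2.93 GB for 8-bit, 5.87 GB for 16-bit, and 11.73 GB for 32-bit float data, so the pixel type matters as much as the frame count when judging whether a movie fits comfortably in RAM.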

I am very hesitant to give general remarks here, as what if you have some movies that are four times as long? What kind of computer are you using, etc? This is where you have to dig in and figure out what general workflow will work best, and the thing to do is just try.

XiaoqianSun0104 commented 1 year ago

That makes sense. I much appreciate your response and instructions. For now, RAM is not a problem. I submit jobs to my institution's computing platform to process the data, and it takes about 40-50 minutes to process one TIFF, including motion correction, memory mapping, CNMF, seeded CNMF, and saving the results. I guess I'll continue using the offline method. Thank you.

EricThomson commented 1 year ago

Closing because issue seems resolved. Please reopen if this was a mistake, @XiaoqianSun0104