HelmchenLabSoftware / Cascade

Calibrated inference of spiking from calcium ΔF/F data using deep networks
GNU General Public License v3.0

df/f scaling variation with laserpower #40

Open yannicko-neuro opened 2 years ago

yannicko-neuro commented 2 years ago

Dear Peter, first of all, congrats on this fantastic work. I have now run Cascade on all sorts of different datasets. While it generally seems to perform well, I do have a suggestion. Cascade relies, at least implicitly, on absolute dF/F values. IIRC, the solution to this problem is to divide the trace by 100, depending on how dF/F is calculated. However, in my experience dF/F still depends on laser power, probably due to different response functions of the indicator and the surrounding tissue. This is partially captured by the noise model. In practice, the traces of cells can still be scaled differently depending on laser power (0 to 0.7 with low laser power or 0 to 15 with high laser power; actual values may differ...) while exhibiting a similar noise level. Rescaling with an exponential function before passing the trace to Cascade seems to work, but I am unsure about the reliability. Do you think it would be possible to add to the data during training, next to the noise, an exponential function or something similar that varies systematically between certain parameters?

PTRRupprecht commented 2 years ago

Hi Yannick,

You are right, Cascade uses the absolute dF/F values to infer spike rates.

Yes, there are a couple of factors that influence absolute dF/F values, which is not great for Cascade since it uses absolute values. For example, it is clear that absolute dF/F values slightly depend on the wavelength of the excitation light (see for example here in Fig.4c), because the fluorescence varies differently for bound and unbound GCaMP across wavelengths. But I have no knowledge of an effect of laser power on absolute dF/F values. More laser power would equally increase the fluorescence of GCaMP molecules in the bound and unbound state, and therefore dF/F would remain the same, as far as I understand. There are effects of saturation, but then you would expect lower dF/F values for higher power (and you reported the opposite).
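To make the invariance argument concrete, here is a minimal sketch in plain Python. It assumes the usual definition dF/F = (F - F0) / F0 and assumes that more laser power multiplies every fluorescence sample, including the baseline, by the same gain. The trace values, the baseline and the gain are made up for illustration, and `dff` is a hypothetical helper, not part of Cascade.

```python
def dff(trace, f0):
    """Compute dF/F = (F - F0) / F0 for each sample of a raw fluorescence trace."""
    return [(f - f0) / f0 for f in trace]

# Hypothetical raw fluorescence values (arbitrary units) and baseline F0.
raw = [100.0, 100.0, 250.0, 180.0, 110.0]
baseline = 100.0

# Assumption: higher laser power scales all samples and the baseline by one common gain.
gain = 4.0
raw_high_power = [gain * f for f in raw]

dff_low_power = dff(raw, baseline)
dff_high_power = dff(raw_high_power, gain * baseline)

# Under this purely multiplicative assumption the gain cancels, so both lists match.
max_difference = max(abs(a - b) for a, b in zip(dff_low_power, dff_high_power))
```

Under these assumptions the gain cancels in the ratio, so a laser-power dependence of dF/F would have to come from something that is not a pure multiplicative change.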

Possible reasons that I could imagine how higher laser power affects absolute dF/F values:

If you are really seeing different dF/F values for the same neuron when imaged first with low and then with high laser power at the same wavelength, I would be quite surprised.

But now to your question on how to address your problem. I did not fully get how you scale the data using the exponential function. Do you just apply a function like a*exp(-x/b), with x being the dF/F data? If so, I do not fully understand why you made that choice. To apply a scaling factor to the ground truth data during training would be possible (for example, for the spikefinder challenge, this has been done to all recordings by z-scoring), but this is a loss of information that I wanted to avoid because it makes the units of the inferred spike rates meaningless. If you think it is a good idea to train such a model with normalized ground truth, I can do this (you can just tell me the framerate and smoothing that you would like to have). However, maybe it is worth first having a look into whether dF/F is really changed by different laser powers or whether something else might be a confound.
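The loss of information from z-scoring can be illustrated with a short sketch. It uses only the Python standard library, and both the traces and the `zscore` helper are hypothetical. It is not the normalization code of the spikefinder challenge or of Cascade, only a demonstration that two traces differing by a constant scale factor become indistinguishable after z-scoring.

```python
import statistics

def zscore(trace):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(trace)
    std = statistics.pstdev(trace)
    return [(x - mean) / std for x in trace]

# Two hypothetical dF/F traces with the same shape but very different amplitude.
small = [0.0, 0.1, 0.7, 0.3, 0.0]
large = [20.0 * x for x in small]

z_small = zscore(small)
z_large = zscore(large)

# After z-scoring, the amplitude difference is gone.
max_difference = max(abs(a - b) for a, b in zip(z_small, z_large))
```

Because the factor of 20 disappears, a model trained on z-scored input cannot use amplitude to tell a small transient from a large one, which is why the units of the inferred spike rates would no longer be interpretable.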

In general, there are a couple of (sometimes unknown) factors that might result in dF/F values that are much higher or much lower than in the ground truth that was used to train Cascade. In this case, it is reasonable to just scale them to a level that seems normal (maximum values roughly between 1 and 10). However, if you apply such a scaling to your data, you have to be very careful when you make any quantitative statements, and if your computed dF/F values are different across experiments (e.g., due to different laser power, in your case, if this turns out to be true), I would avoid lumping together those datasets or making quantitative comparisons between them.
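A minimal sketch of such a rescaling, in plain Python, could look like the following. The function name `rescale_to_peak`, the default `target_peak` of 3.0 (picked arbitrarily from the 1 to 10 range mentioned above) and the example trace are all assumptions for illustration; none of this is part of Cascade. The sketch returns the scale factor together with the rescaled trace so that the factor can be reported alongside any results.

```python
def rescale_to_peak(trace, target_peak=3.0):
    """Linearly rescale a dF/F trace so that its maximum equals target_peak.

    Returns the rescaled trace and the scale factor that was applied.
    """
    peak = max(trace)
    if peak <= 0:
        raise ValueError("trace has no positive values, cannot rescale to a peak")
    factor = target_peak / peak
    return [x * factor for x in trace], factor

# Hypothetical low-amplitude dF/F trace (maximum 0.7, as in the example from the question).
trace = [0.0, 0.1, 0.7, 0.35, 0.05]
scaled, factor = rescale_to_peak(trace, target_peak=3.5)
```

Using the raw maximum makes the factor sensitive to a single noisy sample, so a high percentile might be a more robust reference. Either way, the chosen factor is a judgment call, which is why quantitative comparisons across differently scaled datasets should be avoided.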

Hope this helped a bit ... let me know what you think!