juliencarponcy / trialexp


Photometry signal processing #46

Open kouichi-c-nakamura opened 1 year ago

kouichi-c-nakamura commented 1 year ago

Here I summarise what I've learned so far about photometry signal processing. It may contain mistakes.

$$ \frac{\Delta F} {F} = \frac{F - F_0}{F_0} $$

where $F_0$ is the baseline signal.

1. Simplest $\frac{\Delta F} {F}$

In the simplest case, $F_0$ can be a constant value from baseline recording.

$$ \frac{\Delta F} {F} = \frac{F - F_0}{F_0} $$

simply normalises the signal change from baseline, $F - F_0$, by the baseline $F_0$.

At baseline, the value is 0 ($\because \Delta F = 0$).

A value of 1 (100%) means that fluorescence is twice as high as the baseline.

A value of -1 (-100%) means a negative deflection the size of the baseline (i.e. the absolute signal is 0).
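A minimal sketch of this computation (the function name and example values are mine):

```python
import numpy as np

def df_over_f(f, f0):
    """Simplest dF/F: normalise the change from baseline by the baseline.

    f  : 1-D array of raw fluorescence
    f0 : scalar baseline, e.g. the mean of a baseline recording
    """
    return (f - f0) / f0

# With a baseline of 2.0 a.u., a sample at 4.0 gives 1.0 (twice baseline)
# and a sample at 0.0 gives -1.0 (absolute signal is 0).
print(df_over_f(np.array([2.0, 4.0, 0.0]), f0=2.0))  # [ 0.  1. -1.]
```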

2. Using an isosbestic wavelength

In reality, you almost always see a decay of signal over time due to photobleaching.

People often use a different wavelength of light, called the isosbestic wavelength, at which the fluorescence signal doesn't change in response to the activation of the fluorescent sensor.

As GCaMP is approximately isosbestic at 405 nm (i.e. fluorescence is independent of calcium concentration) this gives a calcium sensitive signal and a calcium insensitive signal that can be used to control for movement artefacts. Akam T, Walton ME (2019) pyPhotometry: Open source Python based hardware and software for fiber photometry data acquisition. Sci Rep 9:3521. https://doi.org/10.1038/s41598-019-39724-y

GuPPy is programmed by default to use a control isosbestic wavelength recorded by users to remove smaller movement artifacts as well as photobleaching artifacts when calculating the ΔF/F (as described in Lerner et al.; Fig. 2). The isosbestic wavelength includes artifacts, but not calcium-dependent events (in the case of a calcium sensor), and so can be subtracted out from the signal and used to normalize the data to determine changes from baseline fluorescence (ΔF/F). Users who did not record an isosbestic channel, or otherwise wish not to use it in their analysis, can still use GuPPy. If there is no isosbestic control channel input, the code generates a control channel by smoothing the data trace and fitting an exponential to approximate bleaching effects, which can be subtracted out from the signal. In either case, after a control channel is established, a fitted control channel is obtained by fitting the control channel to the signal channel using a least squares polynomial fit of degree 1. ΔF/F is computed by subtracting the fitted control channel from the signal channel, and dividing by the fitted control channel. Sherathiya VN, Schaid MD, Seiler JL, Lopez GC, Lerner TN (2021) GuPPy, a Python toolbox for the analysis of fiber photometry data. Sci Rep 11:24212. https://doi.org/10.1038/s41598-021-03626-9

To calculate ΔF/F, a least-squares linear fit was applied to the 405 nm signal to align it to the 490 nm signal, producing a fitted 405 nm signal that was used to normalize the 490 nm as follows: ΔF/F = (490 nm signal − fitted 405 nm signal)/fitted 405 nm signal. Lerner TN, Shilyansky C, Davidson TJ, Evans KE, Beier KT, Zalocusky KA, Crow AK, Malenka RC, Luo L, Tomer R, Deisseroth K (2015) Intact-brain analyses reveal distinct information carried by SNc dopamine subcircuits. Cell 162:635–647. https://doi.org/10.1016/j.cell.2015.07.014

$F_0$ would be the "fitted control", i.e. an estimate of the signal channel obtained from the isosbestic-wavelength channel by linear regression (or another fitting method).

$$ \frac{\Delta F} {F} = \frac{F - F_0}{F_0} $$

In this case, both $F$ and $F_0$ are vectors of the same length, so the subtraction and division are element-wise.

On average, $\frac{\Delta F} {F}$ will be around 0, because $F$ should fluctuate around $F_0$.

e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790813/#F5
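A sketch of the fitted-control approach described above, following the degree-1 least-squares recipe in the quotes (the variable names are mine, not GuPPy's):

```python
import numpy as np

def df_over_f_fitted_control(signal, control):
    """dF/F with a fitted isosbestic control (Lerner et al. 2015 style).

    signal  : signal channel (e.g. 490 nm), 1-D array
    control : isosbestic channel (e.g. 405 nm), same length
    """
    # Least-squares polynomial fit of degree 1: control -> signal
    slope, intercept = np.polyfit(control, signal, deg=1)
    fitted_control = slope * control + intercept
    # Element-wise subtraction and division
    return (signal - fitted_control) / fitted_control
```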

According to this paper, dLight1.2's isosbestic point is 405 nm. However, it seems that they just took that value from someone else's data on GCaMP sensors. Robinson JE, Coughlin GM, Hori AM, Cho JR, Mackey ED, Turan Z, Patriarchi T, Tian L, Gradinaru V (2019) Optical dopamine monitoring with dLight1 reveals mesolimbic phenotypes in a mouse model of neurofibromatosis type 1. Elife 8:e48983. https://doi.org/10.7554/eLife.48983

3. Using a control fluorescent reporter

In our case, instead of using an isosbestic wavelength, we use a control fluorescent reporter, tdTomato.

Thomas Akam's justification for his method can be found here:

https://github.com/ThomasAkam/photometry_preprocessing/issues/1

The two fluorescent proteins may not bleach at the same rate.

In his implementation, and also in ours,

The dF is just the motion corrected signal plotted above. The baseline fluorescence F changes over the course of the session due to photobleaching, and is just the baseline we estimated with our double exponential fit. https://github.com/ThomasAkam/photometry_preprocessing/blob/master/Photometry%20data%20preprocessing.ipynb

$$ \frac{\Delta F} {F} = \frac{F_{corrected}}{F_0} $$

$F_0$ is the exponential decay fit of bleaching.

$F_{corrected}$ is the motion-corrected signal: $F_{corrected} = F - F_{motion \space estimate}$

Because $F_{corrected}$ should fluctuate around 0, $\frac{\Delta F} {F}$ also fluctuates around 0.
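For reference, a rough sketch of estimating $F_0$ by a double-exponential fit of the bleaching; this is only an approximation of the idea, not the notebook's exact parameterisation, and the initial guesses are mine:

```python
import numpy as np
from scipy.optimize import curve_fit

def double_exponential(t, const, amp_fast, amp_slow, tau_fast, tau_slow):
    # Constant plus two exponential decays, modelling photobleaching
    return const + amp_fast * np.exp(-t / tau_fast) + amp_slow * np.exp(-t / tau_slow)

def fit_bleaching_baseline(f, fs):
    """Estimate F0 as a double-exponential fit to the raw trace f (sampled at fs Hz)."""
    t = np.arange(len(f)) / fs
    span = f.max() - f.min()
    p0 = (f.min(), span, span, 10.0, 1000.0)  # rough initial guesses
    popt, _ = curve_fit(double_exponential, t, f, p0=p0, bounds=(0, np.inf))
    return double_exponential(t, *popt)
```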

In our case,

$$ \frac{\Delta F} {F} = \frac{F - F_{motion \space estimate}}{F_0} $$

where $F_0$ is the low-pass filtered slow component of $F$.

https://github.com/juliencarponcy/trialexp/blob/d42df8479ff7e4544c4e75a9d799198a61c4648a/trialexp/process/pyphotometry/utils.py#LL69C1-L72C1

https://github.com/juliencarponcy/trialexp/blob/d42df8479ff7e4544c4e75a9d799198a61c4648a/trialexp/process/pyphotometry/utils.py#L89
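A simplified sketch of this kind of pipeline (not the actual trialexp implementation linked above; the function, the 0.001 Hz cutoff, and the single whole-session fit are assumptions for illustration):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def df_over_f_motion_corrected(green, red, fs, f0_cutoff_hz=0.001):
    """dF/F with a tdTomato-based motion estimate and a slow low-pass F0.

    green, red : signal and control channels, 1-D arrays of equal length
    fs         : sampling rate in Hz
    """
    # Motion estimate: least-squares linear fit of the red channel to green.
    # (Ideally bleaching would be removed from both channels first; see the
    # discussion below.)
    slope, intercept = np.polyfit(red, green, deg=1)
    motion_estimate = slope * red + intercept
    corrected = green - motion_estimate

    # F0: ultraslow component of the green channel (second-order Butterworth,
    # zero-phase; SOS form is numerically safer at very low cutoffs)
    sos = butter(2, f0_cutoff_hz, btype="low", fs=fs, output="sos")
    f0 = sosfiltfilt(sos, green)
    return corrected / f0
```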

teristam commented 1 year ago

I think there are several critical issues:

  1. The two fluorescent proteins may bleach at different rates, which means the linear relationship between them may change over time. In that case the linear fit between them will not be accurate and we cannot remove the motion artifact accurately. This problem is somewhat mitigated if we separate the signal into shorter chunks and do the linear fit multiple times, but it is still not perfect.
  2. So I guess we should probably remove the bleaching effect from both signals before we do the linear fit.
  3. From our experience, we know that some movement in the red channel is not entirely due to motion artifacts but to other factors (e.g. a shadow passing above the optical fibers). The red channel is much more sensitive to this kind of artifact than the green channel. These events should not be very frequent, and the linear fit should disregard them when the analysis window is long enough, but it is still something we should keep in mind.
kouichi-c-nakamura commented 1 year ago

I agree with all three points.

kouichi-c-nakamura commented 1 year ago

Smooth join

I've noticed that the chunks (30 s, 50% overlap) are not joined smoothly and sometimes look like a staircase. This was expected.

[screenshot 2023-05-26 11 17 17: staircase artefact at chunk joins]

Instead of averaging the two values for the same data points in two chunks, we can mix them with linearly increasing/decreasing weights, so that the contribution of a chunk is minimal (~0) at its ends and maximal (1.0) at its centre (see the illustration). This has already been implemented (I will make a branch and pull request) and this method seems better.

[screenshot 2023-05-26 11 14 02: linearly increasing/decreasing chunk weights]
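A minimal sketch of this weighted join (the function name is mine; a triangular window provides the linearly increasing/decreasing weights):

```python
import numpy as np

def crossfade_chunks(chunks, hop):
    """Join overlapping chunks with linear weights.

    Each chunk contributes ~0 at its edges and 1.0 at its centre, so
    overlapping estimates blend smoothly instead of forming a staircase.
    chunks : list of 1-D arrays of equal length
    hop    : step between chunk starts (chunk_len // 2 for 50% overlap)
    """
    chunk_len = len(chunks[0])
    n = hop * (len(chunks) - 1) + chunk_len
    out = np.zeros(n)
    weight_sum = np.zeros(n)
    w = np.bartlett(chunk_len)  # triangular window: 0 at edges, 1 at centre
    for i, chunk in enumerate(chunks):
        start = i * hop
        out[start:start + chunk_len] += w * chunk
        weight_sum[start:start + chunk_len] += w
    # Guard against division by ~0 at the very first and last samples
    return out / np.maximum(weight_sum, 1e-12)
```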

Strangely small motion artefact estimates for some consecutive chunks

At the moment I cannot explain this phenomenon, because both the red and green channels look totally normal, and yet we still have this problem. Note that the amplitude of the estimated motion (top trace) is strangely small for about 70 s. This is not a problem of just one chunk (30 s long); in this particular case it apparently affects 3 consecutive chunks.

Let me know if you have any ideas. @teristam

[screenshot 2023-05-20 20 52 47: strangely small estimated motion for ~70 s]

kouichi-c-nakamura commented 1 year ago

Maybe:

  1. Denoising by low-pass filtering
  2. Using the low-pass (0.001 Hz?)-filtered ultraslow component as $F_0$, compute $\frac {\Delta F} {F} = \frac{F - F_0}{F_0}$ for both channels to achieve:
    1. a flat baseline around 0
    2. amplitude normalised to $F_0$
  3. Linear regression on chunks (see the sketch after this list)
    • These steps should work better in this order.
    • If $\frac {\Delta F} {F}$ does a good job, using chunks may not be necessary.
    • The intercept should be almost 0 because both red and green channels are now around 0.
    • Do we still see strangely small slopes???
  4. Combine the chunks of fitting results
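A sketch of step 3 on the two $\frac{\Delta F}{F}$ traces (the chunk length and overlap are assumptions, matching the 30 s / 50% scheme above):

```python
import numpy as np

def chunked_linear_fit(dff_green, dff_red, chunk_len, hop):
    """Linear regression of red dF/F onto green dF/F on overlapping chunks.

    Returns one (slope, intercept) pair per chunk. After the dF/F step
    above, intercepts should be close to 0 because both channels now
    fluctuate around 0.
    """
    fits = []
    for start in range(0, len(dff_green) - chunk_len + 1, hop):
        g = dff_green[start:start + chunk_len]
        r = dff_red[start:start + chunk_len]
        slope, intercept = np.polyfit(r, g, deg=1)
        fits.append((slope, intercept))
    return fits
```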
kouichi-c-nakamura commented 1 year ago

Another idea.

kouichi-c-nakamura commented 1 year ago
  1. It's been confirmed that the decays of the red and green channels take different time courses. [screenshot 2023-06-06 06 09 13: red and green bleaching time courses]

  2. Given that the red and green signals are not strictly moving together, but only roughly, it may be worth trying linear regression on chunks after applying a low-pass filter of ~5 Hz or ~10 Hz. The blurred, filtered curves may be easier to fit together. Then the slope and intercept can be applied to the signal from immediately before that low-pass filtering (see the sketch after this list). [screenshot 2023-06-06 06 35 53: low-pass filtered red and green traces]

  3. At the level of the waveform average, it really seems that the red signal can be subtracted from the green signal. Note that, in the average waveform, the curves are smoothed by the averaging itself. Smoothing may be the key to fitting.
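A sketch of idea 2 (fit on smoothed copies, then apply the fit to the unsmoothed signal; the 5 Hz cutoff and whole-trace fit are assumptions):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def fit_on_smoothed(green, red, fs, cutoff_hz=5.0):
    """Estimate motion by fitting low-pass filtered red to filtered green,
    then apply the slope/intercept to the unfiltered red channel."""
    sos = butter(2, cutoff_hz, btype="low", fs=fs, output="sos")
    g_smooth = sosfiltfilt(sos, green)
    r_smooth = sosfiltfilt(sos, red)
    # The blurred curves may be easier to fit together
    slope, intercept = np.polyfit(r_smooth, g_smooth, deg=1)
    # Apply the fit to the signal from before the smoothing
    motion_estimate = slope * red + intercept
    return green - motion_estimate
```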

Human linear regression of average waveforms

trial onset

[screenshot 2023-06-06 11 05 05: average waveforms aligned to trial onset]

last bar_off

[screenshot 2023-06-06 11 03 34: average waveforms aligned to last bar_off]

Spout

[screenshot 2023-06-06 06 09 52: average waveforms aligned to spout]

kms058-2023-03-24-151254

[screenshot 2023-06-06 13 16 40: average waveforms for session kms058-2023-03-24-151254]