rochefort-lab / fissa

A Python toolbox for Fast Image Signal Separation Analysis, designed for Calcium Imaging data.
GNU General Public License v3.0
30 stars 28 forks

All other signals present in the cell ROI are 0 #142

Open scbsli opened 4 years ago

scbsli commented 4 years ago

Hello FISSA developers,

I just tried FISSA on a TIFF stack (8000 frames collected at 15 Hz using jRGECO1a) with ROIs defined in ImageJ. In the FISSA results, for every ROI, experiment.result[c][t][1:4, :] is all zeros. Is this normal?

Thanks!

scbsli commented 4 years ago

Here is a pair of raw and de-contaminated traces from one of the ROIs:

scottclowe commented 4 years ago

Hi @scbsli, thanks for getting in touch with us.

experiment.result[c][t][1:4, :] are all 0

This happens when the signal-separation routine finds the signal within the ROI and the signals in its surrounding areas to be so highly correlated that it merges them into a single output: the optimal solution is a single causal signal that is sufficient to explain the observed signals of both the ROI and its 4 surrounding regions. Hence the entries 1:4 are all zero.
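This degenerate situation can be illustrated with plain NumPy (this is only an illustration of "one component explains everything", not FISSA's actual separation routine):

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.random(1000)  # shared underlying signal

# 5 "regions" (ROI + 4 neuropil) that are almost perfectly correlated:
# scaled copies of the same signal plus tiny noise.
traces = np.vstack([signal * s for s in (1.0, 0.9, 0.8, 0.85, 0.95)])
traces += 0.001 * rng.standard_normal(traces.shape)

# One component explains essentially all of the variance, so a
# sparsity-regularised factorisation will zero out the remaining outputs.
s = np.linalg.svd(traces, compute_uv=False)
explained = s[0] ** 2 / np.sum(s ** 2)
print(explained)  # very close to 1.0
```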

You can prevent this by reducing the expected level of sparsity via the alpha parameter of FISSA. For instance, fissa.Experiment(images_location, rois_location, alpha=0.01) weakens the sparsity constraint, while setting alpha=0. disables it entirely. With a reduced (or zero) alpha, you should find the recording data is split into multiple signals.
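As a sketch of the call pattern (FISSA 0.7-era API; the paths here are placeholders):

```python
import fissa

# Placeholder paths; point these at your TIFF folder and ImageJ ROI zip.
experiment = fissa.Experiment("images_location/", "rois_location.zip", alpha=0.01)
experiment.separate()

# alpha=0. disables the sparsity penalty entirely:
# experiment = fissa.Experiment("images_location/", "rois_location.zip", alpha=0.)
```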

I would like to work out why this is happening for you.

I am wondering whether the default alpha value is inappropriate for a very large or small number of samples, for instance, or whether it is because the alpha value we are using was selected for data with action potentials, whereas you are studying sub-threshold Ca activity.

scbsli commented 4 years ago

Hi @scottclowe , thanks for your swift reply! I am going to set the sparsity to 0 and try it again.

Now to your questions -

Thanks for your help!

scbsli commented 4 years ago

Hi, I just tried it again with sparsity set to 0. Now most ROIs have 1 or 2 non-zero neighbouring regions. The corrected traces look more "corrected" now.

Would cutting the TIFF into trials beforehand help the algorithm?

scbsli commented 4 years ago

Now that I have looked at the trial-based traces, I have a feeling that the algorithm is over-correcting.

swkeemink commented 4 years ago

Hi @scbsli,

For investigating the resulting traces for over-correction, I recommend comparing the sets of traces obtained before and after neuropil correction. The traces in experiment.raw[c][t][0:5, :] give you the central ROI and its surrounding traces before correction. If all of these traces look near-identical, it means either that there is little contamination and you are only measuring the cell signal, or that there is no cell signal and you are only measuring neuropil (the difference should be fairly obvious on inspecting the original video). In either case there is little a separation method can do, since it effectively receives only a single signal. You can quickly check this across all cells by measuring the correlation between the 5 raw signals for each cell and plotting a histogram of those values.
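A quick sketch of that check, run here on synthetic traces; with real data you would pass experiment.raw[c][t][0:5, :] to the same helper (mean_region_correlation is a name made up for this illustration):

```python
import numpy as np

def mean_region_correlation(raw_cell):
    """Mean pairwise correlation between the ROI trace and its
    surrounding-region traces for one cell (shape (n_regions + 1, T))."""
    c = np.corrcoef(raw_cell)
    iu = np.triu_indices_from(c, k=1)  # upper triangle: each pair once
    return c[iu].mean()

# Synthetic stand-in for 143 cells, each with 5 highly correlated traces.
# With real data:
#   corrs = [mean_region_correlation(experiment.raw[c][0]) for c in range(n_cells)]
rng = np.random.default_rng(1)
shared = rng.random(500)
cells = [shared + 0.1 * rng.standard_normal((5, 500)) for _ in range(143)]
corrs = [mean_region_correlation(cell) for cell in cells]

# Histogram of per-cell mean correlations over [0, 1].
counts, edges = np.histogram(corrs, bins=10, range=(0.0, 1.0))
print(counts)
```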

If this seems to be the case for many cells: other steps besides changing alpha are to increase the size of the neuropil regions (with the expansion parameter when defining a FISSA experiment) and to reduce the number of neuropil regions (the nRegions parameter).
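For reference, a sketch of those keyword arguments (FISSA 0.7-era parameter names; paths and values are placeholders):

```python
import fissa

# Placeholder paths. expansion enlarges the neuropil regions;
# nRegions sets how many of them surround each ROI.
experiment = fissa.Experiment(
    "images_location/", "rois_location.zip",
    expansion=1.5, nRegions=3,
)
experiment.separate()
```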

If the above is not the case, there might be something else going on, and we could investigate further.

About trial-cutting: that should not make a difference, since when FISSA receives multiple TIFFs for multiple trials, it simply concatenates them all before separation.

Finally, a small technical note: alpha controls both the sparsity across the central and neuropil regions and the sparsity across time; see the loss function here: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html. I wouldn't recommend setting it completely to zero, as it acts as a regularizer (basically preventing individual extracted components and mixing weights from becoming extreme).

@scottclowe We could implement a check for this, i.e. if the correlation between all regions is really high, either don't attempt to separate or throw a warning.

scbsli commented 4 years ago

Hi @swkeemink, thanks for your reply. Here's the histogram for this test dataset (143 ROIs). It looks skewed to the right; I am not sure whether that degree of skew is considered high or acceptable.

swkeemink commented 4 years ago

Thanks for doing that so quickly, and for being so open with the analysis! Looking at this, I realize that the correlation can actually be rather hard to interpret, as the level of noise heavily impacts the measure. This does look very skewed, however. What do the sets of traces look like for the cells with correlations > 0.75? (It would be best to look at a relatively short time window, say a single trial.)

scbsli commented 4 years ago

Hi @swkeemink thanks for helping us out. These RCaMP recordings are a bit wacky I must say..

Here's an example trace from an ROI that showed a between-region correlation of around 0.8. The blue trace is the somatic signal; vertical black lines denote stimulus onset and offset.


And here's how the cell looks on the mean image.

swkeemink commented 4 years ago

@scbsli Yes, that looks mostly correlated, but it is hard to interpret, as you say. Perhaps we should have a look at the smoothed signals (e.g. via low-pass filtering, as we do here).

Lacking ground-truth data, I have also found that inspecting the video frame by frame can help with judging whether we are looking at just background (when everything, including the cell, fluctuates together) or at a genuine cell signal.

swkeemink commented 4 years ago

@scbsli I also just did some testing on my side, and another issue might be that the raw values are very high while the fluctuations are relatively small. When I artificially reproduced this with some test data by adding a very large offset before running FISSA, I got exactly the same problem as you: only a single signal is extracted unless I set alpha=0 (even though the pre-offset data was easily and correctly separated). This is clearly undesirable behaviour and we will need to fix it. The problem is likely that we normalize by the median before running the separation algorithm, on the assumption that the fluctuations would scale with this offset.

I'll need to do some more testing and thinking on our side to come up with a principled, non-ad-hoc solution. We will likely do the following: subtract the median, divide by the variance, and then add an offset to ensure there are no negative values (which is otherwise not guaranteed, and tricky). The reason we weren't doing this before is exactly that this can quickly lead to errors with negative values.

In the meantime, it is relatively easy to test this in an ad-hoc way on your side by subtracting, say, 35000 from the current dataset (if the values all fluctuate around 38000).
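A minimal NumPy sketch of that workaround, using synthetic stand-in data; with a real recording you would load and re-save the TIFF (e.g. with tifffile, which is an assumption here, not part of FISSA):

```python
import numpy as np

# Stand-in for the raw movie; with real data, load via e.g. tifffile:
#   import tifffile
#   movie = tifffile.imread("recording.tif").astype(np.float64)
rng = np.random.default_rng(2)
movie = 38000.0 + 200.0 * rng.standard_normal((100, 64, 64))

offset = 35000.0  # the value suggested above; adjust to your own baseline
movie_shifted = movie - offset

# Only the baseline moves; the fluctuations (std) are untouched.
print(movie_shifted.mean())  # baseline shifted to ~3000

# Then save the shifted movie and run FISSA on it, e.g.:
#   tifffile.imwrite("recording_shifted.tif", movie_shifted)
```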

We are sorry for these problems, by the way, but also many thanks for bringing to our attention a new dataset that highlighted a problem we did not anticipate!

scbsli commented 4 years ago

@swkeemink Thanks very much! We'll give that a try.

aleksjr98 commented 3 months ago

@swkeemink I am having a very similar problem, where the raw signal has such a high baseline that only a single signal is extracted. Have you had any luck fixing this problem, or do you have any suggestions? Thank you!