HelmchenLabSoftware / Cascade

Calibrated inference of spiking from calcium ΔF/F data using deep networks
GNU General Public License v3.0

Request for model training #43

Closed laura300 closed 7 months ago

laura300 commented 2 years ago

Hi Peter, Thanks for developing this tool, it's really useful! Would it be possible to train a model for excitatory and inhibitory cells combined at 5 Hz? Also, since we have been using GCaMP7f and it is not included in the ground truth, do you have any idea of the expected performance? Thanks!

Laura

PTRRupprecht commented 2 years ago

Hi @laura300,

Thanks for the feedback!

Training a model for excitatory and inhibitory cells combined would in theory be possible, but I would not recommend it (you can still convince me otherwise!). A model trained on both datasets would try to find a compromise and would therefore produce sub-optimal predictions for excitatory neurons. I did this once in response to a reviewer's question about our paper, but we did not include the result in the paper.

Instead, I would recommend training one model for excitatory cells and another for inhibitory cells. If you think that this is a good idea, I can do this for you (the model for excitatory neurons is already available, so I would only train the model for inhibitory neurons). The limitation of this approach is that there is only a small amount of training data for inhibitory neurons.

Alternatively, one could use the model trained for excitatory neurons for both excitatory and inhibitory neurons, and simply accept that the predictions will still be correlated with inhibitory spike rates (Figure 3a) but off by a large factor (Figure 3c,e).

I hope this helps you to decide what model to choose for your recordings! If not, let me know.

About GCaMP7f: Yes, it is not yet included in the ground truth. I will include one available dataset in the near future (approx. end of 2022), but I want to do some analyses first. I expect that a model trained on all available datasets (e.g., one of the global_EXC models) will perform equally well on GCaMP7f data. However, due to the shorter rise time of GCaMP7f, the inferred spike rate will be slightly earlier than the true spike rate (this systematic shift will occur for any spike inference algorithm optimized for GCaMP6). This effect is small for GCaMP7f (I think it is clearly less than 100 ms) but a bit stronger for GCaMP8. Overall, this effect is not detrimental, but one needs to keep it in mind.

Let me know if you have any other questions!

laura300 commented 2 years ago

Hi Peter,

Thanks a lot for your quick answer and your input on the GCaMP7. I'll follow your advice on using the available model for excitatory cells, I really think that this should work fine. I will eventually have some data with only inhibitory cells so I'll ask you then for the model trained with the inhibitory ground truth. Many thanks!

All the best, Laura

PTRRupprecht commented 2 years ago

Ok, great!

P.S. @laura300: Since your data are sampled at a relatively low temporal resolution (5 Hz), maybe check whether the typical noise levels of your data are covered by the models (default coverage is noise levels between 2 and 9). There was recently an issue (issue #39) where we found that the noise levels of the 4-Hz dataset were much higher than 9, and training a model with higher noise levels improved performance a lot. So, just in case you also work with low-signal recordings :-)
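A minimal sketch of such a check, assuming the noise level is the standardized metric described in the Cascade paper (median absolute frame-to-frame difference of ΔF/F, divided by the square root of the frame rate, expressed in percent) and that ΔF/F is given as a fraction rather than in percent. The function names `noise_level` and `covered_by_model` are hypothetical and are not part of the Cascade package; the default range of 2 to 9 is taken from the comment above.

```python
import math
from statistics import median


def noise_level(dff_trace, frame_rate_hz):
    """Standardized noise level of one dF/F trace (assumed definition).

    dff_trace: sequence of dF/F values as fractions (0.5, not 50 %).
    frame_rate_hz: imaging frame rate in Hz.
    """
    # Absolute frame-to-frame differences, skipping pairs that contain NaN.
    diffs = [
        abs(b - a)
        for a, b in zip(dff_trace, dff_trace[1:])
        if not (math.isnan(a) or math.isnan(b))
    ]
    if not diffs:
        raise ValueError("trace needs at least two consecutive valid samples")
    # Normalizing by sqrt(frame rate) makes recordings at different rates
    # comparable; the factor 100 converts fractions to percent.
    return 100.0 * median(diffs) / math.sqrt(frame_rate_hz)


def covered_by_model(level, lowest=2, highest=9):
    """True if a noise level lies within the range a model was trained on."""
    return lowest <= level <= highest


# Toy trace sampled at 4 Hz: every frame-to-frame step is 0.1 dF/F.
toy_trace = [0.0, 0.1, 0.0, 0.1, 0.0]
toy_level = noise_level(toy_trace, frame_rate_hz=4.0)
print(toy_level, covered_by_model(toy_level))
```

Running this per neuron and looking at the distribution of values would show whether a substantial fraction of the recording falls above the upper limit, which is the situation described for the 4-Hz dataset.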

nguyemi5 commented 7 months ago

Hi Peter, thank you so much for developing this pipeline! I would like to ask whether there's an update about the performance of your models on jGCaMP7f. I tried using the 'Global_EXC_30Hz_smoothing25ms_causalkernel' model on our data (acquired at 30 Hz) and it seems to perform very nicely. However, I guess the absolute spike rate might be overestimated since the dynamics of jGCaMP7f are faster, is that right? Unfortunately, I don't have any ground truth dataset available to confirm this thought. Do you think it would make sense to rescale the output based just on the ratio of published GCaMP6f to GCaMP7f parameters (time constants, dF/F per AP)? Thank you, Jana

PTRRupprecht commented 7 months ago

Hi Jana @nguyemi5,

That's a good question.

I think your intuition is correct. The overall shape of the inferred spiking should be correct but the absolute spike rate might be overestimated.

Luckily, the GCaMP8 paper also includes a ground truth dataset with jGCaMP7f. I have extracted this dataset and used it to check your question. Below, I plot the average spike rate inferred using the "Global_EXC_30Hz_smoothing25ms_causalkernel" model, and the ground truth spike rate. The spike rate is actually overestimated by a factor of 3-4 (median overestimate of 3.6x):

[Figure: Comparison_GC7f]

Of course, take this with a grain of salt, since it is based on only one single jGCaMP7f ground truth dataset with 21 neurons. If you have any follow-up questions, just let me know.
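As a rough illustration of what correcting for this overestimate could look like, the sketch below divides inferred spike rates by a single scalar factor. The function name `rescale_spike_rates` is hypothetical, and the default of 3.6 is only the median overestimate reported above for one dataset of 21 neurons, so it is an assumption rather than a calibrated constant.

```python
def rescale_spike_rates(inferred_rates, overestimate_factor=3.6):
    """Divide inferred spike rates by an assumed overestimation factor.

    inferred_rates: sequence of inferred spike rates (any consistent unit).
    overestimate_factor: ratio of inferred to true rate; 3.6 is the median
        value from a single jGCaMP7f ground truth dataset, not a general rule.
    """
    if overestimate_factor <= 0:
        raise ValueError("overestimate_factor must be positive")
    # A single scalar changes only the amplitude; the time course of the
    # inferred spike rate is left untouched.
    return [rate / overestimate_factor for rate in inferred_rates]


# Toy example: inferred rates that are 3.6x too high.
print(rescale_spike_rates([3.6, 7.2, 0.0]))
```

A scalar correction like this does not address the small temporal shift expected from the faster indicator kinetics.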

I'm still in the process of analyzing how well the existing models generalize to GCaMP8f/m/s and GCaMP7f. This requires some attention to detail and has therefore taken me longer than expected (and I was also busy with transitioning from postdoc to PI ...). Brief analyses like this one can be done quickly, but bringing things together and understanding the effects will still take me a few more months. But if you have specific questions that you would find interesting to have analyzed, just let me know.

Best, Peter

nguyemi5 commented 7 months ago

Hi Peter, thanks for such a fast and helpful analysis. It's good to know about these systematic effects and we'll keep these in mind when analyzing our data. Also, congratulations on your transition! Thank you and all the best, Jana