kwikteam / phy-contrib

[This repository is archived, will be deprecated after the release of phy 2.0]
30 stars 39 forks source link

Taking into account a temporal lag while merging templates? #54

Closed yger closed 8 years ago

yger commented 8 years ago

Hi all

I have a question following a post that we had on our software https://groups.google.com/forum/#!category-topic/spyking-circus-users/P9Dlp4SCFU0

My question is as follow: would it be possible, when merging templates, to take into account the fact that there might be a temporal lag between two templates? This phenomena appears in our case when we allowed the code to detect positive peaks, as a request from several users. Imagine a biphasic spike, crossing both positive and negative threshold. We might end up with two templates, slightly time shifted, centered on the positive and on the negative peaks. We will think on our side about ways to avoid that, but I was still raising the issue here as it may be relevant for someone else. When merging time-delayed version of templates, with a lag (that could easily be detected as the maximum of the delayed CC), that lag should impact the spike times of the merged cluster... Hope I am clear enough

Best

rossant commented 8 years ago

IIUC, what you want is a way for users to shit by a few time samples all spikes belonging to a given cluster?

yger commented 8 years ago

Yes, this would be a solution, and do the job. Would it be easy to implement ?

Pierre

Le mer. 20 juil. 2016 17:24, Cyrille Rossant notifications@github.com a écrit :

IIUC, what you want is a way for users to shit by a few time samples all spikes belonging to a given cluster?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kwikteam/phy-contrib/issues/54#issuecomment-233984074, or mute the thread https://github.com/notifications/unsubscribe-auth/ABmE_3ELBaPDx43YNZV-OVUgdWTfIFfjks5qXj2bgaJpZM4JQk9K .

nippoo commented 8 years ago

This biphasic spike issue has come up numerous times with our own software.

Do you really want to merge the two, though? Don't you just want to delete one, or you'll end up with duplicate spike times; and an estimate of neuron firing rate that is 2x too high?

yger commented 8 years ago

No, I don't think that we need to throw one away. This issue, for us, appeared only when we allowed the detection of positive AND negative peaks. The problem with such a possibility, requested by users, is that if you have a biphasic spike, some of the spikes are pooled into a positive template, and some others into the negative one. The two templates are similar, but of course slightly time shifted, and this can sometimes prevent an automatic merging. During the template matching phase, we do not have the spikes twice: they are really shared across the two templates, that you would like ideally to merge. So when you merge those clusters, to be exact, you would need to change the spike times of the deleted cluster by time shifting them, with a lag equal to the time delay between the two templates. Hopefully, I'm clear enough....

Best

Pierre

2016-07-20 18:50 GMT+02:00 Max Hunter notifications@github.com:

This biphasic spike issue has come up numerous times with our own software.

Do you really want to merge the two, though? Don't you just want to delete one, or you'll end up with duplicate spike times; and an estimate of neuron firing rate that is 2x too high?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kwikteam/phy-contrib/issues/54#issuecomment-234009662, or mute the thread https://github.com/notifications/unsubscribe-auth/ABmE_5-uA8nl5A4WWwuoQQjyNdTHBvAoks5qXlHKgaJpZM4JQk9K .

nippoo commented 8 years ago

Yes, you are clear! So you never have the situation where a biphasic spike is split into two templates - a positive and a negative one? This was the case for SpikeDetekt/KlustaKwik.

Assuming the software is clever enough not to detect biphasic spikes as two separate monophasic spikes, then that's great!

nsteinme commented 8 years ago

Pierre could you clarify, is this anything other than an aesthetic issue (after merging you have the appearance of multiple waveforms all overlaid as if the cluster needs to be split, even though you know it doesn't)? I don't think it's a terrible idea even if it is just aesthetic, but I'm just trying to figure out whether there is a scientific point that I don't see.

On Wed, Jul 20, 2016 at 6:54 PM, Max Hunter notifications@github.com wrote:

Yes, you are clear! So you never have the situation where a biphasic spike is split into two templates - a positive and a negative one? This was the case for SpikeDetekt/KlustaKwik.

Assuming the software is clever enough not to detect biphasic spikes as two separate monophasic spikes, then that's great!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/kwikteam/phy-contrib/issues/54#issuecomment-234028107, or mute the thread https://github.com/notifications/unsubscribe-auth/AHPUP1Jx_FYJspsE8TFRmEbkVqN1JAalks5qXmDLgaJpZM4JQk9K .

yger commented 8 years ago

Yes, I think this is mostly aesthetic. To make a long story short, If you merge two templates, that are slight time shifted copies, then you end up with waveforms that are not properly aligned. So I think that to be fair, you would need to correct for that, am I right ?

Le mer. 20 juil. 2016 19:58, nsteinme notifications@github.com a écrit :

Pierre could you clarify, is this anything other than an aesthetic issue (after merging you have the appearance of multiple waveforms all overlaid as if the cluster needs to be split, even though you know it doesn't)? I don't think it's a terrible idea even if it is just aesthetic, but I'm just trying to figure out whether there is a scientific point that I don't see.

On Wed, Jul 20, 2016 at 6:54 PM, Max Hunter notifications@github.com wrote:

Yes, you are clear! So you never have the situation where a biphasic spike is split into two templates - a positive and a negative one? This was the case for SpikeDetekt/KlustaKwik.

Assuming the software is clever enough not to detect biphasic spikes as two separate monophasic spikes, then that's great!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/kwikteam/phy-contrib/issues/54#issuecomment-234028107>, or mute the thread < https://github.com/notifications/unsubscribe-auth/AHPUP1Jx_FYJspsE8TFRmEbkVqN1JAalks5qXmDLgaJpZM4JQk9K

.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kwikteam/phy-contrib/issues/54#issuecomment-234029349, or mute the thread https://github.com/notifications/unsubscribe-auth/ABmE_8amUFjCCzL0XmdWMmveLFgaoEfxks5qXmG5gaJpZM4JQk9K .

nippoo commented 8 years ago

@rossant the only complexity with this I can see is having to cache invalidate the existing waveforms and re-fetch them from the .dat... and also being able to move by non-integer samples...

rossant commented 8 years ago

The entire program was built under the assumption that the spike times are fixed. Another assumption is that for a given cluster id, the data (waveforms, CCGs, etc.) will always be fixed. Any change in the spikes or in the data requires a new id. The main reason is that we use function memoization for caching cluster-related data.

If we allow spike times to be modified, we change these assumptions, so we'll have to carefully review and upgrade the code accordingly to make sure we don't introduce subtle bugs. It's definitely doable but it's not something we should do lightly.

yger commented 8 years ago

OK, that is what I thought. However, I wanted to raise this issue, because it seems to me that this is something that could be think of. As said, this problem is not common, as we are doing our best before launching the template matching phase to automatically merge time shifted copies of the templates, but it turns out that it can happen. And I'm happy to hear that you also saw that issue. In phy, the user ends up with two groups of waveforms misaligned, even if they clearly originated from the same template. The time shift is always not large, only few time samples, so merging spike times without taking the lag into account is not dramatic. But this is not as accurate as it could be. Regarding the non-integer sample issue, no need for that. We could force the lag to be an integer, this should be sufficient

Thanks for all the nice work

2016-07-20 21:59 GMT+02:00 Cyrille Rossant notifications@github.com:

The entire program was built under the assumption that the spike times are fixed. Another assumption is that for a given cluster id, the data (waveforms, CCGs, etc.) will always be fixed. Any change in the spikes or in the data requires a new id. The main reason is that we use function memoization for caching cluster-related data.

If we allow spike times to be modified, we change these assumptions, so we'll have to carefully review and upgrade the code accordingly to make sure we don't introduce subtle bugs. It's definitely doable but it's not something we should do lightly.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kwikteam/phy-contrib/issues/54#issuecomment-234063923, or mute the thread https://github.com/notifications/unsubscribe-auth/ABmE_8crE9KTRHRtWfSCBYxYoYH2bbQ7ks5qXn4fgaJpZM4JQk9K .

nippoo commented 8 years ago

@rossant ah yes. Good point. In fact I think one of the more subtle things is that if spike times get shifted, the spike times array will no longer be sorted (a spike could "leapfrog" another) - so they have to be resorted.

rossant commented 8 years ago

@nippoo right! this is an even bigger problem... The fact that spike times are increasing is another assumption that is critical in the computation of the CCGs...

yger commented 8 years ago

Ok, well, I was just asking.... Can't you just re-cache waveforms and recompute CC only for the cluster you are merging and then potentially resorting. You would need to do that only when such a merge with time shift is performed, so this should not be such a big deal... I mean, this is not a crucial thing, but this is something to keep in mind: I'll do my best to try to see how to prevent such duplicates at the dictionary level, but this is a feature raised by this double-peak detection mode....

rossant commented 8 years ago

It's not that simple, for performance reasons we keep a big spike_times array with all spike times across all clusters, sorted. Thanks to this data structure, we have a fast algorithm computing all pairwise CCGs within an arbitrary set of clusters. This algorithm assumes that the spike times are sorted.

Also, every spike has a unique ID (its index in that big array) and we use that information for the lasso/split feature.

Again, we can definitely do this, but it will require a lot of effort and important changes in the code as it will break major assumptions about the data. We need to carefully weigh the pros and cons before doing these changes for a relatively rare feature.

yger commented 8 years ago

Ok, thanks for the explanation. Fine for me, this is clearly not a top priority feature: I agree that optimizations of the existing core are still more important at this stage.

2016-07-21 14:22 GMT+02:00 Cyrille Rossant notifications@github.com:

It's not that simple, for performance reasons we keep a big spike_times array with all spike times across all clusters, sorted. Thanks to this data structure, we have a fast algorithm computing all pairwise CCGs within an arbitrary set of clusters. This algorithm assumes that the spike times are sorted.

Also, every spike has a unique ID (its index in that big array) and we use that information for the lasso/split feature.

Again, we can definitely do this, but it will require a lot of effort and important changes in the code as it will break major assumptions about the data. We need to carefully weigh the pros and cons before doing these changes for a relatively rare feature.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kwikteam/phy-contrib/issues/54#issuecomment-234237587, or mute the thread https://github.com/notifications/unsubscribe-auth/ABmE_2xtcOhSuZt9h9VP9b74pTQ81OC0ks5qX2SQgaJpZM4JQk9K .

yger commented 8 years ago

I'm closing this issue, since I've implemented something in the Meta-merging GUI of SpyKING CIRCUS, to deal with those lags. So users, when performing merges with such a GUI before exporting to phy, will not face this issue anymore. No need to change the structure of phy