Howdy @taningh86, this is a bug we see with Kilosort sometimes. It is not saying you have too many spikes; it is saying that you have spikes after the recording time has ended. For example, if your recording is 60 seconds, Kilosort for some reason returns a spike time of 61 seconds. There really doesn't seem to be a pattern for why Kilosort does this (out of all my recordings it has only happened to me once). The function is here.
But you can do:

```python
import spikeinterface.curation as scur

sorting_wout_excess_spikes = scur.remove_excess_spikes(sorting_KS25, recording_saved)
```
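If you want to confirm that this is what is happening (spike *times* past the end, not too many spikes), here is a minimal sketch, assuming a single-segment recording and the same variable names as above:

```python
import numpy as np

# Compare the latest spike sample index against the recording length.
spike_vector = sorting_KS25.to_spike_vector()    # structured array with a "sample_index" field
num_samples = recording_saved.get_num_samples()  # total number of frames in the recording

n_excess = np.sum(spike_vector["sample_index"] >= num_samples)
print(f"{n_excess} spike(s) fall after the end of the recording ({num_samples} samples)")
```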
Hi @zm711. Thanks for the quick response, very much appreciated. I am glad SpikeInterface has a remedy for it. By the way, is there a reason why I was unable to find `remove_excess_spikes()` in the SI documentation?
We might have forgotten to add it! Thanks for noticing, I'll take a look at the docs!
I do believe that this:

> and I get the following error: `ValueError: The sorting object has spikes exceeding the recording duration. You have to remove those spikes with the spikeinterface.curation.remove_excess_spikes() function`
>
> Not sure why this is. Also, how can the number of spikes exceed the recording duration? Maybe I am not understanding it right, but at face value I don't see how it is possible to have "excess spikes" in a recording duration. First, I believe there's no limit set in the source code on how many spikes are allowed in a recording, and second, spike counts and duration are two different units that cannot be compared. Unless the error means that somehow, after sorting, the total duration of the recording comes out longer than the actual duration of the raw data? Which, again, cannot possibly happen, unless this is a bug.

indicates that the error message is very ambiguous and unclear. The problem comes from the fact that the word "spikes" is overloaded to mean the physical event, the index in a vector, and the time of a spike event. How could we improve this?
@zm711 @taningh86 What do you think of:

> ValueError: The sorting object has spikes whose times go beyond the recording duration. This could indicate a bug. To remove those spikes, you can use the `spikeinterface.curation.remove_excess_spikes()` function

Would this be clearer?
> This could indicate a bug in the sorter

maybe, just to add more clarity. Otherwise people will come to the issue tracker and report the "bug" as a SpikeInterface bug rather than as a sorter issue with the recording and sorting objects.

But yeah, I think you're right that we are overloading terms. So your statement is clearer in that it is not the "spikes" that are the problem, but the "times".
All right. Let's see what OP and @alejoe91 say.
Agree, making a PR!
Actually, the function is in the docs: https://spikeinterface.readthedocs.io/en/latest/api.html#spikeinterface.curation.remove_excess_spikes
I was able to find it from the search tab. @taningh86, not sure why you couldn't find it!
Apparently Sam removed the check in the waveforms extension. @samuelgarcia, a specific reason for that? Maybe this should be done at `SortingAnalyzer` instantiation by default. What do you guys think?
I'm not opposed, but I guess if we deal with it under the hood then we will never track down the root cause (not that we've really searched for it). I've never actually checked how far out the excess spikes are... for example, if they are just one frame after the recording duration, then maybe this is a very slight rounding error. Since this basically never comes up for my datasets, I haven't really felt the need to explore; I just ran the function and didn't worry about it.
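For anyone curious, here is a rough sketch of how you could measure that overshoot, assuming a single-segment recording and the variable names from earlier in the thread:

```python
import numpy as np

spike_vector = sorting_KS25.to_spike_vector()
num_samples = recording_saved.get_num_samples()
fs = recording_saved.get_sampling_frequency()

excess = spike_vector["sample_index"][spike_vector["sample_index"] >= num_samples]
if excess.size > 0:
    # Distance of the worst offender past the last valid frame.
    overshoot = int(excess.max()) - num_samples + 1
    print(f"{excess.size} excess spike(s); worst is {overshoot} frame(s) "
          f"({overshoot / fs * 1000:.3f} ms) past the end")
```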
Hi @h-mayorquin, I agree with Zach. Adding that it is likely a problem of the sorter would add more clarity. Thanks!
Yes, I can add it back in the `create` method. But we need to refactor `RemoveExcessSpikesSorting` to use the spike vector approach to be faster.
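For context, the idea behind the spike-vector approach (just an illustration, not the actual `RemoveExcessSpikesSorting` code, and assuming `sorting` and `recording` objects as elsewhere in the thread) is that it reduces to a single vectorized mask over all spikes of all units, instead of slicing each unit's spike train separately:

```python
# One boolean mask over every spike at once (single-segment illustration).
spike_vector = sorting.to_spike_vector()
mask = spike_vector["sample_index"] < recording.get_num_samples()
clean_spike_vector = spike_vector[mask]
```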
I opened a PR that improves the documentation: https://github.com/SpikeInterface/spikeinterface/pull/2604
I think we should close this after we merge that PR as the other points are related but independent.
My take on the other issues discussed.

`SortingAnalyzer`:

Do you guys want to check by default or remove by default? Two questions:

1) Do you want to do the check at `__init__`? Do you want to call `remove_excess_spikes` by default? One conditional on the other?
2) Does this matter for all the extensions?

If the answer to the second is no, then I kind of feel that we don't want to over-burden the super-duper-general `SortingAnalyzer` with checks and complexity. Can we do it only for the extensions where it matters?
The opposite approach would be to raise the error when it actually happens, but that goes against the principle of failing at initialization, which I think is a good idea for long computations where the error would otherwise only appear at the end.
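To be concrete, the early check I have in mind could be as small as the sketch below (hypothetical helper; the name and error wording are made up for illustration):

```python
import numpy as np

def check_no_excess_spikes(sorting, recording):
    """Hypothetical early check: raise if any spike lands past the end of its segment."""
    spike_vector = sorting.to_spike_vector()
    for segment_index in range(recording.get_num_segments()):
        in_segment = spike_vector["segment_index"] == segment_index
        num_samples = recording.get_num_samples(segment_index=segment_index)
        if np.any(spike_vector["sample_index"][in_segment] >= num_samples):
            raise ValueError(
                "The sorting object has spikes whose times go beyond the recording "
                "duration. This could indicate a bug in the sorter. You can remove "
                "them with `spikeinterface.curation.remove_excess_spikes()`."
            )
```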
Hi Ramon,
Thanks for checking this.
Makes sense to me, @samuelgarcia.
Thanks for clarifying. I think it never clicked for me until now what the purpose (or at least one of them) of the `create` method on the `WaveformExtractor` and the `SortingAnalyzer` is.
This is off topic, but I was wondering if it would make sense to cache the last frames of the sorting objects. This would make the check trivially cheap.
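Something like this sketch (the cache attribute name is hypothetical, and `sorting`/`recording` are assumed as elsewhere in the thread, just to show the idea):

```python
# Cache the last spike frame per segment on the sorting object once,
# so the duration check becomes a cheap integer comparison afterwards.
if not hasattr(sorting, "_cached_last_frames"):
    spike_vector = sorting.to_spike_vector()
    sorting._cached_last_frames = []
    for seg in range(sorting.get_num_segments()):
        samples = spike_vector["sample_index"][spike_vector["segment_index"] == seg]
        sorting._cached_last_frames.append(int(samples.max()) if samples.size else -1)

# The check then reduces to:
ok = all(last < recording.get_num_samples(segment_index=seg)
         for seg, last in enumerate(sorting._cached_last_frames))
```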
Should we close this?
@h-mayorquin from my end I am good with closing it. Thanks!
Hi, this error below is new and never happened before while trying to extract waveforms from sorted data. Below is the code:

```python
sorting_KS25 = ss.read_sorter_folder(base_folder / "results_KS25")
sorting_KS25
recording_saved = si.load_extractor(base_folder / "preprocessed")
sorting = sorting_KS25
print(sorting)

# Without allow_unfiltered=True it was giving an error saying that the waveforms are not filtered.
# This is because I did not run a band-pass filter earlier, as the CatGT output files are already band-pass filtered.
we = si.extract_waveforms(recording_saved, sorting, folder=base_folder / "waveforms",
                          allow_unfiltered=True, load_if_exists=False, overwrite=True, **job_kwargs)
print(we)
```
and I get the following error:

```
ValueError: The sorting object has spikes exceeding the recording duration. You have to remove those spikes with the `spikeinterface.curation.remove_excess_spikes()` function
```
Not sure why this is. Also, how can the number of spikes exceed the recording duration? Maybe I am not understanding it right, but at face value I don't see how it is possible to have "excess spikes" in a recording duration. First, I believe there's no limit set in the source code on how many spikes are allowed in a recording, and second, spike counts and duration are two different units that cannot be compared. Unless the error means that somehow, after sorting, the total duration of the recording comes out longer than the actual duration of the raw data? Which, again, cannot possibly happen, unless this is a bug. Additionally, I searched for `remove_excess_spikes()` in the documentation and was not successful in finding it. I will very much appreciate a solution to this. This did not happen with recordings of the same length or longer from the same probe.

Thanks, Jimmy