chiyu1203 opened 1 month ago
Hi @chiyu1203
I think that this simply means that the `LinearRegression` failed to fit a specific collision. I think we should protect against this and either use NaN
for the failed spikes or revert to the non-collision option when this happens.
Hey @chiyu1203 thanks for this bug report. The safest option like Alessio said is to make sure that execution doesn't crash if fitting fails.
It would be nice to get a bit more information on the input variables `X` and `y` that are causing the fit to fail. However, this is not necessary, so don't worry if you don't have much time to do the below; it would just add a little background information. Are you familiar with debugging or entering breakpoints? This can be useful to get more information on errors. To break into the code you can:
1) Run `pip show spikeinterface` just to confirm where spikeinterface is installed. From above, it should be installed at `C:\Users\neuroPC\Documents\GitHub\spikeinterface\src\spikeinterface`.
2) Next open the script at `C:\Users\neuroPC\Documents\GitHub\spikeinterface\src\spikeinterface\postprocessing\amplitude_scalings.py`.
3) At line 530, can you replace the line:

```python
reg = LinearRegression(fit_intercept=True, positive=True).fit(X, y)
```

with

```python
try:
    reg = LinearRegression(fit_intercept=True, positive=True).fit(X, y)
except Exception:
    breakpoint()
```
When you run the script, execution will stop at the crash point and you will be able to interact with the variables. You can print them, but they will probably be very large. However, even printing them may show that something strange is occurring with these input variables (e.g. they contain NaN or are empty). Alternatively, you can save them to disk with `np.save("/path/to/save/variable_X.npy", X)`. If you do this for `X` and `y`, and they are not too large, you could upload them to Dropbox or Google Drive to share.
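As a quick sketch of the kind of inspection you can do at that breakpoint (the arrays below are hypothetical stand-ins for the real `X` and `y`; the checks are plain numpy, not SpikeInterface API):

```python
import numpy as np

# Hypothetical stand-ins for the real X and y at the crash point
X = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])

# Quick sanity checks to run at the breakpoint
has_nan = bool(np.isnan(X).any() or np.isnan(y).any())
is_empty = X.size == 0 or y.size == 0
print(has_nan, is_empty)  # False False for this toy data

# Save the offending inputs to share (filenames are placeholders)
np.save("variable_X.npy", X)
np.save("variable_y.npy", y)
```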
Dear @JoeZiminski and @alejoe91
Thank you for your quick response and clear instructions! I have followed your steps and saved `X` and `y` when the execution stopped at the crash point.
variable_X.zip
I have attached them here. Thanks!
Thanks @chiyu1203! Indeed the matrix `X` has two identical columns. That makes `X^T X` non-invertible, and its inverse is required to compute OLS.
The template (`y`) and traces are shown below. I'm trying to look a bit deeper into the code to understand why this might occur. Wrapping the `LinearRegression` will definitely handle this, but I wonder if it is indicative of any other problems. @alejoe91 does the problem seem like something that could be expected to occur randomly from time to time and be handled at this level? Or might it be an edge case that could be caught higher up the processing chain?
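The singularity is easy to reproduce with plain numpy (toy values standing in for the attached `variable_X`): two identical columns make `X` rank-deficient, so `X^T X` has zero determinant and OLS has no unique solution.

```python
import numpy as np

# Two identical columns, as in the attached variable_X
X = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])

XtX = X.T @ X  # [[14., 14.], [14., 14.]]

rank = np.linalg.matrix_rank(X)  # 1, not 2: rank-deficient
det = np.linalg.det(XtX)         # ~0: X^T X is singular
print(rank, np.isclose(det, 0.0))
```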
@alejoe91 @samuelgarcia I understand a bit more of what the code is doing. This must mean that somehow there are two spikes, from the same template, occurring at exactly the same time (?), that were assigned as a collision in `find_collisions`? I guess this could happen if a single spike was somehow split in two and registered to the same unit?
Maybe the easiest thing to do, when building `X`, is to either check that `collisions` contains no duplicates, or check that the template to be added to `X` is not already in `X`.
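A minimal sketch of that duplicate check (`append_if_new` is a hypothetical helper, not part of the SpikeInterface codebase):

```python
import numpy as np

def append_if_new(columns, template, atol=1e-12):
    """Append `template` only if an identical column is not already
    present; return True if it was added (hypothetical helper)."""
    for col in columns:
        if np.allclose(col, template, atol=atol):
            return False  # duplicate: skip, to keep X full-rank
    columns.append(template)
    return True

cols = []
t = np.array([0.0, 1.0, 0.5])
added_first = append_if_new(cols, t)         # True: added
added_again = append_if_new(cols, t.copy())  # False: exact duplicate
print(added_first, added_again, len(cols))
```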
@JoeZiminski thank you so much for looking into this!
I agree that the problem is that there are two spikes from the same unit at exactly the same time.
@chiyu1203 to fix this, you can try to run this curation step before creating the analyzer:
```python
import spikeinterface.curation as scur

sorting_deduplicated = scur.remove_duplicated_spikes(sorting)
```
Let me know if it works!
Thank you for your reply and solution! This method works for me. I still do not know how these duplicated spikes were generated, though. After I ran spike sorting with Kilosort, I manually curated the spikes in phy and merged clusters from the same channel (and same template). Maybe there is a bigger issue with how I ran Kilosort?!
That's great it's working @chiyu1203! Could you please re-open this, as it would be useful to perform a check, when filling the `X` matrix, that there are no duplicate templates. If there are, it will cause strange results in the fitting.
It may also be worth looking into the duplicate spike issue, but my guess is that since there is already a function for it, this is a sorter-side problem that SpikeInterface has to compensate for with the `remove_duplicated_spikes` function. So I doubt it is anything to do with how you performed the sorting / curation; @alejoe91 will be able to confirm though.
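For a rough idea of what the deduplication does, here is a plain-numpy sketch on a toy spike train (the real `remove_duplicated_spikes` works per unit on the sorting object and can also censor near-duplicates, which this sketch does not):

```python
import numpy as np

# Toy spike train (sample indices) for one unit, with an exact duplicate
spike_samples = np.array([100, 250, 300, 300, 480])

# Dropping exact duplicates is the core of the fix: np.unique keeps one
# copy of each spike time and returns them sorted
deduplicated = np.unique(spike_samples)

n_removed = len(spike_samples) - len(deduplicated)
print(deduplicated, n_removed)  # [100 250 300 480] 1
```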
Dear SpikeInterface community,
I used

```python
sorting_analyzer.compute("amplitude_scalings")
```

on SpikeInterface 0.101.0 (installed from source). numpy version is 1.26.4, Kilosort 4.0.4, Windows 11.
I first did spike sorting with Kilosort4 and manually curated in phy. Then I created the sorting analyzer with the `binary_folder` format and `sparse=True`, and I used the default settings to compute "amplitude_scalings". However, it returned this error. Could anyone suggest what might be the issue and how to fix it? There is not much information about calculating this one, so I do not know how to fix it myself.
All the best,