Julie-Fabre / bombcell

Automated quality control, curation and neuron classification of spike-sorted electrophysiology data
GNU General Public License v3.0

Adjusting bin size in percSpikesMissing for burst units #179

Open ryanlash opened 2 weeks ago

ryanlash commented 2 weeks ago

Hi Julie!

I was wondering if you had any suggestions or recommendations for handling burst units when calculating the percentage of missing spikes? Most of our units are being mistakenly marked as MUA because their amplitudes decay and the maxBin value is almost always equal to 1. I was thinking of calculating the bin size with the Freedman-Diaconis rule instead, but I'm unsure of its utility in this scenario. Thoughts?

[Screenshots: 2024-11-08 122546, 2024-11-08 123944]

Julie-Fabre commented 1 week ago

Hi Ryan,

Thanks for your message! I can't really see the amplitudes in this plot (it looks like the plotting default is not working well in this case). Would you be able to zoom in a little on these two plots (or open the unit in phy and screenshot things there)? I don't intuitively see why the % of missing spikes would fail in this way (over-estimating) for bursty units - it should usually under-estimate the percentage of missing spikes, because the spike amplitude distribution is non-Gaussian and has a heavy tail toward the lower amplitudes.


ryanlash commented 1 week ago

Hi Julie,

Here are some screenshots of the unit. And yes, this unit (like other burst units) is non-Gaussian and does skew towards lower amplitudes. What I noticed is that if the maxBin value is equal to 1, there's a loop in percSpikesMissing that then sets pMissing to 50 (which does make sense for non-burst units). I've found that decreasing the bin size / increasing the number of bins resolves the issue.

[Screenshots: 2024-11-11 134225, 2024-11-11 134203, 2024-11-11 134002, 2024-11-11 133914]

Julie-Fabre commented 1 week ago

Interesting! What bin size / number of bins did you end up setting it to?

ryanlash commented 1 week ago

Setting the bin size to 500 was working. However, I have since figured out what the actual issue was. Outlier spikes (amplitude >1000) from adverse noise events were occurring at the end of each session, across multiple sessions and multiple rodents, on all channels, and they affected every unit. This in turn affected the distribution of bin locations. Because there were so few of them (<50 each), they weren't appearing in any of the windows in Phy. If you look at the original GUI output, you can see them in the amplitude presence window. If anyone else runs into this issue, install the phy2 plugins here and remove outliers using the Mahalanobis distance plugin to avoid spending two weeks down a rabbit hole like me. Science :')
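For anyone trying to picture the failure mode described above, here is a minimal sketch in plain Python. It is not bombcell's code: the helper names (`histogram_counts`, `peak_bin`), the bin count of 50, and the simulated amplitudes are all assumptions made for illustration. It shows how a few dozen spikes with amplitude >1000 stretch an equal-width histogram so that nearly every real spike falls into the first bin.

```python
import random


def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        raise ValueError("all values are identical; cannot build bins")
    counts = [0] * n_bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def peak_bin(values, n_bins=50):
    """Return the 0-based index of the fullest bin (MATLAB's maxBin == 1 is index 0 here)."""
    counts = histogram_counts(values, n_bins)
    return counts.index(max(counts))


rng = random.Random(0)
# Hypothetical unit: 5000 spikes with amplitudes around 80 (arbitrary units).
clean = [rng.gauss(80.0, 10.0) for _ in range(5000)]
# A few dozen noise spikes with amplitude > 1000, as described in the comment above.
contaminated = clean + [rng.uniform(1000.0, 3000.0) for _ in range(40)]

print("peak bin without outliers:", peak_bin(clean))
print("peak bin with outliers:   ", peak_bin(contaminated))
```

With the outliers included, the bins become so wide that the whole amplitude distribution sits in the first bin, leaving no shape for a Gaussian fit to work with.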

Julie-Fabre commented 1 week ago

Yes, good catch. I can see the spikes aren't displayed properly in bombcell because of these outliers (and the histogram bins then don't have the right size/values to capture any meaningful amplitude variation to fit a Gaussian to). One solution for you is to remove the outliers in phy, like you said (and thanks for pointing towards those phy plugins). It would be nice to have a solution in bombcell though. I am not sure what the best solution would be - either checking whether there are extreme outliers (and removing them) or defining the histogram bins in a 'smarter' way (maybe using the Freedman-Diaconis rule). I will have a think / play around and see which solution works best. I'll leave this GitHub issue open until then. Thanks again for bringing this up!
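The two candidate approaches mentioned above can be sketched as follows. This is only an illustration in standard-library Python, not what bombcell implements: the function names, the median/MAD outlier test, and its threshold of 10 scaled MADs are assumptions, and the amplitudes are simulated. The Freedman-Diaconis rule itself is the standard one, bin width h = 2 * IQR / n^(1/3).

```python
import math
import random
import statistics


def freedman_diaconis_width(values):
    """Freedman-Diaconis bin width: 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2.0 * (q3 - q1) / len(values) ** (1.0 / 3.0)


def remove_extreme_outliers(values, n_mads=10.0):
    """Drop values further than n_mads scaled MADs from the median.

    The median/MAD test and the default threshold are assumptions for this sketch.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scaled_mad = 1.4826 * mad  # consistent with the standard deviation for Gaussian data
    if scaled_mad == 0:
        return list(values)
    return [v for v in values if abs(v - med) <= n_mads * scaled_mad]


rng = random.Random(0)
# Simulated unit: 5000 spikes around amplitude 80, plus 40 noise spikes > 1000.
clean = [rng.gauss(80.0, 10.0) for _ in range(5000)]
contaminated = clean + [rng.uniform(1000.0, 3000.0) for _ in range(40)]

filtered = remove_extreme_outliers(contaminated)
for name, data in (("contaminated", contaminated), ("filtered", filtered)):
    width = freedman_diaconis_width(data)
    n_bins = math.ceil((max(data) - min(data)) / width)
    print(f"{name}: n={len(data)}, FD width={width:.2f}, bins={n_bins}")
```

Because the Freedman-Diaconis width depends on the interquartile range, a few dozen outliers barely change it, so the bins stay narrow enough to resolve the amplitude distribution. The trade-off is that the number of bins needed to span the full range grows with the outliers, most of them empty, which is one reason the two approaches might be worth combining.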