use_single_threshold should default to False

mspacek commented 8 years ago

When first starting out, I grabbed an example .prm file from here, which didn't include the use_single_threshold field. That means that it always defaulted to True, as shown in default_settings.py. I think it would be better to have it default to False.
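For reference, overriding the default in a parameter file might look like the sketch below. It assumes the usual klusta layout, where the .prm file is plain Python and detection options live in a `spikedetekt` dict. Any other keys from an existing .prm file would stay as they are.

```python
# Fragment of a .prm file (assumed layout: plain Python, options grouped in dicts).
spikedetekt = dict(
    # Estimate the noise level, and hence the detection threshold,
    # separately for each channel rather than pooling all channels.
    use_single_threshold=False,
)
```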

Some channels are noisier than others. Some are completely faulty. If a noisy channel is included in spike detection, the current default does two undesirable things: it unnecessarily raises the detection threshold for all other channels, and it causes lots of false positive detections on the noisy channel. In one particular example, I have a channel whose noise level is maybe 2X that of all the others. With use_single_threshold=True, detection was constantly triggering on this channel, greatly slowing down detection and clustering. It was quite extreme. For this 20 min recording I ended up with about 1 million detected spikes on the faulty channel, and only a few tens of thousands on the others. Setting use_single_threshold=False eliminated the problem, without requiring me to exclude the channel from spike detection.
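The effect can be illustrated with a toy simulation. This is a minimal sketch, not klusta's detection code. It assumes pure Gaussian noise with no spikes, a threshold of 4.5 times the noise estimate, and a median-based noise estimate, median(|x|) / 0.6745. The function names and channel layout are made up for the example.

```python
import random
import statistics


def mad_sigma(samples):
    """Robust noise estimate: median(|x|) / 0.6745."""
    return statistics.median(abs(x) for x in samples) / 0.6745


def count_crossings(samples, threshold):
    """Number of samples whose absolute value exceeds the threshold."""
    return sum(1 for x in samples if abs(x) > threshold)


rng = random.Random(0)
n_samples = 20000
noise_sd = [1.0, 1.0, 1.0, 2.0]  # assumed: last channel is 2x noisier
channels = [[rng.gauss(0.0, sd) for _ in range(n_samples)] for sd in noise_sd]

factor = 4.5  # assumed threshold, in units of the noise estimate

# One threshold from the pooled samples of all channels.
pooled = [x for ch in channels for x in ch]
single_thr = factor * mad_sigma(pooled)

# One threshold per channel.
per_thr = [factor * mad_sigma(ch) for ch in channels]

single_counts = [count_crossings(ch, single_thr) for ch in channels]
per_counts = [count_crossings(ch, t) for ch, t in zip(channels, per_thr)]

print("single threshold:", round(single_thr, 2), single_counts)
print("per-channel thresholds:", [round(t, 2) for t in per_thr], per_counts)
```

With the pooled threshold, the noisy channel pulls the threshold up for the clean channels while still producing far more noise crossings than they do. With per-channel thresholds, every channel is held to the same multiple of its own noise level.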

I've heard that there was a similar setting in the Klusters spike sorting pipeline, and that it defaulted to using a different threshold for each channel.

Thoughts?

nippoo commented 8 years ago

The logic behind the current default is as follows: assuming a probe with identical gain per site, the only reason for a different standard deviation would be natural local variation in firing rate. So, by setting per-channel thresholds, you'd be excluding a larger number of (valid) spikes from a site in a densely-firing area, and including a large number of noise spikes from a site in a very sparsely-firing area.

Essentially it boils down to this: in your experimental setup, do you see a lot of per-channel variation in SNR or gain? In the recordings I've seen from our lab this mostly isn't the case (with the exception of obviously dead/high-noise channels), but I've seen other recordings on different types of probes where it is the case (normally the exception, not the rule), and setting use_single_threshold=False fixes the problem. So, I guess you might want to set it as a default in your pipeline's PRM template, but I'm not sure whether it should be a global default?

mspacek commented 8 years ago

Hi Max. Hm, interesting point, except it's not actually standard deviation that's used to measure noise, is it? My understanding is that klusta uses the median (based on Quiroga2004), which should be quite insensitive to spikes, and spike rate, unless the rate is really high.

I wonder if someone's looked at how high the firing rate has to be on a channel before it starts to seriously bias the median-based noise level. Actually, Figure 3 in Quiroga2004 does exactly that. You have to get to a rate of about 40 Hz for the median-based noise level to go up by 10%. It seems the standard-deviation-based noise estimate is about 8 times more sensitive to spike rate.
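The qualitative difference between the two estimators is easy to check with a toy simulation. The sketch below is only an illustration. The spike shape, amplitude, sampling rate and firing rates are arbitrary assumptions, and its output is not expected to reproduce the numbers in Figure 3 of Quiroga2004.

```python
import math
import random
import statistics


def std_sigma(x):
    """Plain standard deviation of the trace."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))


def mad_sigma(x):
    """Median-based estimate: median(|x|) / 0.6745."""
    return statistics.median(abs(v) for v in x) / 0.6745


def simulate_trace(rate_hz, rng, fs=20000, duration_s=2.0, amp=10.0, width=20):
    """Unit-variance Gaussian noise plus triangular spikes at random times."""
    n = int(fs * duration_s)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(int(rate_hz * duration_s)):
        start = rng.randrange(n - width)
        for k in range(width):
            # Negative-going triangle peaking at `amp` noise SDs (assumed shape).
            x[start + k] -= amp * (1.0 - abs(2.0 * k / (width - 1) - 1.0))
    return x


rng = random.Random(0)
results = {}
for rate in (0, 10, 40, 100):
    trace = simulate_trace(rate, rng)
    results[rate] = (std_sigma(trace), mad_sigma(trace))
    print(f"{rate:>3} Hz  std={results[rate][0]:.3f}  "
          f"median-based={results[rate][1]:.3f}")
```

The true noise level is 1 in every run, so any value above 1 is bias introduced by the spikes. Under these assumptions the standard deviation should grow much faster with firing rate than the median-based estimate does.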

kwikteam / klusta
