cortex-lab / KiloSort

GPU code for spike sorting

Best practices for concatenating? #34


micheleacox commented 7 years ago

In our lab (as in many others), a recording session consists of data split across a series of files. In our case, each file corresponds to a different task/stimulus sequence. Often, we want to look at the response of a given unit across those tasks. It seems like the best way to do this with KiloSort is to concatenate the files before sorting.

Is this indeed what people are doing? Are there any best practices? For example:

- How should artefacts at the file boundaries be handled?
- Does the order of concatenation matter?
- Should any of the data be excluded?
- What is the recommended way to write out the concatenated file?

marius10p commented 7 years ago

We usually record the entire time. Even if you don't use the data in between blocks, it will still be useful information for sorting. Definitely concatenate before sorting. Matching clusters post-hoc from different runs is much more tedious.

You could do some filtering to avoid the concatenation artefacts, but unless you do that on the GPU it could be slow. That said, unless you have thousands of files, I don't think the artefacts will matter at all. An alternative would be to taper the ends of each file.
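A minimal sketch of that tapering idea in MATLAB (the 5 ms taper length, sampling rate, channel count, file name, and int16 channels-by-samples layout are all assumptions, not from the thread):

```matlab
% Cosine taper on the first/last ~5 ms of one block, to suppress the
% step artefact at each concatenation seam.
nChan  = 32;                                  % placeholder channel count
fs     = 30000;                               % sampling rate (assumption)
nTaper = round(0.005 * fs);                   % 5 ms taper (assumption)
ramp   = 0.5 * (1 - cos(pi * (0:nTaper-1) / nTaper));   % rises 0 -> 1

fid  = fopen('block1.bin', 'r');              % placeholder file name
data = fread(fid, [nChan, Inf], 'int16');     % read as double for the multiply
fclose(fid);

data(:, 1:nTaper)         = data(:, 1:nTaper)         .* ramp;         % fade in
data(:, end-nTaper+1:end) = data(:, end-nTaper+1:end) .* fliplr(ramp); % fade out
data = int16(data);                           % back to int16 before writing
```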

The order of concatenation would not matter for sorting, but it would make for nicer visualization in Phy (drifts and such) if it was in the recording order.

I would not exclude any data. Sorting is always better the more data you have recorded, because many neurons fire at low rates.

Make sure you write the concatenated file into a binary file with fwrite. As to how you read the original data, that depends on your format.
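For concreteness, a minimal sketch of such a concatenation loop (file names, channel count, and int16 format are placeholders; adapt to your own recording format):

```matlab
% Append each block's raw binary onto one output file, in recording order.
files  = {'block1.bin', 'block2.bin', 'block3.bin'};   % placeholder names
nChan  = 32;                                           % placeholder channel count
fidOut = fopen('concatenated.bin', 'w');
for k = 1:numel(files)
    fidIn = fopen(files{k}, 'r');
    while ~feof(fidIn)
        % read in bounded chunks so large files don't exhaust memory
        chunk = fread(fidIn, nChan * 65536, '*int16');
        fwrite(fidOut, chunk, 'int16');
    end
    fclose(fidIn);
end
fclose(fidOut);
```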

Good luck!

HugoMalagon commented 5 years ago

We have also been trying to use KiloSort with split data. However, what we are encountering is that the more pieces of data we concatenate, the more "noise" templates are detected and no spikes are found. When we run KiloSort on each piece individually, it works perfectly. In fact, if we concatenate 2 pieces, it still works. However, the more pieces we concatenate, the fewer spikes are detected, until it is just noise. During the preprocessing step, KiloSort divides the data into chunks. We have tried increasing the chunk size, but the results were no different. Any suggestions?

Thank you for your time!
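For reference, the "chunks" mentioned above are KiloSort's processing batches, controlled by `ops.NT` in the config file. The values below are typical of a KiloSort 1 config and may differ in your version, so check your own config rather than copying these:

```matlab
% Typical batch-size settings from a KiloSort config file (assumed defaults;
% verify against the config shipped with your KiloSort version).
ops.ntbuff = 64;                    % samples of symmetrical buffer between batches
ops.NT     = 32*1024 + ops.ntbuff;  % batch (chunk) size in samples
```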