AllenInstitute / ecephys_spike_sorting

Modules for processing extracellular electrophysiology data from Neuropixels probes

Concatenating multiple runs #70

Open harshk95 opened 2 years ago

harshk95 commented 2 years ago

Hi, we had an issue concatenating recordings from different triggers, acquired with SpikeGLX from an NP1.0 probe. We followed the inline comments in 'sglx_multi_run_pipeline.py', using the fork for SpikeGLX data, and had 5 different triggers to concatenate. However, we do not seem to get a concatenated file from all the runs: the duration is much shorter than expected, and we get the following log from CatGT.

[Thd 15236 CPU 15 4/04/22 14:24:57.529] Cmdline: CatGT -dir=W:/nobackup/group/user/AH/043_01 -run=concat220327_220401 -g=0 -t=0,5 -prb_fld -prb=0 -ap -lf -ni -apfilter=butter,12,300,10000 -lffilter=butter,12,1,500 -gblcar -SY=0,-1,6,500 -XA=0,0.8,0.2,0 -XA=7,1,0.2,0 -XD=-1,2,500 -XD=-1,4,5 -XD=-1,5,0 -XD=-1,7,5 -BF=0,5,1,5 -dest=W:/nobackup/garber/kanohars/HA/043_01 -out_prb_fld
[Thd 15236 CPU 15 4/04/22 14:25:05.387] Skipping tiny content (olap: 204583110, rem: -7417422, bps: 6) file 'concat220327_220401_g0_t1.nidq.bin'.
[Thd 15236 CPU 15 4/04/22 14:25:05.467] Skipping tiny content (olap: 205072890, rem: -23291088, bps: 6) file 'concat220327_220401_g0_t2.nidq.bin'.
[Thd 15236 CPU 15 4/04/22 14:25:05.547] Skipping tiny content (olap: 205012470, rem: -7134180, bps: 6) file 'concat220327_220401_g0_t3.nidq.bin'.
[Thd 15236 CPU 15 4/04/22 14:25:07.937] Skipping tiny content (olap: 216751854, rem: -67511592, bps: 6) file 'concat220327_220401_g0_t5.nidq.bin'.
[Thd 15236 CPU 15 4/04/22 15:00:58.431] Skipping tiny content (olap: 74353845720, rem: -2695779240, bps: 770) file 'concat220327_220401_g0_t1.imec0.ap.bin'.
[Thd 15236 CPU 15 4/04/22 15:00:58.578] Skipping tiny content (olap: 74531854320, rem: -8464914150, bps: 770) file 'concat220327_220401_g0_t2.imec0.ap.bin'.
[Thd 15236 CPU 15 4/04/22 15:00:58.669] Skipping tiny content (olap: 74509896230, rem: -2592825620, bps: 770) file 'concat220327_220401_g0_t3.imec0.ap.bin'.
[Thd 15236 CPU 15 4/04/22 15:03:28.883] Skipping tiny content (olap: 78776493950, rem: -24536489670, bps: 770) file 'concat220327_220401_g0_t5.imec0.ap.bin'.
[Thd 15236 CPU 15 4/04/22 15:06:30.597] Skipping tiny content (olap: 6196154580, rem: -224648270, bps: 770) file 'concat220327_220401_g0_t1.imec0.lf.bin'.
[Thd 15236 CPU 15 4/04/22 15:06:30.688] Skipping tiny content (olap: 6210988630, rem: -705409320, bps: 770) file 'concat220327_220401_g0_t2.imec0.lf.bin'.
[Thd 15236 CPU 15 4/04/22 15:06:30.757] Skipping tiny content (olap: 6209159110, rem: -216068930, bps: 770) file 'concat220327_220401_g0_t3.imec0.lf.bin'.
[Thd 15236 CPU 15 4/04/22 15:06:40.864] Skipping tiny content (olap: 6564708150, rem: -2044707280, bps: 770) file 'concat220327_220401_g0_t5.imec0.lf.bin'.

This is what the folder with the run looks like: [image]

We noticed that the CatGT documentation mentions supercat and were wondering whether that is the command being run here. Thanks!

jsiegle commented 2 years ago

@jenniferColonell any advice on this?

jenniferColonell commented 2 years ago

Hi @harshk95, sorry it took me so long to get around to answering this question! These errors show that the start times in the metadata files are not consistent with consecutive trials: the negative values for 'rem' indicate that the calculated end time of the concatenated file comes AFTER the end of the file it is trying to add. I'm guessing from the errors that these were actually independent recordings (that is, not collected as trials) that you need to concatenate. Indeed, supercat, which just concatenates recordings end to end, is the correct CatGT command.
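To check a CatGT log for this pattern quickly, the "Skipping tiny content" lines can be parsed and the files with negative 'rem' listed. The following is a minimal sketch, not part of CatGT or the pipeline: the function and class names are made up for illustration, the regular expression is based only on the log lines quoted in this thread, and reading 'bps' as bytes per sample (so that rem / bps is a sample count) is an assumption.

```python
import re
from typing import List, NamedTuple

# Pattern derived from the log lines quoted in this issue (assumed stable format).
SKIP_RE = re.compile(
    r"Skipping tiny content \(olap: (-?\d+), rem: (-?\d+), bps: (\d+)\) "
    r"file '([^']+)'"
)


class SkippedFile(NamedTuple):
    """One 'Skipping tiny content' entry from a CatGT log."""
    file: str
    olap: int
    rem: int
    bps: int

    @property
    def rem_samples(self) -> float:
        # Assumption: bps is bytes per sample, so this converts bytes to samples.
        return self.rem / self.bps


def find_skipped(log_text: str) -> List[SkippedFile]:
    """Return every skipped file reported in the log text, in log order."""
    return [
        SkippedFile(m.group(4), int(m.group(1)), int(m.group(2)), int(m.group(3)))
        for m in SKIP_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[Thd 15236 CPU 15 4/04/22 14:25:05.387] Skipping tiny content "
        "(olap: 204583110, rem: -7417422, bps: 6) file "
        "'concat220327_220401_g0_t1.nidq.bin'."
    )
    for entry in find_skipped(sample):
        if entry.rem < 0:
            print(f"{entry.file}: rem is negative ({entry.rem_samples:.0f} samples)")
```

If every file after t0 shows up with a negative 'rem', the inputs are most likely independent runs rather than consecutive triggers.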

PathwayinGithub commented 9 months ago

@jenniferColonell Hi! I use your modified version for SpikeGLX and want to concatenate bin files. But the bin files are not from different trials separated by triggers. During recording, SpikeGLX would sometimes crash because of a disk writing problem, so we started recording again (independent recordings). I renamed these bin files to t0~n and set t to 0,n to run the pipeline. But CatGT just created a bin file the same size as the last recording (the xxx_g0_tn.imec0.bin file). Why? Can I use the pipeline to concatenate them?

jenniferColonell commented 9 months ago

Hi @PathwayinGithub, the specific problem you are seeing probably has to do with paths. However, for correct concatenation across multiple runs, you'll need to use the supercat feature in CatGT; for multiple streams, make sure you include the -supercat_trim_edges option. I haven't implemented this in the pipeline because it's a less common case, but I can help you with writing the appropriate .bat files if that's useful. The basic procedure (see the CatGT Readme for details) is:

(1) Run CatGT on your individual runs, to do filtering, artifact removal, and edge finding. You can write a .bat script in Windows to process all your runs.
(2) Run CatGT with the -supercat feature and -supercat_trim_edges option to concatenate the runs for all your data streams (e.g. imec probes and NI).
(3) Run the pipeline using a script based on sglx_filelist_pipeline.py, which skips CatGT and runs sorting + the other modules.
(4) Run TPrime with a batch script. There are good instructions in the TPrime readme, but I'm happy to help with that also.
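As a rough illustration of step (2), the command line could be assembled in Python and pasted into a .bat file. This is a sketch under assumptions, not the pipeline's method: the helper name, the example paths and run names are hypothetical, and the `{dir,run_ga}` element syntax for -supercat, as well as the choice of extra flags, should be checked against the CatGT Readme before use. The function only builds the string; it does not run CatGT.

```python
from typing import Sequence, Tuple


def build_supercat_cmd(
    runs: Sequence[Tuple[str, str]],
    dest: str,
    extra_flags: Sequence[str] = ("-ap", "-ni", "-prb=0"),
    catgt_exe: str = "CatGT",
) -> str:
    """Build a CatGT supercat command line (hypothetical helper).

    runs: (directory, run_ga) pairs in the order they should be joined,
          each pointing at the output of a first-pass CatGT run (step 1).
    dest: output directory for the concatenated files.
    """
    if len(runs) < 2:
        raise ValueError("supercat needs at least two runs to concatenate")

    # Assumed element syntax: one {dir,run_ga} group per run, back to back.
    elements = "".join(f"{{{run_dir},{run_ga}}}" for run_dir, run_ga in runs)

    parts = [
        catgt_exe,
        f"-supercat={elements}",
        "-supercat_trim_edges",  # recommended above for multiple streams
        *extra_flags,
        f"-dest={dest}",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical first-pass output locations and run names.
    cmd = build_supercat_cmd(
        runs=[
            ("D:/catgt_out", "catgt_runA_g0"),
            ("D:/catgt_out", "catgt_runB_g0"),
        ],
        dest="D:/supercat_out",
    )
    print(cmd)
```

Keeping the run list as ordered pairs makes the concatenation order explicit, which matters because supercat joins the recordings end to end.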

By the way, what kinds of disk writing problems are you having? Are your disks filling up? If you are running multiple probes, you can direct the data streams to different disks to avoid that (this is a feature in SpikeGLX).