Splitting and Merging .dat Files

apearson98 commented 2 years ago

Our group would like to cut a portion of time out of a continuous.dat file and then remerge the beginning and end of the file. We will then use the remerged file for spike sorting. I saw a recent post about merging two files. Could anyone suggest ways of splitting a file by timestamp? Thank you in advance!

jsiegle commented 2 years ago

Hi! I'd recommend using the numpy.ndarray.tofile method for this.

Here's an example of how to load the continuous.dat file using memory mapping:

import numpy as np
import os

input_directory = '/path/to/recording'
stream_name = "example_data"

input_file = os.path.join(input_directory, 'continuous', stream_name, 'continuous.dat')

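num_channels = 16  # must match the number of channels in the recording

# Map the file as a flat int16 array without loading it into memory,
# then view it as (samples, channels)
data_flat = np.memmap(input_file, mode='r', dtype='int16')
data = np.reshape(data_flat, (data_flat.size // num_channels, num_channels))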
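If it helps to try the loading step without a real recording, here is a minimal sketch that runs on a small synthetic file. The temporary directory, the synthetic array, and the divisibility check are my own additions for illustration, not part of the recording format or the analysis tools.

```python
import os
import tempfile

import numpy as np

num_channels = 16    # must match the recording
num_samples = 1000   # synthetic length, for illustration only

# Write a small synthetic int16 file with one row per sample
tmp_dir = tempfile.mkdtemp()
demo_file = os.path.join(tmp_dir, 'continuous.dat')
synthetic = np.arange(num_samples * num_channels, dtype='int16')
synthetic = synthetic.reshape(num_samples, num_channels)
synthetic.tofile(demo_file)

data_flat = np.memmap(demo_file, mode='r', dtype='int16')

# A remainder here means num_channels cannot be right for this file
if data_flat.size % num_channels != 0:
    raise ValueError('file size is not a multiple of num_channels')

data = np.reshape(data_flat, (data_flat.size // num_channels, num_channels))
print(data.shape)  # (samples, channels)
```

Passing the check does not prove the channel count is correct, but failing it is a quick sign that the reshape would misalign the rows.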

And here's how to write blocks from the beginning and end to a new file:

block1size = 10000  # block size in samples
block2size = 5000   # block size in samples

output_file = '/path/to/output/continuous.dat'

# The with block closes the file even if a write fails
with open(output_file, "wb") as f:
    data[:block1size,:].tofile(f)
    data[-block2size:,:].tofile(f)
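Since the question was about splitting by timestamp rather than by sample count, here is a rough sketch of one way to turn a time window into sample indices with np.searchsorted. It assumes the timestamps are sorted in ascending order and use the same units as the cut times. The values cut_start and cut_end, the sample rate, and the synthetic data are all made up for illustration.

```python
import os
import tempfile

import numpy as np

num_channels = 4
sample_rate = 1000.0   # Hz, assumed for the synthetic data only
num_samples = 10000    # 10 s of synthetic data

tmp_dir = tempfile.mkdtemp()
input_file = os.path.join(tmp_dir, 'continuous.dat')
output_file = os.path.join(tmp_dir, 'continuous_cut.dat')

# Synthetic recording and matching timestamps (stand-ins for the real files)
synthetic = (np.arange(num_samples * num_channels) % 1000).astype('int16')
synthetic.reshape(num_samples, num_channels).tofile(input_file)
timestamps = np.arange(num_samples) / sample_rate

cut_start = 2.0  # hypothetical start of the segment to remove
cut_end = 5.0    # hypothetical end of the segment to remove

# First sample at or after each cut time; samples in [i0, i1) are dropped
i0 = np.searchsorted(timestamps, cut_start, side='left')
i1 = np.searchsorted(timestamps, cut_end, side='left')

data = np.memmap(input_file, mode='r', dtype='int16').reshape(-1, num_channels)

# Write everything before the cut, then everything after it
with open(output_file, 'wb') as f:
    data[:i0, :].tofile(f)
    data[i1:, :].tofile(f)

# Apply the same indices to the timestamps so both stay aligned
new_ts = np.concatenate((timestamps[:i0], timestamps[i1:]))
```

Using the same i0 and i1 for the data and the timestamps keeps one timestamp per remaining sample.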

apearson98 commented 2 years ago

Thank you so much for your help! In your experience with this, does the function generate new timestamps for the second block of data, or does it maintain the originals? Thanks again!

jsiegle commented 2 years ago

This will only affect the .dat file, not the timestamps.

To get the new timestamps, you can use the following code:

# Keep the original timestamps for the same samples written to the .dat file
ts = np.load('timestamps.npy')
new_ts = np.concatenate((ts[:block1size], ts[-block2size:]))
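As a possible follow-up, the new timestamps can be saved with np.save and checked against the size of the remerged .dat file. The sketch below uses synthetic stand-ins for both files; the 30 kHz rate, the file locations, and the seam calculation are assumptions for illustration only.

```python
import os
import tempfile

import numpy as np

num_channels = 16
block1size = 10000  # block size in samples
block2size = 5000   # block size in samples

tmp_dir = tempfile.mkdtemp()

# Synthetic stand-in for np.load('timestamps.npy')
ts = np.arange(20000) / 30000.0
new_ts = np.concatenate((ts[:block1size], ts[-block2size:]))

ts_file = os.path.join(tmp_dir, 'timestamps.npy')
np.save(ts_file, new_ts)

# Synthetic stand-in for the remerged continuous.dat
dat_file = os.path.join(tmp_dir, 'continuous.dat')
np.zeros((len(new_ts), num_channels), dtype='int16').tofile(dat_file)

# The .dat file should hold exactly one int16 row per timestamp
expected_bytes = len(new_ts) * num_channels * np.dtype('int16').itemsize
if os.path.getsize(dat_file) != expected_bytes:
    raise ValueError('timestamps and .dat file are out of sync')

# The original values are kept, so there is a jump where the blocks meet
seam = int(np.argmax(np.diff(new_ts))) + 1  # first sample of the second block
```

Because the timestamps are not regenerated, anything downstream that assumes evenly spaced samples may need to account for the jump at the seam.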

open-ephys / analysis-tools

Splitting and Merging .dat Files #100