holgern / pyedflib

pyedflib is a python library to read/write EDF+/BDF+ files based on EDFlib.
http://pyedflib.readthedocs.org/
BSD 3-Clause "New" or "Revised" License

How do you change just one signal to an existing file? #190

Closed DaveMtl closed 1 year ago

DaveMtl commented 1 year ago

Right now I'm:

Is there a way to save just 1 signal to an existing file without having to load it in memory first?

skjerns commented 1 year ago

I'm afraid this is not possible without loading the entire file.

EDF files are stored block-wise: each data record holds a short stretch of every signal, one after another. To insert a signal you would have to insert data at specific points throughout the file and effectively shift all following content to the right, i.e. rewrite the file.
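To make the block-wise layout concrete, here is a minimal sketch based on the EDF specification (2-byte samples, signals stored back to back inside each data record). The helper names `record_layout` and `record_size` are made up for illustration and are not part of pyedflib.

```python
def record_layout(samples_per_record, bytes_per_sample=2):
    """Return (start, end) byte offsets of each signal inside one data record.

    Assumes plain EDF, where every sample is a 2-byte integer
    (BDF would use bytes_per_sample=3).
    """
    layout = []
    pos = 0
    for n_smp in samples_per_record:
        size = n_smp * bytes_per_sample
        layout.append((pos, pos + size))
        pos += size
    return layout


def record_size(samples_per_record, bytes_per_sample=2):
    """Total size in bytes of one data record."""
    return sum(samples_per_record) * bytes_per_sample


# Two signals with 256 and 100 samples per data record
before = [256, 100]
# The same recording with a third signal (50 samples per record) added
after = before + [50]

# Every record grows, so record k starts at a different position
# (offsets are relative to the end of the header).
for k in range(3):
    print(k, k * record_size(before), k * record_size(after))
```

Since each record becomes larger, everything after the first record has to move, which is why adding a signal amounts to rewriting the file.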

However, if you have memory problems you might try opening the new file and loading and saving each channel individually. That way you only have one channel in memory at a time. I haven't tested it, though. Another tip: use digital=True for the channels that you do not want to alter. That way the signal stays exactly the same and you don't get any rounding errors.

DaveMtl commented 1 year ago

> The way edf is saved is block wise. For inserting a signal you would need to insert data into specific time points of the file, and effectively have to shift all content to the right, i.e. rewrite the file.

Couldn't you "just" move the write cursor to the correct position and overwrite the previous signal with the new one, without shifting the content? Theoretically speaking, I mean. I understand the underlying library might not be able to do that.

> However, if you have memory problems you might try opening the new file and loading and saving each channel individually.

I understand I can use highlevel.read_edf with the ch_names parameter to load a channel individually, but how would I save it individually after?

> Another tip: use digital=True for the channels that you do not want to alter. This way the signal stays exactly the same and you don't have any rounding errors.

How would you save the file using a mix of digital and non-digital signals?

skjerns commented 1 year ago

> Couldn't you "just" move the write cursor to the correct position and overwrite the previous signal with the new one, without shifting the content? Theoretically speaking, I mean. I understand the underlying library might not be able to do that.

Perhaps, but it's not implemented in pyedflib.
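For what it's worth, the "move the write cursor" idea can be sketched with plain file I/O, bypassing pyedflib entirely. This is only an illustration under strong assumptions: a plain EDF file (16-bit little-endian samples, not BDF), a replacement signal with exactly the same number of samples, and values that are already in digital units. The function name `overwrite_signal_inplace` is hypothetical, the header field positions are taken from the EDF specification, and it has not been tried on real recordings, so work on a copy.

```python
import struct


def overwrite_signal_inplace(path, ch, new_digital):
    """Overwrite channel `ch` of a plain EDF file with new 16-bit digital samples.

    Hypothetical helper, not part of pyedflib.
    """
    with open(path, "r+b") as f:
        main = f.read(256)                        # fixed-size main header
        n_records = int(main[236:244].decode("ascii"))
        n_signals = int(main[252:256].decode("ascii"))

        # "nr of samples in each data record": 8 ASCII chars per signal,
        # stored after the other per-signal fields (216 bytes per signal).
        f.seek(256 + n_signals * 216)
        smp = [int(f.read(8).decode("ascii")) for _ in range(n_signals)]

        if len(new_digital) != n_records * smp[ch]:
            raise ValueError("replacement must have the same number of samples")

        header_size = 256 + n_signals * 256
        record_size = 2 * sum(smp)        # 2 bytes per sample
        ch_offset = 2 * sum(smp[:ch])     # where `ch` starts inside a record

        # Jump to this channel's slot in every data record and overwrite it.
        for r in range(n_records):
            chunk = new_digital[r * smp[ch]:(r + 1) * smp[ch]]
            f.seek(header_size + r * record_size + ch_offset)
            f.write(struct.pack("<%dh" % len(chunk), *chunk))
```

Nothing is shifted because the file size and record layout stay the same. Only the bytes belonging to one channel are touched.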

> I understand I can use highlevel.read_edf with the ch_names parameter to load a channel individually, but how would I save it individually after? How would you save the file using a mix of digital and non-digital signals?

Sorry, I remembered it wrong: saving channels individually is not possible, even though theoretically it could be implemented. However, loading and saving blocks of data is possible. You cannot mix and match digital and physical samples in one write call, but you can read everything as digital and then convert back and forth to physical for the channel you want to alter. In practice it should not make a huge difference, though.

import pyedflib

file_in = './original.edf'
file_out = './out.edf'

f1 = pyedflib.EdfReader(file_in)
chs = f1.getSignalLabels()
signal_headers = f1.getSignalHeaders()
header = f1.getHeader()

f2 = pyedflib.EdfWriter(file_out, len(signal_headers))
f2.setHeader(header)  # also carry over the patient/recording info read above
f2.setSignalHeaders(signal_headers)
f2.setDatarecordDuration(f1.datarecord_duration)

last_block = False

# number of samples already read, per channel
smp_read = [0 for _ in signal_headers]
n_samples = f1.getNSamples()
# load data block-wise, one data record at a time
while not last_block:
    data_list = []
    for i, shead in enumerate(signal_headers):
        n_smp = f2.get_smp_per_record(i)
        data_list += [f1.readSignal(i, start=smp_read[i], n=n_smp)]
        smp_read[i] += n_smp
        # stop once every sample of this channel has been read
        if smp_read[i] >= n_samples[i]:
            last_block = True
    # insert manipulation of given signal here:
    # .....
    data_list[3] *= -1  # e.g. invert the signal of channel index 3
    f2.writeSamples(data_list)

f1.close()
f2.close()

Theoretically this should load only one block of data at a time and save a lot of memory, although it will be a bit slower. I haven't measured the memory requirements; it could be that the data persists in the loaded C code, in which case there is no memory saving. Additionally, you are limited to manipulations that are temporally independent, i.e. that work on a sample-by-sample basis.
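On the digital/physical round trip mentioned earlier: EDF stores integers and maps them linearly onto the physical range given in each signal header. Below is a minimal sketch of that mapping in plain Python. The function names are made up and the limits in the example are invented; in pyedflib the four limits should be available in the dicts returned by getSignalHeaders() under physical_min, physical_max, digital_min and digital_max.

```python
def digital_to_physical(d, physical_min, physical_max, digital_min, digital_max):
    """Map a stored integer sample onto its physical value (e.g. uV)."""
    gain = (physical_max - physical_min) / (digital_max - digital_min)
    return physical_min + gain * (d - digital_min)


def physical_to_digital(p, physical_min, physical_max, digital_min, digital_max):
    """Inverse mapping, rounded to the nearest integer and clipped to the digital range."""
    gain = (physical_max - physical_min) / (digital_max - digital_min)
    d = digital_min + round((p - physical_min) / gain)
    return max(digital_min, min(digital_max, d))


# Invented header values for one channel
limits = dict(physical_min=-100.0, physical_max=100.0,
              digital_min=-1000, digital_max=1000)

block = [250, -40, 0]  # digital samples of one block
phys = [digital_to_physical(d, **limits) for d in block]
# Manipulate in physical units (here: invert), then go back to digital
inverted = [physical_to_digital(-p, **limits) for p in phys]
print(inverted)
```

In the block loop this would mean reading with readSignal(..., digital=True), converting only the channel you alter, and writing with writeSamples(data_list, digital=True), so that all untouched channels are copied bit for bit. I have not checked which integer dtype the writer expects in that mode, so treat that part as an assumption.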

DaveMtl commented 1 year ago

Yes, this is good, I think I can work with that, thx!

skjerns commented 1 year ago

PS: This code only works in the unreleased main branch version. Install it via pip install git+https://github.com/holgern/pyedflib