holgern / pyedflib

pyedflib is a python library to read/write EDF+/BDF+ files based on EDFlib.
http://pyedflib.readthedocs.org/
BSD 3-Clause "New" or "Revised" License

BDF Scaling Factor #257

Closed GABowers closed 1 week ago

GABowers commented 1 week ago

Consider the following script:

import numpy as np
import pyedflib

if __name__ == "__main__":
    path_bdf = 'F:/temp/example2.bdf'

    f = pyedflib.EdfReader(path_bdf)
    n = f.signals_in_file
    signal_labels = f.getSignalLabels()
    # One row per channel; assumes all channels have the same number of samples
    ints = np.zeros((n, f.getNSamples()[0]), dtype=np.int32)
    floats = np.zeros((n, f.getNSamples()[0]), dtype=np.float64)
    for i in range(n):
        ints[i, :] = f.readSignal(i, digital=True)  # raw digital values
        floats[i, :] = f.readSignal(i)              # scaled physical values
    f.close()
    print('{} vs. {}'.format(ints[0,0], floats[0,0]))

    # Ratio of digital to physical value for the first sample
    print('factor: {} nV/bit'.format(ints[0,0] / floats[0,0]))

example2.zip

I pull in a BDF using pyedflib, then compare digital and floating point signals to ascertain the gain/scaling factor.

The output on my machine is:

-8303035 vs. -259469.84869115643
factor: 31.99999939061511 nV/bit

However, looking here one sees that the Biosemi conversion factor is 31.25 nV/bit.

Is the 32 nV/bit scaling factor used here derived from some other source?

skjerns commented 1 week ago

The conversion factor is stored in the EDF file itself, not in edflib or pyedflib, and is calculated from the header's digital_min/max and physical_min/max. So something must have gone wrong at the point of writing the signal.
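To check which header values a given file actually carries, the four range fields can be read back per channel. This is a minimal sketch that assumes pyedflib's `EdfReader` getters `getDigitalMinimum`, `getDigitalMaximum`, `getPhysicalMinimum` and `getPhysicalMaximum`; `header_gain` is a hypothetical helper name, not part of pyedflib.

```python
def header_gain(reader, chn):
    """Return (gain, dmin, dmax, pmin, pmax) for channel `chn`.

    `reader` is expected to be an open pyedflib.EdfReader, or any object
    exposing the same four getters.
    """
    dmin = reader.getDigitalMinimum(chn)
    dmax = reader.getDigitalMaximum(chn)
    pmin = reader.getPhysicalMinimum(chn)
    pmax = reader.getPhysicalMaximum(chn)
    gain = (pmax - pmin) / (dmax - dmin)  # physical units per bit
    return gain, dmin, dmax, pmin, pmax


# Usage (requires pyedflib and the file from this thread):
#   import pyedflib
#   f = pyedflib.EdfReader('F:/temp/example2.bdf')
#   print(header_gain(f, 0))
#   f.close()
```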

Physical values / gain is calculated by the formula

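dmin, dmax = -8388608, 8388607       # digital range from the file header
pmin, pmax = -262144.0, 262143.0     # physical range from the file header
digital_signal = -8303035
m = (pmax - pmin) / (dmax - dmin)    # gain: physical units per bit
b = pmax / m - dmax                  # offset, in digital units
physical = m * (digital_signal + b)  # -259469.84869115643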
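As a worked check of the formula with the header values quoted in this thread, the sketch below also prints the gain `m`, its reciprocal, and the single-sample ratio computed by the script in the original post. The conversion to nV/bit assumes the physical values are in µV, which the thread does not state explicitly.

```python
# Header values quoted in this thread for the example file.
dmin, dmax = -8388608, 8388607
pmin, pmax = -262144.0, 262143.0
digital = -8303035  # first sample, as read with digital=True

m = (pmax - pmin) / (dmax - dmin)  # gain in physical units per bit
b = pmax / m - dmax                # offset in digital units
physical = m * (digital + b)

# Assumption: physical values are in µV, so m * 1000 is in nV/bit.
gain_nv_per_bit = m * 1000
bits_per_unit = 1 / m              # reciprocal of the gain
sample_ratio = digital / physical  # the quantity printed by the original script

print(f"physical         = {physical}")
print(f"gain             = {gain_nv_per_bit:.5f} nV/bit")
print(f"1 / gain         = {bits_per_unit:.5f} bits per physical unit")
print(f"digital/physical = {sample_ratio:.8f}")
```

With these header values the gain comes out very close to 31.25 nV/bit, while the value near 32 is its reciprocal, in bits per physical unit. The single-sample ratio differs slightly from `1 / m` because of the offset `b`.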

PS: The file is not correctly formatted, as indicated by EDFBrowser, but that doesn't seem to be the issue.

GABowers commented 1 week ago

Regarding the formatting, I recorded that file myself on an ActiveTwo system with ActiView yesterday so I can't imagine what could cause that.

But for my main issue - I think part of my confusion came from the fact that that original Biosemi link mentions 262144 and +2^23 (8388608) as the physical and digital max, respectively--but as you say this is not reflected in the file header.

I see the same thing here. Seeing the EDF header alongside, however, leaves me unsure about 262143 as the physical maximum: if the gain is truly 31.25 nV/bit (0.03125 µV/bit), shouldn't the physical max be 262144 - 0.03125? I wonder if whoever wrote that page simply did "whole number -1" because that's how the digital values work...however all of that of course doesn't have anything to do with pyedflib, so I may need to track it down elsewhere.
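The arithmetic behind that question can be checked directly. This sketch compares the gain implied by the physical maximum found in the file (262143) with the gain implied by the suggested maximum (262144 - 0.03125); `gain` is a hypothetical helper, and the digital range is the one quoted earlier in the thread.

```python
dmin, dmax = -8388608, 8388607  # digital range quoted in this thread


def gain(pmin, pmax):
    """Physical units per bit implied by the header ranges."""
    return (pmax - pmin) / (dmax - dmin)


header_gain_value = gain(-262144.0, 262143.0)            # values found in the file
suggested_gain = gain(-262144.0, 262144.0 - 0.03125)     # maximum suggested above

print(f"gain from header values:    {header_gain_value!r}")
print(f"gain with suggested maximum: {suggested_gain!r}")
```

The suggested maximum gives 0.03125 per bit, while the header value of 262143 gives a gain slightly below that.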

Forgive my ignorance, but it's not clear to me what b = pmax / m - dmax is doing in your formula - can you explain the purpose of that part and why the final line isn't simply m * digital_signal?

skjerns commented 1 week ago

b is the offset for when dmin != -dmax. Without it, digital = 0 would always translate to physical = 0, but that is not always the case; for example, there are amplifiers that output 0-1024.
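The effect of the offset can be seen with a small sketch of the same formula. The function name `digital_to_physical` and the physical range of -512 to 512 for the 0-1024 amplifier are hypothetical, chosen only for illustration.

```python
def digital_to_physical(digital, dmin, dmax, pmin, pmax):
    """Map a digital value to a physical value using the formula from this thread."""
    m = (pmax - pmin) / (dmax - dmin)  # gain: physical units per bit
    b = pmax / m - dmax                # offset, in digital units
    return m * (digital + b)


# Hypothetical amplifier: digital 0..1024 mapped onto -512..512 physical units.
# Here m = 1.0 and b = -512, so digital 0 does not map to physical 0.
print(digital_to_physical(0, 0, 1024, -512.0, 512.0))     # -512.0
print(digital_to_physical(512, 0, 1024, -512.0, 512.0))   # 0.0
print(digital_to_physical(1024, 0, 1024, -512.0, 512.0))  # 512.0
```

Using `m * digital` alone would map digital 0 to physical 0 in this example, which is the wrong end of the range.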