`edf_to_onda_samples` didn't return correct sample data as expected

likanzhan commented 2 years ago

In the Dropbox folder, I uploaded sample data recorded in our study: a data file, data.bdf, and a trigger file, evt.bdf. Normally the data are processed in MATLAB with readbdfdata.m.

For now, we can use EDF.read() to read the data, and the values returned by EDF.decode() are similar to the results obtained with MATLAB (appendix 1):

edf = EDF.read("data.bdf")
dt = [transpose(EDF.decode(signal)) for signal in edf.signals]
data_edf = reduce(vcat, dt)

But the output of the edf_to_onda_samples() function is totally different from the expected results (appendix 2):

edf_samples, nt = edf_to_onda_samples(edf)
data_onda = edf_samples[1].data

What is the problem here?

Appendixes:

data_edf:

64×2034000 Matrix{Float32}:
  -4040.22        -4041.19         -4043.19       …    -6332.75         -6334.94        -6333.69
 -14271.8        -14270.7         -14272.1            -16735.0         -16743.6        -16743.9
 -10168.2        -10167.1         -10168.6            -11401.8         -11400.3        -11397.1
 -20942.8        -20941.9         -20943.9            -21258.4         -21259.2        -21257.4
 -14660.4        -14661.4         -14663.3            -14604.1         -14602.7        -14600.2
 -18451.6        -18448.0         -18448.7        …   -21246.9         -21248.2        -21245.2
   -709.656        -709.062         -709.594           -1886.06         -1882.62        -1879.66
  10168.4         10168.6          10168.4             10640.8          10641.2         10642.9
  -5651.0         -5649.78         -5649.19            -5908.84         -5907.38        -5905.12
  -5041.62        -5041.81         -5042.62            -5764.62         -5763.41        -5762.16
  -3191.5         -3190.28         -3189.12       …    -6197.94         -6197.0         -6193.91
  -5846.34        -5846.44         -5846.25            -4642.69         -4638.91        -4638.0
 -20708.8        -20708.8         -20708.5            -20613.6         -20611.0        -20604.5
  -6359.38        -6358.88         -6358.78           -10465.7         -10462.1        -10462.3
 -15747.6        -15746.7         -15746.2            -15928.1         -15926.6        -15918.5
   9301.91         9302.22          9301.62       …     7375.22          7381.66         7383.62
  -2409.69        -2409.84         -2409.75            -2123.25         -2122.28        -2121.12
  -1263.78        -1263.16         -1262.53            -2642.22         -2640.94        -2639.22
    448.469         448.188          448.031            1011.78          1012.97         1013.59
   -382.875        -381.75          -380.844            -810.281         -809.062        -806.469
 -11848.4        -11848.4         -11849.0        …   -11256.1         -11254.4        -11253.9
 -16851.3        -16850.3         -16849.5            -14972.6         -14970.4        -14964.7
  -3360.59        -3361.09         -3360.91            -7151.59         -7148.28        -7150.34
  -4518.25        -4517.22         -4514.62            -5964.06         -5965.94        -5960.03
      ⋮                                           ⋱                                   
   1549.56         1550.81          1551.97             1200.38          1201.44         1201.28
   1381.84         1381.34          1381.66             1561.5           1561.22         1561.12
   4015.81         4015.62          4016.97             3678.59          3677.62         3677.97
 -12764.7        -12763.5         -12763.3             -9742.38         -9742.94        -9743.91
   2191.94         2191.69          2192.44       …     2522.16          2520.84         2520.53
   3329.88         3330.62          3331.16             7352.25          7351.94         7351.56
   2812.12         2810.09          2810.56             4400.97          4400.31         4400.16
  -3223.5         -3222.59         -3220.78            -2742.72         -2742.84        -2742.44
   2728.31         2728.16          2729.91             4143.56          4142.62         4142.0
   6688.53         6688.09          6688.97       …     8595.28          8593.81         8593.0
  -4388.06        -4387.84         -4386.81            -5947.88         -5949.78        -5951.72
   1477.38         1476.16          1476.97             -590.281         -591.406        -592.031
   3284.84         3286.88          3289.28             6900.12          6897.31         6894.81
   4198.5          4195.69          4195.66             2106.12          2104.62         2103.56
   4181.34         4182.47          4184.03       …     6946.25          6944.0          6942.53
  -3047.16        -3047.72         -3046.56            -1383.38         -1385.28        -1386.94
   6727.59         6727.56          6727.53             8214.12          8212.44         8211.59
  -5177.62        -5177.22         -5175.75            -1853.28         -1854.84        -1855.5
     -2.47707f5      -2.4555f5        -2.43244f5     -375000.0        -375000.0       -375000.0
     -1.43189f5      -1.42602f5  -141919.0        …       -1.55748f5       -1.5571f5       -1.55674f5
 -10852.7        -10680.1         -10520.8             -4723.88         -4754.0         -4779.09
 -44044.8        -43267.7         -42408.4            -44461.3         -44594.9        -44767.2
 -25625.4        -24908.5         -24176.8                -1.19948f5  -120033.0            -1.20147f5

data_onda:

57×2034000 Matrix{Int16}:
 -32768  -32768  -32768  -32768  …  -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768  …  -32768  -32768  -32768  -32768
 -15875  -15862  -15873  -15927     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768  …  -32768  -32768  -32768  -32768
  32767   32767   32767   32767      32767   32767   32767   32767
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
  32767   32767   32767   32767  …   32767   32767   32767   32767
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
  -8565   -8540   -8520   -8530     -18157  -18127  -18099  -18041
 -28271  -28257  -28242  -28240     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768  …  -32768  -32768  -32768  -32768
  10032   10025   10021   10019      22621   22632   22659   22674
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
      ⋮                          ⋱                          
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
 -32768  -32768  -32768  -32768  …  -32768  -32768  -32768  -32768
  32767   32767   32767   32767      32767   32767   32767   32767
  -7411   -7417   -7400   -7365      32767   32767   32767   32767
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
  32767   32767   32767   32767      32767   32767   32767   32767
 -32768  -32768  -32768  -32768  …  -32768  -32768  -32768  -32768
  32767   32767   32767   32767      26853   26851   26875   26871
  32767   32767   32767   32767      32767   32767   32767   32767
  32767   32767   32767   32767      32767   32767   32767   32767
  32767   32767   32767   32767      32767   32767   32767   32767
  30910   30899   30906   30934  …   32767   32767   32767   32767
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
  32767   32767   32767   32767      32767   32767   32767   32767
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
  32767   32767   32767   32767      32767   32767   32767   32767
  32767   32767   32767   32767  …   32767   32767   32767   32767
  32767   32767   32767   32767      32767   32767   32767   32767
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768
  32767   32767   32767   32767      32767   32767   32767   32767
  32767   32767   32767   32767      32767   32767   32767   32767
 -32768  -32768  -32768  -32768  …  -30942  -30946  -30989  -31026
 -32768  -32768  -32768  -32768     -32768  -32768  -32768  -32768

jrevels commented 2 years ago

Hi! Thanks for using the package.

It'd be good to tag in a more current maintainer to check my guesses, but if I had to bet, the main problem is that OndaEDF doesn't seem to support BDF. It has a hardcoded assumption that signal element types will be Int16 (EDF's only signal type), whereas BDF supports 24-bit samples.
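To illustrate why a 24-bit sample is awkward under an Int16 assumption, here is a minimal sketch in plain Julia. It is not OndaEDF's or EDF.jl's actual code: `read_int24` is a hypothetical helper written only for this example, and saturation is just one way that values pinned at -32768 or 32767 can arise.

```julia
# Decode one little-endian, two's-complement 24-bit sample from three bytes.
# `read_int24` is a hypothetical helper for illustration, not an EDF.jl function.
function read_int24(b0::UInt8, b1::UInt8, b2::UInt8)
    raw = Int32(b0) | (Int32(b1) << 8) | (Int32(b2) << 16)
    # Sign-extend: if bit 23 is set, the value is negative.
    return raw >= 0x800000 ? raw - Int32(0x1000000) : raw
end

# The most negative value a 24-bit sample can hold.
sample = read_int24(0x00, 0x00, 0x80)

# Check whether the value is representable as an Int16.
fits = typemin(Int16) <= sample <= typemax(Int16)

# A saturating conversion pins out-of-range values to the Int16 limits.
saturated = Int16(clamp(sample, typemin(Int16), typemax(Int16)))

@show sample fits saturated
```

Any code path that assumes Int16 storage has to either widen the element type or rescale the data so that it fits, which is the kind of change a BDF-aware conversion needs.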

It seems like it would be pretty easy to update OndaEDF to support this, though. I opened a draft PR to do this here: https://github.com/beacon-biosignals/OndaEDF.jl/pull/35

With this PR (using the data you provided):

julia> using EDF, OndaEDF, Onda

julia> bdf = EDF.read("data.bdf");

julia> bdf_samples = reduce(vcat, (transpose(EDF.decode(signal)) for signal in bdf.signals));

julia> onda, _ = OndaEDF.edf_to_onda_samples(bdf);

julia> eeg = only(s for s in onda if s.info.kind == "eeg");

julia> onda_fp1 = decode(eeg["fp1", :]);

julia> bdf.signals[2].header.label # just to prove that the second BDF signal is Fp1
"Fp1"

julia> bdf_fp1 = transpose(bdf_samples[2, :]);

# there may be slight inconsequential differences due to redithering that occurs during 
# conversion if the conversion necessitates re-encoding the signals; in that case the two 
# vectors won't be `==` but should still be `isapprox`
julia> isapprox(onda_fp1.data, bdf_fp1) 
true

This resolves the numerical difference in the converted data. Note that the data layout is different on purpose, though, as a result of translating to Onda. A few things to call out just in case it wasn't apparent:

- OndaEDF only extracts EDF signals that it is "told" about, and will always try to sort extracted signals into the groups/order that it was "told" to. This is controlled by the custom_extractors argument, which is populated by default with the well-known signal labels defined by the EDF+ specification.
- Since Onda has a notion of multichannel signals, OndaEDF splits eeg and ecg into two separate multichannel signals as a result of the conversion (you can see I specifically grabbed the eeg signal above via only(s for s in onda if s.info.kind == "eeg")).
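Because the converted signals are regrouped and reordered, row i of the raw matrix generally does not correspond to row i of a converted signal, so rows should be matched by channel label. A minimal sketch in plain Julia, using made-up labels and data (none of these names or values come from the actual file or from OndaEDF's API):

```julia
# Hypothetical labels and data standing in for a decoded BDF file
# (rows are channels, columns are samples).
file_labels = ["Status", "Fp1", "Fp2", "ECG"]
file_data = Float32[0 0 0; 1 2 3; 4 5 6; 7 8 9]

# Hypothetical channel order of a converted multichannel "eeg" signal.
eeg_channels = ["fp2", "fp1"]

# Normalize case before matching, since the converted names are lowercase here.
row_of(label) = findfirst(l -> lowercase(l) == label, file_labels)

# Pick the file rows in the converted signal's channel order.
rows = [row_of(c) for c in eeg_channels]
eeg_from_file = file_data[rows, :]

@show rows eeg_from_file
```

With the rows aligned this way, a comparison such as isapprox against the converted signal's decoded data compares like with like.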

Finally, note that you can also just extract things yourself to Onda, and skip a lot of the stuff that OndaEDF tries to do for you. Here's how to do that (note, this should work on the current OndaEDF.jl release without my PR):

julia> using Tables

# grab the Onda.SamplesInfo and the relevant EDF.Signals for eeg
julia> eeg_info, eeg_channels = OndaEDF.extract_channels_by_label(bdf, ["eeg"], OndaEDF.STANDARD_LABELS[["eeg"]])[1][1];

# replace the computed encoding info with an unscaled floating point encoding
julia> eeg_info = Onda.SamplesInfo(Tables.rowmerge(eeg_info; sample_resolution_in_unit=1.0, sample_offset_in_unit=0.0, sample_type=Float32));

# grab our decoded data matrix
julia> eeg_data = reduce(vcat, (transpose(EDF.decode(c)) for c in eeg_channels));

# construct Onda.Samples object with our manually decoded data matrix and metadata
julia> eeg = Onda.Samples(eeg_data, eeg_info, false)

jrevels commented 2 years ago

Actually, it turns out that the Int24 BDF support issue I mentioned doesn't matter here. The thing I found there is a real edge case, but not one that your data actually hits! I thought I checked that it did before writing my reply but I guess I made a mistake 😅

So it turns out that your issue purely comes down to making sure we compare decoded data (instead of mixed decoded/encoded data) and compare data of the "correct shape" (i.e. matching channels against matching channels).

To demonstrate, let's call the manually extracted eeg::Onda.Samples in my previous comment bdf_eeg (since we manually extracted it from the BDF file ourselves). Now, on OndaEDF.jl's latest release (not my PR):

julia> onda, _ = OndaEDF.edf_to_onda_samples(bdf);

julia> onda_eeg = decode(only(s for s in onda if s.info.kind == "eeg"));

julia> isapprox(bdf_eeg.data, onda_eeg.data)
true
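The encoded/decoded distinction can be sketched in plain Julia. This assumes the usual linear scheme, decoded = encoded * resolution + offset, with parameters named after the Onda fields used earlier in this thread; the numbers are invented for illustration.

```julia
# Hypothetical encoding parameters, named after the Onda fields used above.
sample_resolution_in_unit = 0.25
sample_offset_in_unit = 100.0

# Stored (encoded) integer samples versus their physical (decoded) values.
encoded = Int16[-400, 0, 400]
decoded = encoded .* sample_resolution_in_unit .+ sample_offset_in_unit

# Comparing the two directly mixes integer codes with physical units.
mixed_comparison = encoded == decoded

# Re-encoding the decoded values recovers the stored codes.
reencoded = round.(Int16, (decoded .- sample_offset_in_unit) ./ sample_resolution_in_unit)
roundtrip_ok = reencoded == encoded

@show decoded mixed_comparison roundtrip_ok
```

This is why the raw Int16 matrix in the original report cannot be compared directly with the output of EDF.decode: one side has to be decoded (or the other re-encoded) first.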

Since there's not actually a problem here (AFAICT), I'll close this issue. Thanks for filing it, though! It helped me find a new edge case, even though that edge case wasn't actually the problem here.

(shout out to @hannahilea for pointing this out in the Beacon Slack!)

likanzhan commented 2 years ago

@jrevels Sorry to bother you again. I guess one reason I got a different result from yours is that I used the cv/edf-0.7 branch of OndaEDF.jl. I did so because my data is BDF+, so the latest EDF.jl version is needed.

In that case, the results are different, if I calculated correctly.

jrevels commented 2 years ago

Ah, I see! I can repro if I upgrade my EDF version. This does, in fact, seem to be the issue I filed a fix for in #35.

This must've been what I found earlier that caused me to go after #35 in the first place - I must've accidentally changed my package state or something while reproducing.

Since this isn't an issue with the released version of OndaEDF, I'll leave this closed, but since it is a bug with #29, we can continue the conversation there.

beacon-biosignals / OndaEDF.jl

`edf_to_onda_samples` didn't return correct sample data as expected #34