Closed likanzhan closed 2 years ago
Hi! Thanks for using the package.
It'd be good to tag in a more current maintainer to check my guesses, but if I had to bet, it seems like the main problem is that OndaEDF doesn't seem to support BDF; it has a hardcoded assumption that signal element types will be Int16
(EDF's only signal type) whereas BDF support 24-bit.
It seems like it would be pretty easy to update OndaEDF to support this, though. I opened a draft PR to do this here: https://github.com/beacon-biosignals/OndaEDF.jl/pull/35
With this PR (using the data you provided):
julia> using EDF, OndaEDF, Onda
julia> bdf = EDF.read("data.bdf");
julia> bdf_samples = reduce(vcat, (transpose(EDF.decode(signal)) for signal in bdf.signals));
julia> onda, _ = OndaEDF.edf_to_onda_samples(bdf);
julia> eeg = only(s for s in onda if s.info.kind == "eeg");
julia> onda_fp1 = decode(eeg["fp1", :]);
julia> bdf.signals[2].header.label # just to prove that the second BDF signal is Fp1
"Fp1"
julia> bdf_fp1 = transpose(bdf_samples[2, :]);
# there may be slight inconsequential differences due to redithering that occurs during
# conversion if the conversion necessitates re-encoding the signals; in that case the two
# vectors won't be `==` but should still be `isapprox`
julia> isapprox(onda_fp1.data, bdf_fp1)
true
This resolves the numerical difference in the converted data. Note that the data layout is different on purpose, though, as a result of translating to Onda. A few things to call out just in case it wasn't apparent:
OndaEDF only extracts EDF signals that it is "told" about, and will always try to sort extracted signals into the groups/order that it was "told" to. This is controlled by the custom_extractors
argument, which is populated by default with the well-known signal labels defined by the EDF+ specification
Since Onda has a notion of multichannel signals, OndaEDF splits eeg
and ecg
into two separate multichannel signals as a result of the conversion (you can see I specifically grabbed the eeg
signal above via only(s for s in onda if s.info.kind == "eeg")
).
Finally, note that you can also just extract things yourself to Onda, and skip a lot of the stuff that OndaEDF tries to do for you. Here's how to do that (note, this should work on the current OndaEDF.jl release without my PR):
julia> using Tables
# grab the Onda.SamplesInfo and the relevant EDF.Signals for eeg
julia> eeg_info, eeg_channels = OndaEDF.extract_channels_by_label(bdf, ["eeg"], OndaEDF.STANDARD_LABELS[["eeg"]])[1][1];
# replace the computed encoding info with an unscaled floating point encoding
julia> eeg_info = Onda.SamplesInfo(Tables.rowmerge(eeg_info; sample_resolution_in_unit=1.0, sample_offset_in_unit=0.0, sample_type=Float32));
# grab our decoded data matrix
julia> eeg_data = reduce(vcat, (transpose(EDF.decode(c)) for c in eeg_channels));
# construct Onda.Samples object with our manually decoded data matrix and metadata
julia> eeg = Onda.Samples(eeg_data, eeg_info, false)
Actually, it turns out that the Int24 BDF support issue I mentioned doesn't matter here. The thing I found there is a real edge case, but not one that your data actually hits! I thought I checked that it did before writing my reply but I guess I made a mistake 😅
So it turns out that your issue purely comes down to needing to make sure we were comparing decoded data (instead of mixed decoded/encoded data) and comparing data of the "correct shape" (e.g. properly comparing matching channels).
To demonstrate; let's call the manually extracted eeg::Onda.Samples
in my previous comment bdf_eeg
(since we manually extracted it from the BDF file ourselves). Now, on OndaEDF.jl's latest release (not my PR):
julia> onda, _ = OndaEDF.edf_to_onda_samples(bdf);
julia> onda_eeg = decode(only(s for s in onda if s.info.kind == "eeg"));
julia> isapprox(bdf_eeg.data, onda_eeg.data)
true
Since there's not actually a problem here (AFAICT), I'll close this issue. But thanks for filing it, though! It helped me find a new edge case, even though the edge case wasn't actually a problem here.
(shout out to @hannahilea for pointing this out in the Beacon Slack!)
@jrevels Sorry to bother again, I guess one reason I got a different result from yours is that I used the branch cv/edf-0.7
of OndaEDF.jl
. This is because the data I obtained is bdf+
, so the EDF.jl
version should be the latest one.
In that case, the results are different, if I calculate correctly.
Ah, I see! I can repro if I upgrade my EDF version. This does, in fact, seem like it's actually the issue I filed a fix via #35.
This must've been what I found earlier that caused me to go after #35 in the first place - I must've accidentally changed my package state or something while reproducing.
Since this isn't an issue with the released version of OndaEDF, I'll leave this closed, but since it is a bug with #29 we can continue conversation there.
In the Dropbox Folder, I uploaded a sample data recorded in our study. We have a data file
data.bdf
and a trigger data fileevt.bdf
. Normally the data is processed with Matlab,readbdfdata.m
.For now, we can use the
EDF.read()
to read the data, and theEDF.decode
d data are similar to the results obtained with MatLab (appendix 1)But the output of the
edf_to_onda_samples()
function is totally different from that of the supposed results (appendix 2):What is the problem here?
Appendixes:
data_edf
:data_onda
: