beacon-biosignals / EDF.jl

Read and write EDF files in Julia
MIT License

Reads take approximately 10x longer than BDF.jl #50

Open klaff opened 3 years ago

klaff commented 3 years ago

Using the bdf_test.bdf file (https://github.com/beacon-biosignals/EDF.jl/blob/master/test/data/bdf_test.bdf) and the following MWE:

using BDF
using EDF
using BenchmarkTools

# BDF and EDF have very different APIs

get_ch1_using_BDF(fn) = readBDF(fn)[1][1,:]

get_ch1_using_EDF(fn) = EDF.decode(EDF.read(fn).signals[1])

test_filename = "bdf_test.bdf"

# Interpolate the global with `$` so @btime does not also measure untyped global access
println("using BDF:")
@btime get_ch1_using_BDF($test_filename)
println("\nusing EDF:")
@btime get_ch1_using_EDF($test_filename)

I obtain the following:

using BDF:
  755.700 μs (305 allocations: 2.61 MiB)

using EDF:
  17.460 ms (146253 allocations: 2.87 MiB)

Hint: EDF is allocating once for every 3 bytes of data (this file has 24-bit data)

ararslan commented 3 years ago

> EDF is allocating once for every 3 bytes of data

Indeed, this seems to boil down to the read method from BitIntegers.jl allocating once per read. I'm not sure why just yet.

BDF.jl doesn't bother with maintaining the 24-bit on-disk representation and instead reads and converts directly to Int32, which does not incur any extra allocations.
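
For illustration, the "convert directly to Int32" approach can be sketched with Base Julia alone. This is a minimal sketch, not BDF.jl's or EDF.jl's actual code: `decode_int24` is a hypothetical helper name, and it assumes the samples are little-endian 24-bit two's-complement values packed back to back, as in the 24-bit data described above.

```julia
# Hypothetical helper (not part of BDF.jl or EDF.jl): decode packed
# little-endian 24-bit two's-complement samples into Int32 values.
function decode_int24(bytes::Vector{UInt8})
    length(bytes) % 3 == 0 ||
        throw(ArgumentError("byte count must be a multiple of 3"))
    n = length(bytes) ÷ 3
    out = Vector{Int32}(undef, n)   # the only allocation: one output buffer
    for i in 1:n
        j = 3i - 2
        # Place the three bytes in the upper 24 bits of a UInt32 ...
        u = (UInt32(bytes[j]) << 8) |
            (UInt32(bytes[j + 1]) << 16) |
            (UInt32(bytes[j + 2]) << 24)
        # ... then reinterpret as signed and shift right arithmetically,
        # which sign-extends the 24-bit value to 32 bits.
        out[i] = reinterpret(Int32, u) >> 8
    end
    return out
end

raw = UInt8[0x01, 0x00, 0x00,   # smallest positive value
            0xff, 0xff, 0xff,   # all bits set
            0x00, 0x00, 0x80]   # only the sign bit set
samples = decode_int24(raw)
```

The trade-off is that the result no longer mirrors the 24-bit on-disk representation: each sample occupies 4 bytes in memory, but the whole buffer is decoded with a single output allocation instead of one allocation per sample.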

ararslan commented 2 years ago

> I'm not sure why just yet.

Alignment padding

likanzhan commented 2 years ago

Is it possible to speed up EDF?

palday commented 2 years ago

@likanzhan The reading of EDF files should be fine -- it should only be BDF that's impacted. If there has been a change in performance for EDF between EDF.jl v0.6 and v0.7, then that would be good to know, especially if you can provide details on the file and platform.

likanzhan commented 2 years ago

@palday Thanks. Unfortunately, the data format we obtained is "BDF+". The platform is a customized one, called Neuracle. To be specific, the data and the event information are stored in separate files, i.e., data.bdf and evt.bdf. Here is one data file we recorded. Thanks.