beacon-biosignals / EDF.jl

Read and write EDF files in Julia
MIT License
18 stars 5 forks source link

`EDF.decode` overflows for large physical minimum/maximum #67

Open kleinschmidt opened 1 year ago

kleinschmidt commented 1 year ago

It multiplies by pmax - pmin which difference will overflow if you're within a factor of 2 of the maxfloat(Float32), and the multiplication will likely overflow if you're within ~32k (typemax(Int16)) of maxfloat(Float32).

However, I think that this is not really likely to be an issue; both our implementation and that used in PyMNE (EDFlib) write the physical min/max in a decimal form with at most 8 ASCII characters and no scientific notation, so the biggest value they can represent is well below the values that cause overflow here. So IMO it's better to fix this by adding a check to teh SignalHeader constructor that prevents you from constructing headers that would overflow; as a bonus it might be good to throw an error if the stringified physical min/max are >8 characters (rather than waiting until you try to actually write the file to find out).

If we don't wanna restrict the signal header to values that can be stringified into 8 ASCII characters, I think because we're first dividing by the digital range (max - min) in decode, we should probably err on the side of caution and prevent anything that would cause overflow for any integer dmin/dmax values (cna't rule out that someone would put some kind of cursed very small fractions in there but that gives you something on the order of 1e8, so 1e12 with the 1e4 from the typemax(Int16) before multiplying by the physical range. So if we cap those to 1e20 we'll be safe I think, and that still seems well outside the range of values you're likely to encounter in the real world...

ararslan commented 1 year ago

xref https://github.com/beacon-biosignals/EDF.jl/issues/31

Tangentially relevant but EDF+ has a specification for how to handle values that are too large to be represented in 2 bytes. (Spoiler alert: you just take a log.)