sandflow / imscJS

JavaScript library for rendering IMSC Text and Image Profile documents to HTML5
BSD 2-Clause "Simplified" License
84 stars 31 forks source link

Timecode rounding errors in some cases #161

Open spoeschel opened 5 years ago

spoeschel commented 5 years ago

For the following EBU-TT-D file untertitel-36573.xml (please rename to .xml), the resulting ISD timecodes are in some cases affected by rounding errors, e.g. instead of 91.96, the value 91.96000000000001 is shown (and used for the corresponding PNG filename during export).

All the begin/end timecodes themselves only use three digit precision; no framerate is used. I also cannot see any pattern why one timecode is affected while another one is not.

I tested with https://sandflow.com/imsc1_1/index.html together with Firefox 68.0.1 and Chromium 76.0.3809.87; Firefox on Windows and Linux; Chromium only on Linux.

palemieux commented 5 years ago

This is a fundamental limitation of floating point arithmetic. For example, 24000/1001 = 23.976023976023976... cannot be represented with a finite number of decimals. As a result the computer stores it as 23.976024627685546875, so that 792 / 23.976024627685546875 is not equal to 33.033 exactly. In fact, 33.033 cannot be represented exactly as a binary floating point number (even though it can as a decimal floating point number), i.e. 33.033 is approximated to 33.033000946044921875 internally.

The right way to handle time expressions is to use rational numbers instead of floating point numbers, i.e. store 33.033 as 33033/1000.

imscJS could certainly be modified to do that.

spoeschel commented 5 years ago

Ah, I see, the JavaScript data types... :-/

As a quick fix, would it possibly make sense to just apply rounding to a meaningful number of decimal places here, to address these corner cases?

For example rounding to eight decimal places could be a solution here: On the one hand this number of digits would be high enough to not interfere with any frame precision i.e. keeping the relevant part of the actual result. On the other hand it would be low enough to fix the imprecision issue. And in both cases there would be sufficient distance to the affected digits.

palemieux commented 5 years ago

As a quick fix, would it possibly make sense to just apply rounding to a meaningful number of decimal places here, to address these corner cases?

Yes, the caller can choose to round the values returned by getMediaTimeEvents() to meet its specific requirements.

spoeschel commented 5 years ago

OK, good to know. Do you maybe plan to apply such fix also to the present repo?

Ryuno-Ki commented 5 years ago

Another approach I see often taken is to use integers. So basically treating it with a finer unit (e.g. in ms instead of seconds).

I've heard something about BigInt making its way into some platforms.

palemieux commented 4 years ago

Another approach I see often taken is to use integers. So basically treating it with a finer unit (e.g. in ms instead of seconds).

Yes. This is the typical approach in media formats, e.g. MXF: an edit rate is expressed as a rational, e.g. 30000/1001, and offsets are then expressed as integer multiple of the edit rate, e.g. 1110 means 1110 * 30000/1001 s.