SubtitleEdit / subtitleedit

the subtitle editor :)
http://www.nikse.dk/SubtitleEdit/Help
GNU General Public License v3.0

[Bug Report] Conversion of TTML to SRT time-stamps does not reflect the frame-rate #2334

Closed zxsd closed 7 years ago

zxsd commented 7 years ago

Ref: Appendix H. Time Expression Semantics, where "tickRate" or tick-rate is described.

Summary: When saving the file as an SRT-file using SubtitleEdit v3.5.2, the tick-rate element of the "time expressions" in a TTML/XML subtitle file appears to be incorrectly converted. The tick-rate element apparently should reflect the frame-rate of the video file.

Discussion: The tick-element in the TTML-Timestamp ("hh:mm:ss:tt") is not correctly converted to the equivalent SRT-Timestamp. Instead of the tick element (":tt") being converted into the correct seconds-fraction (tt/25 ... the correct fraction according to @Georg-J, which presumably results from the PAL-Norm frame-rate of 25 FPS *), the tick is incorrectly handled as if it represents thousandths-of-a-second (tt/1000).

This deficiency was identified in a tangentially related effort: the automated conversion of TTML-subtitle files published by Austrian Public Television (orf.at) as part of their video-on-demand offerings.

Representative time-expressions were used to illustrate the deficiency:

(TTML) begin="00:30:00:00" end="00:30:01:23"
. . . becomes (after using File | Save as | SRT)
(SRT) 00:30:00,000 --> 00:30:01,023

The above TTML entry results in the subtitle being displayed for almost 2 seconds ... 1 and 23/25 seconds (1.920 seconds) to be precise. After conversion, the SRT-file's corresponding entry has a considerably shorter display duration, just over 1 second ... or 1.023 seconds.
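The arithmetic above can be checked with a short Python sketch. It assumes the last field of an hh:mm:ss:ff expression is a frame count and that the frame rate is 25 FPS; the function name `ttml_frames_to_srt` is hypothetical and this is not SubtitleEdit's own code.

```python
import re

# Matches a clock time of the form hh:mm:ss:ff (last field = frames).
TTML_FRAME_TIME = re.compile(r"^(\d+):(\d{2}):(\d{2}):(\d{2})$")


def ttml_frames_to_srt(expr: str, frame_rate: float = 25.0) -> str:
    """Convert 'hh:mm:ss:ff' to an SRT timestamp 'hh:mm:ss,mmm'.

    Assumption: ff counts frames at `frame_rate` (25 FPS by default).
    """
    match = TTML_FRAME_TIME.match(expr)
    if match is None:
        raise ValueError(f"not an hh:mm:ss:ff time expression: {expr!r}")
    hours, minutes, seconds, frames = (int(g) for g in match.groups())
    if frames >= frame_rate:
        raise ValueError(f"frame {frames} is out of range for {frame_rate} FPS")

    # Whole seconds in milliseconds, plus the frame count scaled to milliseconds.
    total_ms = (hours * 3600 + minutes * 60 + seconds) * 1000
    total_ms += round(frames * 1000 / frame_rate)

    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


print(ttml_frames_to_srt("00:30:00:00"), "-->", ttml_frames_to_srt("00:30:01:23"))
```

With these assumptions the end time comes out as 00:30:01,920 rather than 00:30:01,023, which matches the 1.920-second duration described above.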


The source file (TTML/XML) that was being discussed can be viewed here (".txt" was appended to the files to enable posting). The converted file (using SE v3.5.1) is here.


* The conversion of TTML --> SRT time-stamps appears to be dependent on the frame-rate. The Pro version of FAB-Subtitler (https://www.fab-online.com/eng/subtitling/production/subtstd.htm) is principally used to produce the subtitles of German-language public television, primarily for Teletext (broadcast TV, e.g., Videotexttafel 150). FAB-Subtitler references the following frame-rates:

niksedk commented 7 years ago

I think this issue has been fixed via #2329 (was due to empty time codes in the end). The fix is included in latest beta: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.5.2/SubtitleEditBeta.zip Could you verify?

zxsd commented 7 years ago

I can certainly live with the Build 161 logic, although the frame-rate that I calculated is not linear ... and also not 25 FPS. If the frame-rate I observed is the desired behavior, then the fix in https://github.com/SubtitleEdit/subtitleedit/issues/2329 is apropos, and this issue can be closed.

Here's a (busy) picture attempting to summarize my check:

[image: se_3 5 2_build161_comparison]

I considered the first 6 subtitles, comparing the native TTML time expression with the timestamp in SE 3.5.2 build 161. To the right of the picture there are two Notepad windows:

While the SE conversion realized with Build 161 is technically wrong, the provider of this TTML file (orf.at) did not see fit to include the information that normally conveys the frame-rate in the subtitle file: ttp:frameRate. Absent an asserted frame-rate (or the presence of a related value, such as ttp:timeBase == Media which makes the element moot), determining the frame-rate of the associated video-file seems to be problematic.
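A minimal sketch of the lookup described above, in Python: read ttp:frameRate from the root element if it is present, and otherwise fall back to an assumed default. The function name `declared_frame_rate` and the 25 FPS fallback are assumptions for illustration. The namespace URI is the TTML parameter namespace; files written against older drafts use different namespace URIs and would need those instead.

```python
import xml.etree.ElementTree as ET

# TTML parameter namespace (the "ttp:" prefix). Draft-era files may differ.
TTP_NS = "http://www.w3.org/ns/ttml#parameter"


def declared_frame_rate(ttml_text: str, fallback: float = 25.0) -> float:
    """Return ttp:frameRate from the root <tt> element, or `fallback` if absent.

    Assumption: the fallback is a guess about the associated video file;
    the subtitle file itself carries no other hint.
    """
    root = ET.fromstring(ttml_text)
    value = root.get(f"{{{TTP_NS}}}frameRate")
    return float(value) if value is not None else fallback


with_rate = (
    '<tt xmlns="http://www.w3.org/ns/ttml" '
    'xmlns:ttp="http://www.w3.org/ns/ttml#parameter" ttp:frameRate="50"/>'
)
without_rate = '<tt xmlns="http://www.w3.org/ns/ttml"/>'

print(declared_frame_rate(with_rate))     # value asserted by the file
print(declared_frame_rate(without_rate))  # assumed default
```

This only covers the asserted-frame-rate case; it does not attempt to inspect ttp:timeBase or the video file.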

(Aside: In my experience, all MP4-files published via the ORF-Mediathek have a frame-rate of 25 FPS. The same holds true for the other German-language public-television corporations except certain HD files, which have a frame-rate of 50 FPS.)

@niksedk If you decide to leave this open, I'll experiment tomorrow night (UTC -5hrs here) with incorporating frame-rate information into the original XML-file, and report results.

zxsd commented 7 years ago

Rather than add the element to the original ORF TTML-file, I modified an old Norddeutscher Rundfunk TTML-file with fractional time-expressions, so that it contained appropriate frameRate time-expressions. (At the time the NDR still added 10 hours to the time-expressions; they've since stopped this nonsense.) Here's the file with which I began (again with .txt appended to allow it to be posted):

NDR-TTML-File__with-thousandths-Expression.ttml.txt

Using the Replace all functionality of Notepad++, I changed the fractional values (ss.ttt) into equivalent frameRate values (ss:ff) as illustrated below:

[image: creating_frameratetimed_ttmlfile]

The file that resulted is here:

NDR-TTML-File__with-frameTime-Expression.ttml.txt
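The fractional-to-frame substitution described above could also be scripted. The Python sketch below is not the Notepad++ replacement that was actually used (that is only shown in the screenshot); it assumes 25 FPS and truncates each fraction down to the frame it falls in. The function name `fractional_to_frames` is hypothetical.

```python
import re

# Matches hh:mm:ss.ttt (one to three fractional digits).
FRACTIONAL_TIME = re.compile(r"\b(\d{2}:\d{2}:\d{2})\.(\d{1,3})\b")


def fractional_to_frames(text: str, frame_rate: int = 25) -> str:
    """Rewrite every hh:mm:ss.ttt in `text` as hh:mm:ss:ff.

    Assumption: 25 FPS; the fraction is truncated to a whole frame,
    so the result is always in the range 0 .. frame_rate - 1.
    """
    def to_frames(match: re.Match) -> str:
        # Pad ".9" to "900" so short fractions are read as milliseconds.
        millis = int(match.group(2).ljust(3, "0"))
        frame = millis * frame_rate // 1000
        return f"{match.group(1)}:{frame:02d}"

    return FRACTIONAL_TIME.sub(to_frames, text)


sample = '<p begin="10:00:01.920" end="10:00:03.040">Beispiel</p>'
print(fractional_to_frames(sample))
```

Truncation is one possible design choice; rounding to the nearest frame would need extra handling for fractions that round up to a full second.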

And here's a comparison of the expected/actual time values of the first six subtitles after loading the hh:mm:ss:ff version of the file into SE:

[image: expected vs actual]


I can still live with the Build 161 logic in spite of the (presumed) hard-coded approximation for the frame-rate; after all, the errors don't stack. If the canned, non-linear frame-rate is the desired behavior, then the fix in https://github.com/SubtitleEdit/subtitleedit/issues/2329 is acceptable for my usage.


Someone using SubtitleEdit in production (vice my casual use) might not be satisfied with the current conversion-behavior. If the original TTML-file was correctly crafted, using SE to convert it results in an SRT-File with sub-optimal timings. This shortcoming could become more meaningful as time passes, and the EU starts really implementing EBU-TTML. Up until now, simplistic implementations have been most common with the German-language public television offerings.

The Swiss Public Television Corporation, as an exception to the above 'rule,' is fine-tuning their forward-leaning implementation (begun in earnest several years ago). While the BBC seems to be making only plodding progress, other nations are likely not waiting. (I only have familiarity with German-language subtitles.)

The German Public Television First Program is steadily enhancing its implementation, as shown by the recent TTML-file linked below. While ARD continues the nonsense of adding 10 hours to the time expressions (presumably so legacy files continue to work with the ARD-proprietary Player), their framework has matured over the past year ... and the entry ttp:timeBase="media" makes the frameRate moot ... for this specific show's resolution:

ARD Großstadtrevier (20170410) - Folge 396_ Für nix und für alles 960-1.mp4 185000 (540pw, ARD).ttml.txt


I do not anticipate contributing further, @niksedk, so I'll defer to you with respect to closing this issue.

niksedk commented 7 years ago

@zxsd: thx for the interesting reading :) I hope the TTML version used will be a newer one than the draft version of October 2006 ;)

SE uses a default frame rate when converting frames with unknown frame rate to milliseconds - you can set the default frame rate in Options -> Settings -> General - "Default frame rate". In the "Toolbar" tab (in settings) you can show the frame rate in the toolbar if frame rate is something you change/use often. With a frame rate of "25" it looks like SE calculates the values you expect.
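To illustrate why the chosen default matters, here is a small Python sketch (not SubtitleEdit's code) that converts the same frame number to milliseconds under several candidate frame rates. The helper name `frames_to_ms` and the list of rates are assumptions for illustration only.

```python
def frames_to_ms(frames: int, frame_rate: float) -> int:
    """Scale a frame count to milliseconds at the given frame rate."""
    return round(frames * 1000 / frame_rate)


# Frame 23 (as in "00:30:01:23") under a few illustrative frame rates.
for rate in (23.976, 24.0, 25.0, 29.97, 30.0):
    print(f"{rate:>6} fps: frame 23 -> {frames_to_ms(23, rate)} ms")
```

Only the 25 FPS row gives the 920 ms expected for the ORF example, so a different default would shift every frame-based timestamp by a fraction of a second.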