noophq / subtitle

Convert subtitles from one format to another format. Supported formats: STL EBU, TTML SMI, VTT, SRT
GNU Lesser General Public License v3.0

Make STL parsing more tolerant and configurable #18

Closed frne closed 1 year ago

frne commented 3 years ago

There are STL files in the wild that contain user data which cannot be parsed using the charset declared in the GSI block. This is the case for some kinds of software that carry meta information in EBN 254 text fields. This PR is intended to keep the StlParser usable even when producers of subtitles implement the EBU STL standard incorrectly.

Also, under certain circumstances, the TCF offset (in the GSI block) should not be subtracted automatically from the subtitle cue timings. The EBU STL spec is not very strict about the latter. One could also argue that a "parser" should not alter data in any way, which the StlParser does when it subtracts TCF from the cues' start and end timestamps. In that case, the introduced flag should default to true, which in turn would break backwards compatibility.

StlParser: Gracefully parse text field (TF)

Fall back to default charset in case of CoderMalfunctionError when reading the TF String.
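
A minimal sketch of the fallback described above, not the actual StlParser code. The class `TextFieldDecoder`, the method `decode`, and both charset parameters are hypothetical names chosen for illustration; only `java.nio.charset.CoderMalfunctionError` is taken from the commit message.

```java
import java.nio.charset.Charset;
import java.nio.charset.CoderMalfunctionError;

// Hypothetical helper: illustrates the fallback idea, not the library's real API.
public final class TextFieldDecoder {
    private TextFieldDecoder() {}

    /**
     * Decodes the raw bytes of a text field (TF) with the charset taken from the GSI
     * block. If the decoder malfunctions, the same bytes are decoded again with the
     * fallback charset instead of aborting the whole parse.
     */
    public static String decode(byte[] textField, Charset gsiCharset, Charset fallback) {
        try {
            return new String(textField, gsiCharset);
        } catch (CoderMalfunctionError e) {
            // Assumption: a lossy but readable string is preferable to a failed parse.
            return new String(textField, fallback);
        }
    }
}
```

Catching an `Error` subclass is unusual in Java, so keeping the `try` block this narrow limits the risk of hiding unrelated failures.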

StlParser: Flag to ignore Userdata

Ignore EBN 254 user data if the flag is set to true
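
The idea could be sketched as a filter over the parsed TTI blocks. Everything here is an assumption made for illustration: `UserDataFilter`, `TtiBlock`, and the `ignoreUserData` parameter are hypothetical names, and the real parser may skip such blocks while reading rather than afterwards. The value 254 (FEh) is the EBN that marks a user data block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and structure are not taken from the library.
public final class UserDataFilter {
    /** Extension block number that marks a TTI block as user data (254, FEh). */
    public static final int EBN_USER_DATA = 0xFE;

    /** Minimal stand-in for a TTI block, holding only what this sketch needs. */
    public static final class TtiBlock {
        public final int extensionBlockNumber;
        public final byte[] textField;

        public TtiBlock(int extensionBlockNumber, byte[] textField) {
            this.extensionBlockNumber = extensionBlockNumber;
            this.textField = textField;
        }
    }

    private UserDataFilter() {}

    /** Returns the blocks to decode; user data blocks are dropped when the flag is set. */
    public static List<TtiBlock> filter(List<TtiBlock> blocks, boolean ignoreUserData) {
        List<TtiBlock> result = new ArrayList<>();
        for (TtiBlock block : blocks) {
            if (ignoreUserData && block.extensionBlockNumber == EBN_USER_DATA) {
                continue; // never decoded, so a charset mismatch cannot occur here
            }
            result.add(block);
        }
        return result;
    }
}
```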

StlParser / StlObject: Flag to ignore TCF field

Ignores the TCF timestamp in the GSI block and subtracts nothing if the flag is set to true
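
The effect of the flag on a single cue timestamp can be sketched as follows. `CueTiming`, `adjust`, and `ignoreTcf` are hypothetical names, and `java.time.Duration` is assumed here only for readability; the library may represent timecodes differently.

```java
import java.time.Duration;

// Hypothetical sketch of the TCF handling; not the library's actual code.
public final class CueTiming {
    private CueTiming() {}

    /**
     * Returns the cue time to report. With ignoreTcf set, the value from the file is
     * passed through unchanged; otherwise the TCF offset from the GSI block is subtracted.
     */
    public static Duration adjust(Duration cueTime, Duration tcf, boolean ignoreTcf) {
        return ignoreTcf ? cueTime : cueTime.minus(tcf);
    }
}
```

For a file whose TCF is 10:00:00 and whose first cue starts at 10:00:05, the default behaviour reports a start of 5 seconds, while setting the flag keeps 10:00:05.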

Also added .gitignore

frne commented 3 years ago

This PR is a followup and replacement of #16