glut23 / webvtt-py

Read, write, convert and segment WebVTT caption files in Python.
MIT License

MalformedCaptionError: Missing timing cue in line ... #46

Closed barbagus closed 1 year ago

barbagus commented 1 year ago

Hello there, I often run into this error when there is a blank line between the timing line and the cue text:

00:45:15.640 --> 00:45:16.760 line:91% align:center
Quelqu'un en a parlé ?

00:45:16.960 --> 00:45:20.600 line:79% align:center

Haddock l'a vu sur le pont 4
et m'a demandé mon impression.

00:45:20.920 --> 00:45:22.040 line:91% align:center
Qui est Haddock ?

I have not yet checked whether my source files violate the standard or whether your implementation does not handle this case.

barbagus commented 1 year ago

According to MDN, my source files are not compliant:

The payload is where the main information or content is located. In normal usage the payload
contains the subtitles to be displayed. The payload text may contain newlines but it cannot contain a
blank line, which is equivalent to two consecutive newlines. A blank line signifies the end of a cue.

I will try to preprocess my file before parsing it with webvtt-py. Sorry for the noise.
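For anyone hitting the same error, here is a minimal sketch of such a preprocessing step. It is not part of webvtt-py: the function name `drop_blank_lines_after_timings` and the timing regex are my own assumptions, and it assumes every cue has a non-empty payload (a blank line after a timing line is always treated as stray and removed).

```python
import re

# Assumed pattern for a cue timing line, e.g.
# "00:45:16.960 --> 00:45:20.600 line:79% align:center".
# The hours component is optional.
TIMING_RE = re.compile(
    r"^\s*(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)


def drop_blank_lines_after_timings(text: str) -> str:
    """Remove blank lines that directly follow a cue timing line.

    Hypothetical helper: it assumes a blank line right after a timing
    line is never intentional, i.e. no cue has an empty payload.
    """
    cleaned = []
    after_timing = False
    for line in text.splitlines():
        if after_timing and not line.strip():
            # Stray blank line between the timing line and the text: skip it.
            # after_timing stays True so consecutive blank lines are skipped too.
            continue
        cleaned.append(line)
        after_timing = bool(TIMING_RE.match(line))
    return "\n".join(cleaned) + "\n"
```

The cleaned text can then be written to a temporary file, or wrapped in `io.StringIO` if the installed webvtt-py version offers a reader for file-like objects, before parsing. Blank lines that separate cues are left untouched, since they do not directly follow a timing line.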