Open BeAtS85 opened 4 years ago
yeah it would be nice if the file was parsed a line at a time so people could do something like this:
def read_captions(path):
    try:
        out = []
        # read_generator is the API being proposed in this issue
        for caption in webvtt.read_generator(path):
            try:
                line = caption.text  # or even .text() would be fine
            except webvtt.MalformedCaptionError:
                pass
            else:
                out.append(remove_text_inside_brackets(line.replace("\n", " ")))
        return out
    except webvtt.MalformedFileError:
        return []
~80% of my VTT files are malformed according to this library, so as-is it's not very useful for my use case... :/
Sometimes there are empty timestamps in the .vtt. The script errors out on them.
For example:
00:22:21.320 --> 00:22:26.520
00:21:13.720 --> 00:21:15.360 line:90% position:50% align:middle
Can this error be caught somehow, or can the empty timestamps be ignored?
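Until the library handles this, one possible workaround is to pre-parse the file yourself, one line at a time, and drop cues that have a timestamp but no text. The following is only a minimal sketch and does not use the library at all: `TIMING` and `iter_captions` are hypothetical names, the regex covers plain `hh:mm:ss.ttt` and `mm:ss.ttt` timings only, and anything outside a cue (the header, NOTE blocks, cue identifiers) is ignored rather than validated.

```python
import re

# Matches a cue timing line such as
# "00:21:13.720 --> 00:21:15.360 line:90% position:50% align:middle".
# The hours field is optional; trailing cue settings are ignored.
TIMING = re.compile(
    r"^\s*((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+"
    r"((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def iter_captions(lines):
    """Yield (start, end, text) for each cue that has text; skip empty cues."""
    start = end = None
    text = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        match = TIMING.match(line)
        if match:
            # A new timing line closes the previous cue, even when the
            # blank separator line is missing. Cues without text are dropped.
            if start is not None and text:
                yield start, end, "\n".join(text)
            start, end = match.groups()
            text = []
        elif not line.strip():
            # A blank line ends the current cue.
            if start is not None and text:
                yield start, end, "\n".join(text)
            start = end = None
            text = []
        elif start is not None:
            text.append(line)
        # Any other line sits outside a cue and is ignored.
    if start is not None and text:
        yield start, end, "\n".join(text)


sample = [
    "WEBVTT\n",
    "\n",
    "00:22:21.320 --> 00:22:26.520\n",  # empty timestamp, skipped
    "00:21:13.720 --> 00:21:15.360 line:90% position:50% align:middle\n",
    "Hello\n",
    "world\n",
]
captions = list(iter_captions(sample))
```

Because the function accepts any iterable of lines, an open file works as well, for example `with open(path, encoding="utf-8") as f: captions = list(iter_captions(f))`, so the whole file never has to be held in memory at once.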