adracea / rsubs-lib

rust library for subtitle manipulation and conversion
https://crates.io/crates/rsubs-lib
MIT License

parse cues with multiple newlines, removed dbg! #35

Closed andreadev-it closed 1 year ago

andreadev-it commented 1 year ago

Hello adracea, I've been using your library for a personal project, thank you for the effort you've put into this! I've noticed that in some old vtt files the cues were separated by multiple newlines instead of just one. Also, the times were expressed as "00:00.000" (only two segments separated by ":" instead of three). Looking at the specs (in the box, at point 7), multiple newlines between cues are actually allowed, so this isn't a deprecated feature. Running this library on such a file resulted in two problematic behaviours:

  1. it crashed because of the multiple newlines
  2. it printed all the split times
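To make the two-segment time format concrete, here is a minimal sketch of a parser that accepts both "mm:ss.ttt" and "hh:mm:ss.ttt". The function name `parse_vtt_timestamp` is hypothetical and is not taken from rsubs-lib; it only illustrates the shape of the input, using the standard library alone.

```rust
/// Hypothetical helper (not part of rsubs-lib): converts a WebVTT timestamp
/// in "mm:ss.ttt" or "hh:mm:ss.ttt" form into milliseconds.
/// Returns None when the text does not match either form.
fn parse_vtt_timestamp(ts: &str) -> Option<u64> {
    // Separate the clock part from the three-digit millisecond part.
    let (clock, millis) = ts.trim().split_once('.')?;
    if millis.len() != 3 || !millis.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let millis: u64 = millis.parse().ok()?;

    // Two segments mean the hours were omitted; three segments include them.
    let parts: Vec<&str> = clock.split(':').collect();
    let (h, m, s) = match parts.as_slice() {
        [m, s] => (0u64, m.parse::<u64>().ok()?, s.parse::<u64>().ok()?),
        [h, m, s] => (
            h.parse::<u64>().ok()?,
            m.parse::<u64>().ok()?,
            s.parse::<u64>().ok()?,
        ),
        _ => return None,
    };

    // Minutes and seconds must stay within 0..=59.
    if m > 59 || s > 59 {
        return None;
    }
    Some(((h * 60 + m) * 60 + s) * 1000 + millis)
}
```

With this sketch, "00:11.000" and "00:00:11.000" both map to the same millisecond value, which is the behaviour one would want from the short form.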

In this PR, I used a regex to match multiple newlines and removed the dbg! that was printing the split times. I ran the tests and they all passed.
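As an illustration of the idea only: the PR uses a regex, whereas the sketch below gets the same effect with the standard library by treating any run of blank lines as a single separator. The function name `split_cue_blocks` is hypothetical and is not the code from this PR or from rsubs-lib.

```rust
/// Hypothetical helper (not the PR's regex): splits VTT text into blocks,
/// treating one or more blank lines as a single separator.
fn split_cue_blocks(input: &str) -> Vec<String> {
    // Normalize CRLF and lone CR line endings to LF first.
    let normalized = input.replace("\r\n", "\n").replace('\r', "\n");

    let mut blocks = Vec::new();
    let mut current: Vec<&str> = Vec::new();

    for line in normalized.lines() {
        if line.trim().is_empty() {
            // A blank line closes the current block, if one is open.
            // Further blank lines in the same run are simply skipped.
            if !current.is_empty() {
                blocks.push(current.join("\n"));
                current.clear();
            }
        } else {
            current.push(line);
        }
    }

    // Flush the last block when the input does not end with a blank line.
    if !current.is_empty() {
        blocks.push(current.join("\n"));
    }
    blocks
}
```

Because empty blocks are never emitted, a file with several blank lines between cues yields the same blocks as one with a single blank line between them.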

Hope this can be useful

adracea commented 1 year ago

@andreadev-it Hey that's awesome! I was honestly not expecting people to use it, much less actually end up contributing!

Would it be too much to ask to maybe add a few lines to expand one of the fixture/test files to include this?

andreadev-it commented 1 year ago

Sure, no problem 👍🏼 I've looked into the test suite, but it's a little too complicated for me to step in and add relevant tests for this. Basically, this change adds support for multiple newlines between cues in VTT files. Here is an example of a file that would previously have crashed (or at least wouldn't have parsed correctly), taken from the fixtures and slightly edited:

WEBVTT

00:11.000 --> 00:13.000
<v Roger Bingham>We are in New York City

00:13.000 --> 00:16.000
<v Roger Bingham>We’re actually at the Lucern Hotel, just down the street

00:16.000 --> 00:18.000
<v Roger Bingham>from the American Museum of Natural History

00:18.000 --> 00:20.000
<v Roger Bingham>And with me is Neil deGrasse Tyson

00:20.000 --> 00:22.000
<v Roger Bingham>Astrophysicist, Director of the Hayden Planetarium

adracea commented 1 year ago

Awesome, I'll publish the new version later this evening!