pietrop / slate-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster. Using the SlateJs editor.
https://pietrop.github.io/slate-transcript-editor

fix: `VTT with speakers` sometimes crashes when there are blanks #48

Closed jshearer closed 3 years ago

jshearer commented 3 years ago

The error is an out-of-bounds array access (a property read on `undefined`) at src/util/export-adapters/subtitles-generator/index.js:35

Uncaught (in promise) TypeError: Cannot read property 'end' of undefined
    at index.js:35
    at Array.map (<anonymous>)
    at addTimecodesToLines (index.js:28)
    at preSegmentTextJson (index.js:86)
    at subtitlesComposer (index.js:95)
    at exportAdapter (index.js:61)

The issue appears to be that blank lines (paragraphs, I think? Maybe sentences, or chunks as needed for screen-presentable subtitles?) were being treated as having one word in them, causing an out-of-bounds array access.
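
A minimal sketch of how this kind of failure can arise, assuming the word count for each line comes from splitting on spaces. The names `words`, `countWordsNaive`, and `addEndTimes` are hypothetical and are not taken from the repository; the sketch only illustrates the failure mode described above.

```javascript
// Hypothetical timed words; not the library's real data.
const words = [
  { text: 'Hello', start: 0.0, end: 0.4 },
  { text: 'world', start: 0.5, end: 0.9 },
];

// ''.split(' ') returns [''], so a blank line is counted as one word.
const countWordsNaive = (line) => line.split(' ').length;

// Hypothetical stand-in for the timecode step: walk the word list line by line.
function addEndTimes(lines, timedWords, countWords) {
  let consumed = 0;
  return lines.map((line) => {
    consumed += countWords(line);
    const lastWord = timedWords[consumed - 1]; // undefined once we run past the end
    return { text: line, end: lastWord.end };
  });
}

let message = '';
try {
  // The trailing blank line pushes the index one past the last word.
  addEndTimes(['Hello world', ''], words, countWordsNaive);
} catch (err) {
  message = err.message; // a TypeError about reading 'end' of undefined
}
console.log(countWordsNaive(''), message);
```

Counting only non-empty tokens, for example with `line.split(' ').filter(Boolean).length`, would give 0 for a blank line, though the blank line itself would still need to be skipped or handled.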

pietrop commented 3 years ago

@jshearer thanks for this!

What happens when you try to export VTT with speakers and paragraphs in Storybook locally?

I get this error

[Screenshots: Screen Shot 2021-03-05 at 4 59 59 PM, Screen Shot 2021-03-05 at 5 00 39 PM]

pietrop commented 3 years ago

I tried this in addTimecodesToLines in src/util/export-adapters/subtitles-generator/index.js#L30

If you `console.log` `lines`, what I get is an array with an empty string plus a string containing the whole of the text. It doesn't seem to be respecting the paragraphs as intended.

Anyway, I tried this, which removes the empty string.

  const results = lines
+    .filter((l) => {
+     return l;
+   })
    .map((line) => {

But then you get a VTT with "one line" that contains the whole text:


WEBVTT

1
00:00:01.410 --> 00:20:17.438
<v James Jacoby>So tell  me, let’s start at the beginning. How’d you get to Facebook in the beginning? So I joined the company in the late summer of 2005. At the time, I was an independent designer and developer working in San Francisco.
...
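
For contrast, a minimal sketch of what one cue per paragraph could look like, assuming each paragraph already carries its own speaker, start, and end times. The `paragraphs` shape and the helpers `toVttTimestamp` and `paragraphsToVtt` are hypothetical and not the library's API; only the `WEBVTT` header, the `-->` timing line, and the `<v Speaker>` voice tag come from the WebVTT format itself.

```javascript
// Format seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function toVttTimestamp(seconds) {
  const ms = Math.round(seconds * 1000);
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h)}:${pad(m)}:${pad(s)}.${pad(ms % 1000, 3)}`;
}

// Hypothetical exporter: skip blank paragraphs, emit one numbered cue per paragraph.
function paragraphsToVtt(paragraphs) {
  const cues = paragraphs
    .filter((p) => p.text.trim() !== '')
    .map(
      (p, i) =>
        `${i + 1}\n${toVttTimestamp(p.start)} --> ${toVttTimestamp(p.end)}\n<v ${p.speaker}>${p.text}`
    );
  return ['WEBVTT', ...cues].join('\n\n') + '\n';
}

// Made-up sample input, including a blank paragraph that should be dropped.
const sample = [
  { speaker: 'A', text: 'Hi', start: 0, end: 1 },
  { speaker: 'B', text: '  ', start: 1, end: 1 },
  { speaker: 'B', text: 'Yo', start: 1.5, end: 2 },
];
console.log(paragraphsToVtt(sample));
```

Because blank paragraphs are filtered out before numbering, the cue numbers stay consecutive and each remaining paragraph keeps its own timing.
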
pietrop commented 3 years ago

I tried an alternative in https://github.com/pietrop/slate-transcript-editor/pull/49; see what you think.

jshearer commented 3 years ago

Aha, I see what you're saying about not respecting the paragraph splits -- I only tested that this fix stops the bug, not that the proper functionality is there. Good catch, 👀 at #49 now

jshearer commented 3 years ago

Closed in favor of #49