pietrop / slate-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster, using the SlateJS editor.
https://pietrop.github.io/slate-transcript-editor

fixes issue #28 - VTT speakers and paragraphs export broken #49

Closed pietrop closed 3 years ago

pietrop commented 3 years ago

Is your Pull Request related to another issue in this repository?

Issue https://github.com/pietrop/slate-transcript-editor/issues/28 PR https://github.com/pietrop/slate-transcript-editor/pull/48

Describe what the PR does

After the recent changes to the alignment logic, this PR passes the Slate value to the export and uses the Slate paragraph blocks to create the subtitles JSON that is passed to the VTT generator. This preserves the text editor's paragraphs in the VTT output.
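
To illustrate the idea, here is a minimal sketch of turning paragraph blocks into subtitle cues and then into VTT text. It is not the code in this PR. The `ParagraphBlock`, `Word`, and `Cue` shapes and the function names `paragraphsToCues`, `cuesToVtt`, and `formatTimestamp` are hypothetical, and the real Slate value in this repository may be structured differently.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface Word { text: string; start: number; end: number; }
interface ParagraphBlock {
  speaker: string;
  start: number; // paragraph start time, in seconds
  children: { text: string; words?: Word[] }[];
}
interface Cue { start: number; end: number; speaker: string; text: string; }

// Format seconds as a WebVTT timestamp, HH:MM:SS.mmm.
function formatTimestamp(seconds: number): string {
  const pad = (n: number, width: number): string => {
    let s = String(n);
    while (s.length < width) s = "0" + s;
    return s;
  };
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// One cue per paragraph block, so editor paragraphs survive the export.
function paragraphsToCues(value: ParagraphBlock[]): Cue[] {
  return value.map((block, i) => {
    const text = block.children.map((c) => c.text).join("").trim();
    const words: Word[] = [];
    for (const child of block.children) {
      if (child.words) words.push(...child.words);
    }
    const next = value[i + 1];
    // End time: last word's end if known, else the next paragraph's start,
    // else fall back to the paragraph's own start.
    const end =
      words.length > 0 ? words[words.length - 1].end
      : next ? next.start
      : block.start;
    return { start: block.start, end, speaker: block.speaker, text };
  });
}

// Serialize cues as WebVTT, using the <v Speaker> voice tag for speakers.
function cuesToVtt(cues: Cue[]): string {
  const body = cues.map(
    (cue, i) =>
      `${i + 1}\n${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}\n<v ${cue.speaker}>${cue.text}`
  );
  return ["WEBVTT", ...body].join("\n\n") + "\n";
}
```

Because each cue takes its start time from the paragraph block itself, edits to the text inside a paragraph do not move the cue unless the paragraph's stored times change.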

State whether the PR is ready for review or whether it needs extra work

Ready for review

Additional context

I could use some help testing/double-checking that this doesn't introduce bugs in the subtitle creation for VTT, e.g. by using the SlateJS value content. Does it create problems if the content has not recently been aligned?

I am thinking no, because the start and end times of paragraphs are generally preserved even if the text in the paragraphs is modified, but I would like a second opinion 😄

jshearer commented 3 years ago

Hey @pietrop, sorry it took me a sec to test this. I just installed and tried this, and the export works as expected where it crashed before.

pietrop commented 3 years ago

I want to double-check some edge cases before merging. For example, if you delete several words inside a paragraph and export as VTT paragraphs, does it give you the right timecode for that paragraph, or the old one from before the deletes? (I'm not sure if it runs a re-alignment before that export.)

If you see what I mean and have a chance to test it out, feel free to give it a go.

jshearer commented 3 years ago

Okay just tested adding some words at the end of a paragraph, and exporting. The VTT paragraph timecodes are different -- went from 0:29 to 0:31, so realignment must have run (though FWIW it was super fast probably thanks to your refactor). That said, the transcript editor itself still says 0:29, so the realignment didn't get applied back to the actual editor. This... seems fine.
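
For anyone repeating this check, a small helper along these lines could flag paragraphs whose editor timecode differs from the exported cue. This is only a sketch: `findTimecodeDrift`, the `Timed` shape, and the 0.5 second default tolerance are all assumptions for illustration, not part of this repository.

```typescript
// Hypothetical shape: anything with a start time in seconds.
interface Timed { start: number; }

interface Drift { index: number; editorStart: number; exportedStart: number; }

// Compare paragraph starts shown in the editor with the starts of the
// exported cues, pairing them by position, and report the ones that
// differ by more than the tolerance.
function findTimecodeDrift(
  editorParagraphs: Timed[],
  exportedCues: Timed[],
  toleranceSeconds: number = 0.5
): Drift[] {
  const drift: Drift[] = [];
  const count = Math.min(editorParagraphs.length, exportedCues.length);
  for (let i = 0; i < count; i++) {
    const editorStart = editorParagraphs[i].start;
    const exportedStart = exportedCues[i].start;
    if (Math.abs(editorStart - exportedStart) > toleranceSeconds) {
      drift.push({ index: i, editorStart, exportedStart });
    }
  }
  return drift;
}
```

With the case described above (editor at 0:29, export at 0:31), the helper would report that paragraph as drifted by two seconds.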

jshearer commented 3 years ago

Hey @pietrop -- how are you feeling about this? I tested it out and it seems to work on our end

pietrop commented 3 years ago

That's great, thanks, and sorry for the delay