pietrop / slate-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster. Using the SlateJs editor.
https://pietrop.github.io/slate-transcript-editor
Other
75 stars 33 forks source link

fix: Creating a new paragraph at the end of an existing one #40

Closed jshearer closed 3 years ago

jshearer commented 3 years ago

Currently, trying to add a paragraph by pressing enter with the cursor at the end of an existing paragraph fails with the following error:

index.js:65 Uncaught (in promise) TypeError: Cannot read property 'start' of undefined
    at Object.handleSplitParagraph (index.js:65)
    at _callee4$ (index.js:458)
    at tryCatch (runtime.js:63)
    at Generator.invoke [as _invoke] (runtime.js:293)
    at Generator.next (runtime.js:118)
    at asyncGeneratorStep (asyncToGenerator.js:3)
    at _next (asyncToGenerator.js:25)
    at asyncToGenerator.js:32
    at new Promise (<anonymous>)
    at asyncToGenerator.js:21
    at handleOnKeyDown (index.js:449)
    at isEventHandled (index.es.js:2309)
    at index.es.js:2025
...

Which is actually this line:

https://github.com/pietrop/slate-transcript-editor/blob/0607306b06732b9b0690abed5a53ecee19037a23/src/components/slate-helpers/handle-split-paragraph/index.js#L65

because there are no words after the cursor in the current paragraph.

Describe what the PR does
I figured that a decent solution would be to just take the end time of the most recent previous word. Thooghts?

State whether the PR is ready for review or whether it needs extra work

Ready!

pietrop commented 3 years ago

Good catch, thank you soo much for trying this out @jshearer ! 🙌

I am sure this happens. But I can't reproduce in the storybook version at https://pietropassarelli.com/slate-transcript-editor so a bit more details context around the example when that happens would be useful to narrow it down/reproduce.

oh, Is this issue on 0.1.2 or on 0.1.2-alpha.5?

I am leaning towards not allowing to hit enter at the end of a paragraph. Coz that would create an empty paragraph and mess up the whole paragraph/block level alignment thing. If that makes sense?

Unless I am missing a scenario where that might be useful?

jshearer commented 3 years ago

Interesting, I can't reproduce it there either... I am actually using the tip of the master branch currently since I didn't see 0.1.2-alpha.5. I'll try alpha.5 and see if I can still reproduce it

pietrop commented 3 years ago

I was able to repelicate the issue by chance

   "slate-transcript-editor": "^0.1.2-alpha.6",

Used in pietrop/digital-paper-edit-firebase with GCP STT in Italian language (not that, that's relevant for the issue, but just for context in case it turns out that it is)

Screen Shot 2021-02-23 at 11 02 13 PM
pietrop commented 3 years ago

one thing I noticed in firebase/firestore is that my last paragraph does not have a startime.

Screen Shot 2021-02-23 at 11 08 40 PM

@jshearer are you using GCP STT as well with pietrop/gcp-to-dpe adapter?

pietrop commented 3 years ago

ok try with 0.1.2-alpha.7 Might have a fix in https://github.com/pietrop/slate-transcript-editor/commit/04b135ef67d3910052f994ec57e6b17027d0cac3 by adding a check for when cursor is at the end of paragraph and preventing creating a new empty line/paragraph.

It doesn't fix this issue https://github.com/pietrop/slate-transcript-editor/pull/40#issuecomment-784741595 but that might be a problem with the adapter. Looking into it.

pietrop commented 3 years ago

This might fix my problem https://github.com/pietrop/gcp-to-dpe/pull/7 but not 100% sure fixes the overall probl.

pietrop commented 3 years ago

Hopefully this is more substantial to address this issue https://github.com/pietrop/slate-transcript-editor/pull/42 by refactoring the function to generate previous timings for the words highlighting (via css injection) so that it doesn't rely on actual timings of the words but just does a list of int up to the "current time" this should remove one more possible arbitrary point of failure.

pietrop commented 3 years ago

@jshearer There's a new release slate-transcript-editor@0.1.2-alpha.9 let me know if you get a chance to try it with this one and if it persists and/or whether we can close this issue.

jshearer commented 3 years ago

alpha.9 certainly doesn't throw an uncaught exception any more, though it doesn't allow for the insertion of a blank paragraph.

pietrop commented 3 years ago

👋 @jshearer Thanks for looking into it, and for closing the issue.

Not allowing to insert a blank paragraph is by design. To constraint the edge cases to handle in the attempt to preserve accurate word level timecodes.

But let me know if you have an example of when you'd need to insert a blank paragraph?
(where it wouldn't involve splitting the previous paragraph, for instance, at the last word in the paragraph for example)

This generally assumes that the STT is decent at the very least at recognizing the majority of the utterances and therefore the times for those could be preserved / used as a starting point alignment and restoring timecodes for accurate/corrected word's text. In this scenario splitting a new paragraph, would not have associated utterances with timecodes info. if that makes sense?

Curious whether there might be an edge case I haven't considered yet.