pietrop / slate-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster. Using the SlateJs editor.
https://pietrop.github.io/slate-transcript-editor

Add word highlights #21

Closed alexandrebrilhante closed 3 years ago

alexandrebrilhante commented 3 years ago

@bbc/react-transcript-editor highlights words as they are pronounced and lets you jump to a word's timestamp by double-clicking on it. Would it be possible to add this feature? Right now it jumps to the beginning of the paragraph.

pietrop commented 3 years ago

Thanks @brilhana 👋 Provided there's a way to introduce this feature without introducing performance issues (see https://github.com/bbc/react-transcript-editor/issues/150), a PR would be more than welcome.

As much as I'd prefer the highlights to be at word level, at the moment it is not very high on my priority list for this component, as I haven't had any users complaining about paragraph level vs word level, e.g. in autoEdit.io or in pietrop/digital-paper-edit-firebase.

If you or anyone else is interested in taking this on, I am happy to provide guidance and advice for a PR.

Javeed-JargonHandlers commented 3 years ago

Hi @pietrop

I would like to integrate word-level highlighting. Could you please guide us on integrating this?

pietrop commented 3 years ago

👋 Hi @Javeed-JargonHandlers! Absolutely, how familiar are you with slateJs?

One way could be to render the words as leaves in Slate, but this has been done by others and is not very performant for transcripts over 20 min/1 hour.

But maybe there's a way to only turn this on for the current paragraph (maybe using currentTime var)?
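
A minimal sketch of that "current paragraph only" idea, to make it concrete. It assumes each paragraph and word carries a start time in seconds and that both lists are sorted by start time. The `Word` and `Paragraph` shapes and the function names are hypothetical and not taken from slate-transcript-editor. The point is that only the one paragraph containing `currentTime` would need word-level rendering, and finding it is a cheap binary search rather than a pass over every word.

```typescript
// Hypothetical data shapes; the editor's real data model may differ.
interface Word {
  text: string;
  start: number; // seconds
}

interface Paragraph {
  start: number; // seconds, assumed equal to the start of its first word
  words: Word[];
}

// Binary search: index of the last item whose start is <= currentTime, or -1 if none.
// Assumes `items` is sorted by ascending start time.
function lastStartedIndex<T extends { start: number }>(items: T[], currentTime: number): number {
  let lo = 0;
  let hi = items.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (items[mid].start <= currentTime) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// Locate the paragraph and word that are "active" at currentTime.
// Returns null before the first paragraph starts; wordIndex is -1 if the
// paragraph has started but none of its words have.
function findActiveWord(
  paragraphs: Paragraph[],
  currentTime: number
): { paragraphIndex: number; wordIndex: number } | null {
  const paragraphIndex = lastStartedIndex(paragraphs, currentTime);
  if (paragraphIndex === -1) return null;
  const wordIndex = lastStartedIndex(paragraphs[paragraphIndex].words, currentTime);
  return { paragraphIndex, wordIndex };
}
```

With something like this, the render function could split only the paragraph at `paragraphIndex` into per-word elements and leave every other paragraph as a single text node. Whether that is enough to avoid the performance issues would still need measuring.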

I think the best approach would probably be to do a little research and come up with a list of options, and then go through some pros and cons before deciding which one to try out. Unless you already have one in mind that you'd like to explore?


One crazy idea 💡 You know how in CSS you cannot select words within text inside a div (last I checked) unless it is broken up into some sort of HTML tags like span etc.? But there are the pseudo-selectors before and after, which maybe could be used to select/recreate and overlap all the words before the current one within a paragraph, and e.g. make them a different color. May not work, but maybe worth a shot. From "css class selector to select text inside a div" + example.

RoseVijay commented 3 years ago

Hi @pietrop I use slate-transcript-editor for audio transcription. Could you please help me get the text value on double click, as well as on selecting part of a sentence?

pietrop commented 3 years ago

Hi @RoseVijay What are you trying to do?

There could be a way to do it using getSelectionNodes but it depends on the ultimate goal :)

RoseVijay commented 3 years ago

Hi @pietrop I am trying to copy the text on select or on double click. https://prnt.sc/10b3waf

pietrop commented 3 years ago

Ok, fine. But to do what?

Cmd+C can already give you that text. What are you trying to do with it, if you don't mind me asking?

Do you also need the associated time-codes or just the text?

RoseVijay commented 3 years ago

@pietrop I want to get the exact text on double click or on select. I will fetch the data (the exact selected text) only through mouse actions, not the keyboard, and then perform some actions on it. https://prnt.sc/10b3waf

pietrop commented 3 years ago

What actions? 🙃

For on select, you can use a modified version of getSelectionNodes and pass a callback to it from the parent component to get the selected text programmatically.

For double click, that's easier but will only give you a single word. You can look at the onDoubleClick event and pass a callback from the parent, then use a modified version of getSelectionNodes to give you the text.

If that makes sense?
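
A rough sketch of the double-click part, kept independent of Slate so it is easy to test. It assumes you already have the paragraph's text, the character offset of the click within that text, and a list of timed words that still lines up one-to-one with the whitespace-separated words in the text (which may stop being true once the text has been edited). `TimedWord`, `wordIndexAtOffset` and `handleWordDoubleClick` are hypothetical names, not part of the component's API.

```typescript
// Hypothetical shape for a word with its time-code.
interface TimedWord {
  text: string;
  start: number; // seconds
}

// Index of the whitespace-separated word that contains the character at `offset`,
// or -1 if the offset is out of range or lands on whitespace.
function wordIndexAtOffset(text: string, offset: number): number {
  if (offset < 0 || offset >= text.length || /\s/.test(text[offset])) return -1;
  // Count the words that begin at or before the offset.
  const wordsSoFar = text.slice(0, offset + 1).trim().split(/\s+/);
  return wordsSoFar.length - 1;
}

// Calls `onWord` (a callback passed down from the parent component) with the
// double-clicked word, giving the parent both the text and the time-code.
function handleWordDoubleClick(
  text: string,
  offset: number,
  words: TimedWord[],
  onWord: (word: TimedWord) => void
): void {
  const index = wordIndexAtOffset(text, offset);
  if (index === -1 || index >= words.length) return; // nothing usable under the cursor
  onWord(words[index]);
}
```

In the real component the text and offset would come from the editor's current selection inside the onDoubleClick handler; that wiring is left out here because it depends on how the selection helper is modified.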

pietrop commented 3 years ago

This has been added in the most recent version; see the up-to-date storybook demo.

❌ highlights word as they are pronounced βœ… allows to jump on a word's timestamp by double-clicking on it.

Word-level highlighting is not present because of the trade-off with performance. But I'm open to tracking this as a separate issue, to consider alternatives and options that don't sacrifice performance and potentially re-introduce this feature.

peps32 commented 2 years ago

@brilhana @Javeed-JargonHandlers @bodawen I'm curious... did you make any progress on the "words highlighted at current time / as they're pronounced" feature?

pietrop commented 2 years ago

There has been some progress here as well: https://github.com/pietrop/slate-transcript-editor/issues/75. As mentioned there, I'm happy to re-open this issue if there's enough interest, and I'd also welcome a PR 🥳