ccoreilly / vosk-browser

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
Apache License 2.0

Suggestion: Editing text, interactive clickable transcript, export functions #19

Closed. candideu closed this issue 2 years ago

candideu commented 3 years ago

Hello!

Just wanted to say that I love what you're doing here! ❤️ The demo is amazing, and I can't wait to see how this project pans out.

A while ago, I proposed creating a FLOSS version of Otter.ai and Sonix over at Open Source Ideas: https://github.com/open-source-ideas/open-source-ideas/issues/288

I'm not sure if this is what you're envisioning for this project, but it would be interesting to be able to play back the audio with the playback synchronized to the transcript. Clicking on a word could also jump the playback to that point (see Demo #6 of AblePlayer). Additionally, the ability to edit and export the text would be helpful for people who use transcriptions (for closed captioning, research, journalism, meeting minutes, etc.).
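To make the idea concrete, here is a minimal sketch of the lookup logic behind a clickable, playback-synced transcript. It assumes word-level timings of the kind Vosk can emit (a word plus start and end times in seconds). The `TimedWord` shape and both helper names are hypothetical and not part of vosk-browser.

```typescript
// Assumed shape of one recognized word; times are in seconds.
interface TimedWord {
  word: string;
  start: number;
  end: number;
}

// Hypothetical helper: index of the last word that has started at or before
// `time`, or -1 if playback is before the first word.
// `words` must be sorted by start time (binary search).
function activeWordIndex(words: TimedWord[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (words[mid].start <= time) {
      found = mid; // candidate; look for a later word that has also started
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// Hypothetical helper: playback position to jump to when a word is clicked.
function seekTimeForWord(words: TimedWord[], index: number): number {
  return words[index].start;
}
```

In a page, one could call `activeWordIndex(words, audio.currentTime)` from the audio element's `timeupdate` event to highlight the current word, and set `audio.currentTime = seekTimeForWord(words, i)` in a click handler on word `i`.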

ccoreilly commented 3 years ago

Hi, glad you like the project :)

This project aims only to be a library that wraps a wasm build of Vosk, and the demo just shows what can be done with it, so I won't be adding such functionality to the library itself.

I have thought about integrating vosk-browser transcription into oTranscribe, which I guess would achieve what you want.

I currently have no time for that, but maybe someone can pick this up. It would be really cool.

candideu commented 1 year ago

Hi! Looks like someone (@gullabi) built a fork of oTranscribe that integrates Vosk Browser:

https://github.com/oTranscribe/oTranscribe/issues/107#issuecomment-1289004631

Concerning this issue, and as @candideu suggested, we have developed our own fork to implement this functionality; it can be found here.

In a nutshell, we have integrated the vosk-browser functionality into oTranscribe and added an automated timestamp feature. It can transcribe any file loaded from the file system, inserting a timestamp for each minute. Since vosk-browser works offline, no file is sent anywhere and everything is done with the resources of the local machine.
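As a rough illustration of per-minute timestamping (not the fork's actual code), the sketch below groups timed words into lines and prefixes each line with a marker for the minute in which its first word starts. The `TimedWord` shape, the function names, and the `[MM:SS]` marker format are all assumptions made for this example.

```typescript
// Assumed shape of one recognized word; times are in seconds.
interface TimedWord {
  word: string;
  start: number;
  end: number;
}

// Zero-pad a number to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format whole seconds as MM:SS, or H:MM:SS from one hour onward.
function formatTimestamp(seconds: number): string {
  const total = Math.floor(seconds);
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  return h > 0 ? `${h}:${pad2(m)}:${pad2(s)}` : `${pad2(m)}:${pad2(s)}`;
}

// Hypothetical helper: start a new line, prefixed with a timestamp marker,
// whenever a word begins in a different interval than the previous word.
// Intervals that contain no words produce no line.
function transcriptWithTimestamps(
  words: TimedWord[],
  intervalSeconds: number = 60
): string {
  const lines: string[] = [];
  let current: string[] = [];
  let currentBucket = -1;
  for (const w of words) {
    const bucket = Math.floor(w.start / intervalSeconds);
    if (bucket !== currentBucket) {
      if (current.length > 0) lines.push(current.join(" "));
      current = [`[${formatTimestamp(bucket * intervalSeconds)}]`];
      currentBucket = bucket;
    }
    current.push(w.word);
  }
  if (current.length > 0) lines.push(current.join(" "));
  return lines.join("\n");
}
```

Passing a different `intervalSeconds` (for example 30) would change the marker spacing without touching the rest of the logic.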

We keep our repository as a fork since our intention is to contribute the changes to the original repo, if the maintainer is interested.

Finally, we will soon provide a publicly available deployed version of the app and a desktop version. In the meantime, we appreciate any help, QA, or suggestions from the community.