pietrop / digital-paper-edit-electron

Work in progress - digital paper edit project - Electron, Cross Platform Desktop app - Mac, Windows, Linux
https://pietropassarelli.net/autoedit

Adding support for Google STT #49

Open pietrop opened 4 years ago

pietrop commented 4 years ago

As requested by @will0225, it would be great to add support for Google STT.

Disclaimer

I personally don't have a need for Google STT at the moment (quite happy with the other STT options), but I'm happy to provide guidance if anyone else wants to have a go and make a PR, as it would open up options for adding more languages etc.

Previous attempt in autoEdit 2

In autoEdit V2 I made an attempt: https://github.com/OpenNewsLabs/autoEdit_2/issues/40#issuecomment-362235957 and PR https://github.com/OpenNewsLabs/autoEdit_2/pull/97

Possible Issues

But I ran into a few issues.


Before getting started

There are a few things to figure out / investigate:

  1. Can you get the Google STT Node SDK to work inside Electron? To try this, you can either work in a fork of this repo or build a simple demo project with Electron and the Google STT Node SDK to test it out.
  2. Decide how to handle the Google Cloud Storage logic (see comment above).
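For point 1, a minimal feasibility check could look like the sketch below. It assumes the `@google-cloud/speech` package is installed, that credentials are picked up from the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, and that the test clip is a short WAV file; the function names `buildRequest` and `smokeTest` are made up for illustration.

```javascript
// Build a synchronous recognize request for a short (under 1 minute) WAV clip.
// sampleRateHertz is left out on the assumption that the API reads it from the WAV header.
function buildRequest(audioBuffer, languageCode = 'en-US') {
  return {
    config: { encoding: 'LINEAR16', languageCode },
    audio: { content: audioBuffer.toString('base64') },
  };
}

// `client` is injected so the function can be exercised without the SDK.
// With the real SDK it would be `new speech.SpeechClient()`.
async function smokeTest(client, audioBuffer) {
  const [response] = await client.recognize(buildRequest(audioBuffer));
  return response.results
    .map((result) => result.alternatives[0].transcript)
    .join('\n');
}

// Hypothetical usage from the Electron main process:
// const fs = require('fs');
// const speech = require('@google-cloud/speech');
// smokeTest(new speech.SpeechClient(), fs.readFileSync('short-clip.wav')).then(console.log);
```

If this prints a transcript when run from the Electron main process, the SDK itself is probably not the blocker and the remaining question is the Cloud Storage logic.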

To give it a go

You can look at how it has been done for AssemblyAI.

Dev setup

First things first, get the app running locally; see README#setup.

Add it as an option in the transcriber module

/src/ElectronWrapper/lib/transcriber

In the index.js file, find the switch statement at /src/ElectronWrapper/lib/transcriber/index.js#L54, look at the case for AssemblyAI, and create a similar one for GoogleSTT.
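As a rough illustration of the shape of that change (not the actual contents of index.js), a provider dispatch with an added GoogleSTT case might look like this. The provider strings, the `transcribers` object, and its `assemblyai` / `googleStt` functions are all hypothetical names.

```javascript
// Hypothetical provider dispatch; identifiers are illustrative and
// do not mirror the real index.js.
async function transcribe(provider, audioPath, transcribers) {
  switch (provider) {
    case 'AssemblyAI':
      return transcribers.assemblyai(audioPath);
    case 'GoogleSTT':
      // New case, modelled on the AssemblyAI one.
      return transcribers.googleStt(audioPath);
    default:
      throw new Error(`Unsupported STT provider: ${provider}`);
  }
}
```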

Create a GCP STT module in the transcriber module

Back in the transcriber module, create a folder/module for Google Cloud STT, e.g. google-stt, similar to the AssemblyAI one. This module will use the Google Cloud Node STT SDK to talk to Google, and will have a module to convert the result into the DPE format used by autoEdit. (You can use the gcp-to-dpe module to do the conversion.)
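To give a feel for what the conversion step involves, here is a hand-rolled stand-in, not the gcp-to-dpe module itself. It assumes the recognition request was made with `enableWordTimeOffsets: true`, so each word carries `startTime` / `endTime` durations, and it assumes a flat word list of `{ id, start, end, text }` as the target shape; check the real DPE format before relying on it.

```javascript
// Convert a protobuf Duration ({ seconds, nanos }) to seconds as a float.
// `seconds` may arrive as a number, a string, or a Long-like object,
// so it is normalised through String() first.
function durationToSeconds(duration) {
  const d = duration || {};
  const seconds = Number(String(d.seconds || 0));
  const nanos = Number(d.nanos || 0);
  return seconds + nanos / 1e9;
}

// Flatten the best alternative of every result into one word list.
// The output shape is an assumption about the DPE format.
function gcpToWords(response) {
  const words = [];
  for (const result of response.results || []) {
    const best = (result.alternatives || [])[0];
    if (!best) continue;
    for (const w of best.words || []) {
      words.push({
        id: words.length,
        start: durationToSeconds(w.startTime),
        end: durationToSeconds(w.endTime),
        text: w.word,
      });
    }
  }
  return words;
}
```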

Credentials

You'll notice that the AssemblyAI transcriber module requires another module to get the credentials: src/ElectronWrapper/lib/transcriber/assemblyai/index.js#L3.

This means you'll need to modify that credentials module to support GCP STT as well: src/stt-settings/credentials.js#L88
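One difference from an API-key provider is that GCP credentials usually come as a service account JSON key file. A possible sketch for validating such a key and turning it into client options is below; `parseGcpCredentials` is a hypothetical helper, not something that exists in credentials.js, and how the key is stored in the app is left open.

```javascript
// Parse the text of a GCP service account key file and return options
// that could be passed to `new speech.SpeechClient(options)`.
// Hypothetical helper; storage and retrieval of the key are out of scope.
function parseGcpCredentials(jsonText) {
  const key = JSON.parse(jsonText);
  if (key.type !== 'service_account' || !key.client_email || !key.private_key) {
    throw new Error('Not a valid GCP service account key');
  }
  return {
    projectId: key.project_id,
    credentials: {
      client_email: key.client_email,
      private_key: key.private_key,
    },
  };
}
```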

Credentials UI

Now we need to change the UI of the settings window to add the option for GCP STT, and allow the user to both add credentials and choose the language. This corresponds to the initial setup page in the user manual.

Which means modifying these two React components

Apologies that the window view is all in one file and not modularized, but it was for ease of development, as it would have been laborious to add another bundling step for the settings window etc.

Anyway, I can provide more info on how to modify those components if needed once/if you get to it; this seems plenty for now.

pietrop commented 3 years ago

Some good news: this seems to be a new thing.

You can retrieve the results of the operation using the google.longrunning.Operations method. Results remain available for retrieval for 5 days (120 hours). Audio content can be sent directly to Speech-to-Text from a local file, or the API can process audio content stored in Cloud Storage. Audio files longer than 1 minute must be stored in a Cloud Storage bucket in order to be transcribed by Speech-to-Text. Performing asynchronous speech recognition on a local file longer than 1 minute will result in either an error or an incomplete transcription.

Transcribing long audio files using a local file

So in theory, with this in mind, it could be possible to integrate with the Electron app after all, although I'd need to revisit how you get and add credentials from GCP STT to run this locally.
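A sketch of what trying this from a local file could look like with the Node SDK's `longRunningRecognize` call is below. It assumes a WAV or FLAC input (so the encoding can be left to the file header), and `transcribeLocalFile` is a made-up name. Note that the docs quoted above still warn about local files longer than 1 minute, so this is something to test rather than rely on.

```javascript
const fs = require('fs');

// Send a local audio file inline (base64) and wait for the long-running
// operation to finish. `client` is injected; with the real SDK it would be
// `new speech.SpeechClient()` from `@google-cloud/speech`.
async function transcribeLocalFile(client, filePath, languageCode = 'en-US') {
  const request = {
    // Word time offsets are requested so the result can be converted to word timings.
    config: { languageCode, enableWordTimeOffsets: true },
    audio: { content: fs.readFileSync(filePath).toString('base64') },
  };
  const [operation] = await client.longRunningRecognize(request);
  const [response] = await operation.promise();
  return response;
}
```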

pietrop commented 3 years ago

https://github.com/OpenNewsLabs/autoEdit_2/pull/97