jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License
1.59k stars 176 forks

Please create a node version #253

Open 55Cancri opened 1 year ago

55Cancri commented 1 year ago

A Node version would open this up to a whole new segment of developers and allow word-level timestamps to be generated directly inside a Node Lambda after generating audio with OpenAI's recent TTS API. Having this in JavaScript would also make it easier for JS devs to build audio-text synchronization or "karaoke-style" highlighting in their user interfaces when reading text.
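To illustrate the "karaoke-style" use case, here is a minimal sketch of the lookup a UI would need: given word-level timestamps and the current playback time, find which word to highlight. The `WordTiming` shape and the `activeWordIndex` name are assumptions made for this example, not part of stable-ts.

```typescript
// Assumed shape for illustration: one entry per word, times in seconds.
interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Returns the index of the word being spoken at `time`, or -1 if there is
// none (before the first word, after the last, or in a gap between words).
// Assumes `words` is sorted by `start` and that words do not overlap.
function activeWordIndex(words: WordTiming[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const w = words[mid];
    if (time < w.start) {
      hi = mid - 1; // playback is before this word: search the left half
    } else if (time >= w.end) {
      lo = mid + 1; // playback is past this word: search the right half
    } else {
      return mid; // start <= time < end
    }
  }
  return -1;
}
```

In a browser this could be called from an audio element's `timeupdate` handler with `audio.currentTime`, toggling a CSS class on the matching word.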

electro199 commented 1 year ago

You can write a wrapper around the C implementation of Whisper, then run the executable to get JSON and combine the results, or just call the Python CLI.
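A rough sketch of the "just call the Python CLI" route from Node might look like the following. It assumes stable-ts is installed so that the `stable-ts` command is on the `PATH`, and that passing `-o` with a `.json` file name makes the CLI write JSON. The helper names `buildStableTsArgs` and `transcribeWithCli` are hypothetical.

```typescript
import { execFile } from "node:child_process";
import { readFile } from "node:fs/promises";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: arguments for `stable-ts <audio> -o <output.json>`.
function buildStableTsArgs(audioPath: string, jsonPath: string): string[] {
  return [audioPath, "-o", jsonPath];
}

// Runs the CLI as a child process, then reads and parses the JSON it wrote.
// The promise rejects if the command is missing or exits with an error.
async function transcribeWithCli(
  audioPath: string,
  jsonPath: string,
): Promise<unknown> {
  await execFileAsync("stable-ts", buildStableTsArgs(audioPath, jsonPath));
  return JSON.parse(await readFile(jsonPath, "utf8"));
}
```

`execFile` is used rather than `exec` so the file paths are passed as separate arguments and never go through a shell.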

Yunesss commented 1 year ago

You can run this as a microservice on a Python Lambda and get a JSON response that you can deserialize in JavaScript.
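On the JavaScript side, the deserialization step could be sketched roughly as below. It assumes the service returns the transcription result as JSON with a `segments` array whose entries each carry a `words` array of `{ word, start, end }` objects. That shape is an assumption here and should be checked against the actual response. `parseWordTimings` is a hypothetical name.

```typescript
// Assumed shape for illustration: one entry per word, times in seconds.
interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Flattens the assumed `segments[].words[]` structure into one word list.
// `json` would be the body of the HTTP response from the Python service.
function parseWordTimings(json: string): WordTiming[] {
  const data = JSON.parse(json) as {
    segments?: { words?: WordTiming[] }[];
  };
  const out: WordTiming[] = [];
  for (const segment of data.segments ?? []) {
    for (const w of segment.words ?? []) {
      // Copy only the fields the UI needs; extra fields are ignored.
      out.push({ word: w.word, start: w.start, end: w.end });
    }
  }
  return out;
}
```

Missing `segments` or `words` fields yield an empty list rather than an exception, so a response with no word-level data degrades quietly.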

55Cancri commented 1 year ago

I considered both of your suggestions, but if I call another Lambda from the first Lambda, then the client has to wait for two cold starts instead of one.