Open 55Cancri opened 1 year ago
You can write a wrapper around the C implementation of Whisper, then run the executable to get JSON and combine the results, or just call the Python CLI.
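For anyone wanting to try the "just call the Python CLI" route from Node, here is a rough sketch. It assumes the `openai-whisper` Python package is installed and its `whisper` command is on the PATH, and that the JSON output contains `segments` with a `words` array when `--word_timestamps True` is passed. The names `transcribeWords`, `flattenWords`, and `WordTiming` are made up for illustration and are not part of any library.

```typescript
import { execFile } from "node:child_process";
import { mkdtemp, readFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { basename, extname, join } from "node:path";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical shapes covering only the fields this sketch reads.
interface WordTiming {
  word: string;
  start: number; // seconds
  end: number; // seconds
}

interface WhisperJson {
  segments: { words?: WordTiming[] }[];
}

// Collect the per-segment word lists into one flat list.
// Whisper's word strings usually carry a leading space, so trim them.
export function flattenWords(result: WhisperJson): WordTiming[] {
  const words: WordTiming[] = [];
  for (const segment of result.segments) {
    for (const w of segment.words ?? []) {
      words.push({ word: w.word.trim(), start: w.start, end: w.end });
    }
  }
  return words;
}

// Run the CLI (assumed to be on PATH) and read back the JSON file it writes.
export async function transcribeWords(
  audioPath: string,
  model = "base",
): Promise<WordTiming[]> {
  const outDir = await mkdtemp(join(tmpdir(), "whisper-"));
  await execFileAsync("whisper", [
    audioPath,
    "--model", model,
    "--word_timestamps", "True",
    "--output_format", "json",
    "--output_dir", outDir,
  ]);
  // The CLI names its output after the input file, with a .json extension.
  const jsonPath = join(outDir, basename(audioPath, extname(audioPath)) + ".json");
  const raw = await readFile(jsonPath, "utf8");
  return flattenWords(JSON.parse(raw) as WhisperJson);
}
```

This still needs Python and the model weights available wherever the Node process runs, so it does not remove the packaging problem; it only keeps everything in one process tree.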
You can run this as a microservice on a Python Lambda and get a JSON response that you can deserialize in JavaScript.
I considered both of your suggestions, but if I call another Lambda from the first Lambda, then the client has to wait for two cold starts instead of one.
A Node version would open this up to a whole new segment of developers and allow word-level timestamps to be generated directly inside a Node Lambda after generating audio with OpenAI's recent TTS API. Having this in JavaScript would also make it easier for JS devs to build audio-text synchronization or "karaoke-style" highlighting in their user interfaces when reading text.
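To make the "karaoke-style" use case concrete, here is a small sketch of the lookup a UI would need once word-level timestamps are available. It assumes the words arrive as a list sorted by start time with non-overlapping `start`/`end` values in seconds. `WordTiming` and `activeWordIndex` are hypothetical names for illustration only.

```typescript
// Hypothetical shape for one timed word (times in seconds).
interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Binary search for the word being spoken at `time`.
// Returns the word's index, or -1 if `time` falls in a gap or outside the audio.
// Assumes `words` is sorted by start time and intervals do not overlap.
export function activeWordIndex(words: WordTiming[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const w = words[mid];
    if (time < w.start) {
      hi = mid - 1;
    } else if (time >= w.end) {
      lo = mid + 1;
    } else {
      return mid; // start <= time < end
    }
  }
  return -1;
}
```

In a browser, this could be called from an audio element's `timeupdate` handler with `audio.currentTime`, highlighting the element for the returned index and clearing the highlight when the result is -1.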