mustafaaljadery / lightning-whisper-mlx

An extremely fast implementation of whisper optimized for Apple Silicon using MLX.
https://mustafaaljadery.github.io/lightning-whisper-mlx/
469 stars, 21 forks

Enable kwargs in transcribe function #8

Open vilsonrodrigues opened 2 months ago

vilsonrodrigues commented 2 months ago

Hello Mustafa

Could you add **kwargs to transcribe?

This would give access to the extra parameters of "transcribe_audio", such as temperature, no_speech_threshold, etc.

https://github.com/mustafaaljadery/lightning-whisper-mlx/blob/main/lightning_whisper_mlx/lightning.py#L90
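For illustration, here is a minimal sketch of what forwarding **kwargs could look like. The `Transcriber` class and the `transcribe_audio` stub below are hypothetical stand-ins written for this example, not the library's actual code; the stub only echoes its arguments so the forwarding is visible, and its default values are placeholders.

```python
def transcribe_audio(audio_path, temperature=0.0, no_speech_threshold=0.6,
                     **decode_options):
    """Stand-in for the real transcribe_audio: it only echoes its arguments."""
    return {
        "audio_path": audio_path,
        "temperature": temperature,
        "no_speech_threshold": no_speech_threshold,
        **decode_options,
    }


class Transcriber:
    """Hypothetical wrapper (assumption: not the actual library class)."""

    def __init__(self, batch_size=12):
        self.batch_size = batch_size

    def transcribe(self, audio_path, language=None, **kwargs):
        # Pass any extra keyword arguments straight through, so callers can
        # set temperature, no_speech_threshold, and similar options.
        return transcribe_audio(
            audio_path,
            language=language,
            batch_size=self.batch_size,
            **kwargs,
        )


result = Transcriber().transcribe(
    "audio.wav", temperature=0.2, no_speech_threshold=0.5
)
print(result)
```

With this shape, existing calls that pass only the audio path keep working, and any new option accepted by the underlying function becomes reachable without another signature change.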

fire17 commented 1 week ago

I also reached the same conclusion...

Half related: have you found out how to get a speaker ID? It is important in real-time conversation over a speaker, both to ignore the AI's own output and as the best way to quickly trigger a user interruption that stops the current AI playback.

vilsonrodrigues commented 1 week ago

VAD + speech embedding model + cosine similarity
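A rough sketch of the last step of that pipeline, assuming the VAD has already cut out a speech segment and an embedding model has already turned it into a vector. The vectors, the `is_same_speaker` helper, and the 0.75 threshold are made-up values for illustration; a real threshold would need tuning for the embedding model in use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero embedding as matching nothing
    return dot / (norm_a * norm_b)


def is_same_speaker(segment_embedding, reference_embedding, threshold=0.75):
    """True if the segment is close enough to the reference voice.

    The threshold is an assumed placeholder, not a recommended value.
    """
    return cosine_similarity(segment_embedding, reference_embedding) >= threshold


# Placeholder embeddings; in practice these come from a speech embedding model.
ai_voice = [0.9, 0.1, 0.0]   # reference embedding of the AI's playback voice
segment_a = [0.8, 0.2, 0.1]  # segment that resembles the AI voice
segment_b = [0.0, 0.3, 0.9]  # segment from a different speaker

for name, segment in [("segment_a", segment_a), ("segment_b", segment_b)]:
    if is_same_speaker(segment, ai_voice):
        print(name, "matches the AI voice, ignore it")
    else:
        print(name, "is another speaker, treat it as a user interruption")
```

Comparing against a stored embedding of the AI's own voice covers both cases from the question: matching segments are dropped, and a non-matching segment can stop the current playback.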