niedev / RTranslator

Open source real-time translation app for Android that runs locally
Apache License 2.0
6.87k stars 519 forks

streaming transcription or push-to-talk #49

Open khimaros opened 5 months ago

khimaros commented 5 months ago

Is your feature request related to a problem? Please describe.

transcription in walkie talkie mode does not work reliably in noisy environments. even with adjustments to microphone sensitivity, it never stops listening for input, which means translation never begins.

Describe the solution you'd like

either offer a user control for when to start translating the buffer, or switch to a streaming mechanism so that input doesn't need to end before translation starts.

niedev commented 4 months ago

Hi, a streaming mechanism would be nearly impossible with the current models. I was already thinking about adding a way to choose between automatic and manual listening, together with the new GUI in RTranslator 2.1, but that won't be very soon (probably 2 or 3 months). Before that, though, I could make muting the microphone stop listening while still producing a transcription and a translation.
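
To make the proposed mute behavior concrete, here is a minimal sketch of the idea. It is not RTranslator's actual code. The class `ListeningSession`, the `on_utterance` callback, the loudness threshold, and the silent-chunk count are all hypothetical names and values chosen for illustration. Each audio chunk is reduced to a single loudness number. In automatic mode an utterance ends only after enough quiet chunks, so constant background noise keeps the buffer open forever. Muting forces the buffer to be handed off anyway.

```python
from typing import Callable, List


class ListeningSession:
    """Buffers audio chunks and decides when to hand them to the recognizer.

    Hypothetical illustration only: each chunk is represented by one loudness value.
    """

    def __init__(
        self,
        on_utterance: Callable[[List[float]], None],  # assumed callback: transcribe + translate
        threshold: float = 0.1,       # assumed loudness above which a chunk counts as speech
        max_silent_chunks: int = 3,   # assumed number of quiet chunks that ends an utterance
    ) -> None:
        self.on_utterance = on_utterance
        self.threshold = threshold
        self.max_silent_chunks = max_silent_chunks
        self._buffer: List[float] = []
        self._silent = 0
        self._muted = False

    def feed(self, level: float) -> None:
        """Process one chunk in automatic mode."""
        if self._muted:
            return  # microphone is muted: ignore incoming audio
        if level >= self.threshold:
            self._buffer.append(level)
            self._silent = 0
        elif self._buffer:
            # Quiet chunk after speech: count it toward the end of the utterance.
            self._silent += 1
            if self._silent >= self.max_silent_chunks:
                self._flush()

    def mute(self) -> None:
        """Stop listening, but still process whatever was buffered so far."""
        self._muted = True
        self._flush()

    def unmute(self) -> None:
        self._muted = False

    def _flush(self) -> None:
        if self._buffer:
            chunks, self._buffer = self._buffer, []
            self._silent = 0
            self.on_utterance(chunks)


if __name__ == "__main__":
    session = ListeningSession(lambda chunks: print("utterance of", len(chunks), "chunks"))
    for level in [0.5, 0.6, 0.4, 0.5]:  # noisy room: loudness never drops below the threshold
        session.feed(level)
    session.mute()  # the user ends the utterance manually
```

In the noisy-room loop the session never sees a quiet chunk, so nothing is emitted until `mute()` is called. A push-to-talk control could reuse the same flush on button release.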

khimaros commented 4 months ago

utilizing the mute button as you describe would solve the problem well enough for now!

niedev commented 4 months ago

The new release with this change is out! 🚀 Let me know how it works

khimaros commented 3 months ago

it is working well, thank you!

i wonder if you're aware of this: https://k2-fsa.github.io/sherpa/onnx/android/apk.html

niedev commented 3 months ago

@khimaros I didn't know about it, I'll take a look, thanks!

niedev commented 2 months ago

@khimaros the new release now has the option to use push to talk 🚀