niedev / RTranslator

Open source real-time translation app for Android that runs locally
Apache License 2.0
4.96k stars 367 forks

Not real-time #27

Closed: wall3001 closed 4 days ago

wall3001 commented 6 days ago

This is not real-time translation, just sentence-by-sentence translation that is triggered when the VAD (voice activity detection) detects the end of speech.

niedev commented 6 days ago

If you have a powerful phone and use Conversation mode with Bluetooth headphones, you can speak freely as if the translation were in real time.

It is true that the translation occurs in pieces, but while one piece of the conversation is being recognized and translated you can safely continue speaking, and what you say next will be translated immediately afterwards (a queue is used). The same goes for the other interlocutor, so in the end it's like having a real-time translation, but with a delay.
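To illustrate the queueing idea described above, here is a minimal Python sketch. It is not RTranslator's actual code: `fake_translate` and `run_pipeline` are hypothetical names, and the "translation" is a placeholder that just upper-cases the text. It only shows how new speech pieces can be enqueued while earlier ones are still being processed, and how the order is preserved.

```python
import queue
import threading


def fake_translate(segment: str) -> str:
    # Hypothetical stand-in for speech recognition + translation.
    return segment.upper()


def run_pipeline(segments):
    pending = queue.Queue()  # unbounded, so put() never waits for the worker
    results = []

    def worker():
        # Process pieces strictly in the order they were spoken.
        while True:
            segment = pending.get()
            if segment is None:  # sentinel: no more speech
                break
            results.append(fake_translate(segment))

    thread = threading.Thread(target=worker)
    thread.start()
    for segment in segments:
        pending.put(segment)  # the speaker keeps talking meanwhile
    pending.put(None)
    thread.join()
    return results


translated = run_pipeline(["hello", "how are you", "see you later"])
```

Each piece comes out later than it was spoken, but nothing is lost and nothing is reordered, which is the "real-time with a delay" behaviour.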

If you want to reduce the delay, you can lower the end-of-voice timeout in the app settings. That way, shorter moments of silence will be enough to split the translation (although the quality will be reduced).
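The effect of that timeout can be sketched as follows. This is only an illustration under simplifying assumptions, not the app's implementation: audio is modelled as a list of frames already labelled voiced (1) or silent (0), and `split_on_silence`, `frame_ms` and `end_of_voice_timeout_ms` are hypothetical names.

```python
def split_on_silence(frames, frame_ms, end_of_voice_timeout_ms):
    """Group voiced frame indices into segments, closing a segment once
    silence has lasted at least end_of_voice_timeout_ms."""
    # Number of consecutive silent frames needed to end a segment (ceiling).
    needed = max(1, -(-end_of_voice_timeout_ms // frame_ms))
    segments, current, silent_run = [], [], 0
    for i, voiced in enumerate(frames):
        if voiced:
            current.append(i)
            silent_run = 0
        elif current:
            silent_run += 1
            if silent_run >= needed:
                segments.append(current)
                current, silent_run = [], 0
    if current:  # speech still open when the audio ends
        segments.append(current)
    return segments


# 1 = voice, 0 = silence, 100 ms per frame
frames = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1]
long_timeout = split_on_silence(frames, 100, 500)   # [[0, 1, 4, 5], [11]]
short_timeout = split_on_silence(frames, 100, 200)  # [[0, 1], [4, 5], [11]]
```

With the shorter timeout the same audio is cut into more, smaller pieces. Each piece can be sent for translation sooner, but it carries less context, which is where the loss of quality comes from.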

hugo4004 commented 5 days ago

Walkie Talkie Mode's automatic microphone detection is not as friendly or accurate in determining the end of a sentence during a conversation. Can the UI be modified to be similar to the conversation mode in Google Translate, where the microphone icon can be clicked and held until the current conversation is finished?

niedev commented 4 days ago

I will think about that; maybe I could implement a way to choose between keeping the automatic detection and controlling the microphone manually.

However, for now I suggest you try tweaking the RTranslator settings to better calibrate the automatic detection. There are settings for the duration of silence required before listening ends, the sensitivity of the microphone, and how much audio recorded just before listening is activated gets included.
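As a rough illustration of how a sensitivity threshold and a pre-activation recording window can interact, here is a small Python sketch. It is an assumption-laden toy, not RTranslator's code: audio is reduced to one loudness number per frame, and `capture_with_preroll`, `threshold` and `preroll_frames` are hypothetical names.

```python
from collections import deque


def capture_with_preroll(levels, threshold, preroll_frames):
    """Return the frames that would be kept: the pre-roll held in a ring
    buffer plus everything from the first frame at or above threshold."""
    preroll = deque(maxlen=preroll_frames)  # keeps only the most recent frames
    captured = []
    listening = False
    for level in levels:
        if not listening and level >= threshold:
            listening = True
            captured.extend(preroll)  # include audio from just before activation
        if listening:
            captured.append(level)
        else:
            preroll.append(level)
    return captured


levels = [1, 2, 3, 9, 8, 7]
with_preroll = capture_with_preroll(levels, threshold=5, preroll_frames=2)
without_preroll = capture_with_preroll(levels, threshold=5, preroll_frames=0)
```

In this toy model a lower threshold makes listening start on quieter sounds, and a longer pre-roll keeps more of the audio from just before activation, so the quiet start of a word is less likely to be cut off.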