nextcloud / talk-ios

📱😀 Video & audio calls through Nextcloud on iOS
GNU General Public License v3.0

Voice message transcript resets during transcription #1269

Open e-caste opened 1 year ago

e-caste commented 1 year ago

Steps to reproduce

  1. Long press on a voice message
  2. Choose transcribe
  3. Choose a language

Expected behaviour

The transcript is generated in the new view.

Actual behaviour

The transcript is generated in the new view, but resets every sentence or so, in a way that only allows the user to quickly read the contents before they disappear.
(I'm transcribing Italian messages if that's relevant)

Device information

Device: iPhone X

iOS version: 16.5

Talk version: 16.0.2

Server information

Nextcloud version: 26.0.1

Talk version: 16.0.4

Custom Signaling server configured: no

Custom TURN server configured: yes

Custom STUN server configured: yes

SystemKeeper commented 1 year ago

Hm, maybe the behaviour of iOS changed, but we get the information from the system "step by step" and only show what the system delivers to us. It looks like this for me:

https://github.com/nextcloud/talk-ios/assets/1580193/fa10ba3b-a4d7-4048-b17f-dfcdef5b15f9

e-caste commented 1 year ago

@SystemKeeper try with a longer voice message, let's say 30 seconds and multiple sentences

SystemKeeper commented 1 year ago

Ugh.... I see 😅 Thanks for letting us know!

e-caste commented 1 year ago

No problem! Since we're talking about voice message transcription, I'm also curious as to why only "secondary" languages are selectable (my phone is set to English, then Italian, then German, and I can only choose the last two).

SystemKeeper commented 1 year ago

I tried to remember this the other day. So we are using on-device transcription only, that is, no data is sent to Apple's servers. To do that, the device needs to download the information once and can then use it. I am not really sure what triggers the download right now. It had something to do with Siri and dictation, but I'm not sure how it worked. I need to try it myself.
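For readers unfamiliar with the API, here is a minimal sketch of what "on-device transcription only" can look like with Apple's Speech framework. The helper names `canTranscribeOnDevice` and `makeOnDeviceRequest` are made up for illustration and are not taken from the talk-ios code; the sketch only assumes the documented `supportsOnDeviceRecognition` and `requiresOnDeviceRecognition` properties.

```swift
import Foundation
#if canImport(Speech)
import Speech

/// Hypothetical helper: true when a recognizer exists for `locale` and
/// reports that it can work without sending audio to a server.
@available(iOS 13.0, macOS 10.15, *)
func canTranscribeOnDevice(locale: Locale) -> Bool {
    // The initializer is failable: it returns nil for unsupported locales.
    guard let recognizer = SFSpeechRecognizer(locale: locale) else { return false }
    return recognizer.supportsOnDeviceRecognition
}

/// Hypothetical helper: builds a file-based request that must stay on the device.
@available(iOS 13.0, macOS 10.15, *)
func makeOnDeviceRequest(for fileURL: URL) -> SFSpeechURLRecognitionRequest {
    let request = SFSpeechURLRecognitionRequest(url: fileURL)
    // With this flag set, the audio is not sent over the network for recognition.
    request.requiresOnDeviceRecognition = true
    return request
}
#endif
```

Whether a given language reports on-device support depends on what the system has available, which may be related to the language list question above, but that is only a guess.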

SystemKeeper commented 1 year ago

Looks like this is an issue on Apple's side. The API should report the complete recognition, but suddenly the first part is missing, without a way to detect the change properly. The only way I found around this issue is to disable partial updates, so only the full transcription is visible. Not as nice as with partial updates, but at least it is complete.
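A rough sketch of that workaround, assuming the Speech framework's `shouldReportPartialResults` flag is what "disable partial updates" refers to. The function names `makeFinalOnlyRequest` and `transcribe` are hypothetical and this is not the actual talk-ios change; speech recognition authorization is assumed to have been granted already.

```swift
import Foundation
#if canImport(Speech)
import Speech

/// Hypothetical helper: a file-based request that skips intermediate results.
@available(iOS 10.0, macOS 10.15, *)
func makeFinalOnlyRequest(for fileURL: URL) -> SFSpeechURLRecognitionRequest {
    let request = SFSpeechURLRecognitionRequest(url: fileURL)
    // Only the final transcription is delivered, so there is no partial
    // text in the UI that could reset mid-message.
    request.shouldReportPartialResults = false
    return request
}

/// Hypothetical helper: transcribes a voice message and calls `completion`
/// with the full text, or with nil when recognition is unavailable or fails.
@available(iOS 10.0, macOS 10.15, *)
@discardableResult
func transcribe(fileURL: URL,
                locale: Locale,
                completion: @escaping (String?) -> Void) -> SFSpeechRecognitionTask? {
    guard let recognizer = SFSpeechRecognizer(locale: locale), recognizer.isAvailable else {
        completion(nil)
        return nil
    }
    let request = makeFinalOnlyRequest(for: fileURL)
    return recognizer.recognitionTask(with: request) { result, error in
        if let result = result, result.isFinal {
            completion(result.bestTranscription.formattedString)
        } else if error != nil {
            completion(nil)
        }
    }
}
#endif
```

The trade-off is the one described above: the user waits for the whole message to be processed before any text appears.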