readium / kotlin-toolkit

A toolkit for ebooks, audiobooks and comics written in Kotlin
https://readium.org/kotlin-toolkit
BSD 3-Clause "New" or "Revised" License

[Bug] More than one voice reproduced at a time #554

Open fabiorbap opened 1 month ago

fabiorbap commented 1 month ago

Describe the bug

When I try to play text with TTS on some Samsung devices, an additional voice plays along with the voice I selected. It seems to be the device's default TTS voice rather than the one I selected. While the text is being read, I can see that one of the voices being played is the right one, but Voice[Name: pt-BR-default, locale: por_BRA_default, quality: 300, latency: 300, requiresNetwork: false, features: []] (my device is set to Portuguese) plays at the same time, and it shouldn't.

Also, if I want the TTS to read a book in English, the voice that plays is the device's default voice (Voice[Name: pt-BR-default, locale: por_BRA_default, quality: 300, latency: 300, requiresNetwork: false, features: []]) and not the one I set on the engine.

[Screenshot 2024-08-02 at 16 56 50]

What's weird is that it doesn't happen on all devices. It happens on my Samsung Galaxy S20 FE (Android 13). I also tested on a Samsung A5 with Android 8 (physical device), a Xiaomi Pocophone F1 with Android 10 (physical device) and a Pixel 2 with Android 10 (emulator), and it works properly on all of them.

This started happening when upgrading org.readium.kotlin-toolkit from 2.3.0 to 3.0.0-beta.1.

These are the Readium dependencies in the project:

readium-shared = { module = "org.readium.kotlin-toolkit:readium-shared", version.ref = "readium" }
readium-streamer = { module = "org.readium.kotlin-toolkit:readium-streamer", version.ref = "readium" }
readium-navigator = { module = "org.readium.kotlin-toolkit:readium-navigator", version.ref = "readium" }

At this point, it seems to be specific to certain Samsung devices, and I have not found a way to fix it so far. Please let me know of any information that could help fix this.

How to reproduce?

To reproduce the duplicated voices

  1. Open a book
  2. Activate TTS
  3. Play the book
  4. If it's in English, it will use the device's default TTS voice; if a voice is added programmatically, there will be duplicated voices

Readium version

3.0.0-beta.1

Android API version

13

Additional context

I checked the code, and whenever I pass a voice to the engine, it's always the correct voice. I tried to find a place in the code where I could remove the unwanted voice, or force the correct one, but the unwanted voice always plays as well. This happens when I set a voice for a book that's not in English, for example.

Also, I can see that there are 2 engines (this.engines) related to the TextToSpeech class, one is the Samsung one (incorrect) and the other is the Google one (correct), see screenshot. I imagine that this is related to this problem, but I couldn't find how to remove the Samsung one.

qnga commented 1 month ago

Hi! I'm afraid this will be very hard to debug as we don't have the specific device. The TTS engine wrapper is supposed to use one and only one TextToSpeech instance, associated with a specific TTS engine, at once. Do you have only one navigator instance?

I know you can pick up a specific engine provider when instantiating the TextToSpeech class. Maybe you can try to exclude the Samsung engine this way?

fabiorbap commented 3 weeks ago

@qnga Thanks for the reply!

> Do you have only one navigator instance?

I'm not sure I understood, can you clarify?

> I know you can pick up a specific engine provider when instantiating the TextToSpeech class. Maybe you can try to exclude the Samsung engine this way?

Thank you for the reference! Those links say that the engine should be passed in the constructor, which is what I did: I only pass the Google engine.

val engine = TextToSpeech(context, initListener, "com.google.android.tts")

I also tried removing one of the engines like this, but that didn't work either; it has no effect:

this.engines.let { it.removeAll(it.filter { engine -> engine.name != "com.google.android.tts" }) }

What's also odd is that the TTS only lists the correct voices in this.voices. It doesn't list the voice from the Samsung engine, so there's no way to even try removing the wrong voice.

When I debug it, I can see that there are always 2 engines active, no matter what I do.

qnga commented 3 weeks ago

A navigator instance presumably uses only one TTS engine, determined at construction-time. If two different engines are used, there can be several explanations:

  - That's a framework bug on that specific device.
  - Two different navigator instances with different underlying engines are used simultaneously.

Where does the engines property from your debugging screenshot come from? I'd like to understand where you see two engines exactly.
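
To illustrate the "one engine per navigator instance" point, here is a minimal sketch of a holder that never lets two engine instances be alive at once. It is only an assumption about how an app might guard against this, not how Readium does it internally. `SpeechEngine` and `SingleEngineHolder` are hypothetical names; the interface stands in for the `stop()` and `shutdown()` calls of `android.speech.tts.TextToSpeech` so the sketch runs without the Android SDK.

```kotlin
// Hypothetical stand-in for the two android.speech.tts.TextToSpeech calls used
// below (stop() and shutdown()), so the sketch runs without the Android SDK.
interface SpeechEngine {
    fun stop()
    fun shutdown()
}

// Keeps at most one engine alive: the previous one is stopped and shut down
// before a replacement is created.
class SingleEngineHolder<E : SpeechEngine> {
    var current: E? = null
        private set

    // Releases the current engine (if any), then creates and stores a new one.
    fun replace(create: () -> E): E {
        release()
        return create().also { current = it }
    }

    // Stops playback and frees the current engine, if there is one.
    fun release() {
        current?.let {
            it.stop()
            it.shutdown()
        }
        current = null
    }
}
```

On Android, a thin adapter around `TextToSpeech` could implement the interface by delegating to `TextToSpeech.stop()` and `TextToSpeech.shutdown()`. If two voices are still heard with such a guard in place, a second instance created elsewhere becomes a less likely explanation.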

fabiorbap commented 3 weeks ago

> That's a framework bug on that specific device.

It increasingly looks like this is the cause, because it doesn't happen on all devices, only Samsung ones, which in my experience tend to have more bugs than other manufacturers' devices.

> Two different navigator instances with different underlying engines are used simultaneously.

It may be what the device I tested is doing

> Where does the engines property from your debugging screenshot come from? I'd like to understand where you see two engines exactly.

When I create the TextToSpeech engine, I can access the engines through tts.engines.

With Readium 2.3.0 this wasn't happening, but after upgrading to 3.0.0-beta.1 and refactoring the code for the new APIs, it started happening.

qnga commented 2 weeks ago

I see. TextToSpeech.engines lists all the engines available on the device, so seeing more than one entry there should not be an issue in itself. I'm afraid I can't help more.
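
The distinction between "installed" and "in use" can be made explicit with a small diagnostic. This is a minimal sketch, not part of Readium: `EngineReport` and its field names are hypothetical, and on Android the values could be filled from `tts.engines.map { it.name }` and `tts.defaultEngine`.

```kotlin
// Snapshot of what a TextToSpeech instance reports. The names are hypothetical;
// on Android the values could come from tts.engines.map { it.name } and
// tts.defaultEngine.
data class EngineReport(
    val installed: List<String>,   // every engine installed on the device
    val defaultEngine: String?,    // the user's system-wide default engine
    val requested: String,         // the engine package passed to the constructor
) {
    // True when the requested engine is actually present on the device.
    val requestedIsInstalled: Boolean get() = requested in installed

    // True when the requested engine is also the system default.
    val requestedIsDefault: Boolean get() = requested == defaultEngine

    // Three-line, log-friendly description of the report.
    fun summary(): String = buildString {
        appendLine("installed engines: ${installed.joinToString()}")
        appendLine("system default:    ${defaultEngine ?: "none"}")
        append("requested '$requested': installed=$requestedIsInstalled, default=$requestedIsDefault")
    }
}
```

Logging `summary()` once after initialization would show whether the requested engine is installed at all. Two entries under "installed engines" only means two engines exist on the device, not that both are speaking. As far as I know, the framework falls back to the default engine when the requested one is missing, so `requestedIsInstalled == false` would be worth checking first.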