music-assistant / hass-music-assistant

Turn your Home Assistant instance into a jukebox, hassle free streaming of your favorite media to Home Assistant media players.
Apache License 2.0

TTS - time limit. #2382

Closed: senderbell closed this issue 3 months ago

senderbell commented 4 months ago

What version of Music Assistant has the issue?

2.1.0b1

What version of the Home Assistant Integration have you got installed?

2024.5.2

Have you tried everything in the Troubleshooting FAQ and reviewed the Open and Closed Issues and Discussions to resolve this yourself?

The problem

TTS is limited to about 10 seconds. All longer spoken texts are cut off. The same TTS played directly on the entity, without Music Assistant as an intermediary, is spoken in full. I set up an integration with OpenAI Conversation, so the length of the TTS output now matters. I tried both the Google Cloud TTS and Google Translate TTS services.

How to reproduce

Just play a long TTS on the Music Assistant entity

Music Providers

n/a

Player Providers

UPnP/DLNA Player provider, Google chromecast player.

Full log output

No response

Additional information

No response

What version of Home Assistant Core are you running?

2024.5.2

What type of installation are you running?

Home Assistant OS

On what type of hardware are you running?

Linux

OzGav commented 3 months ago

No, I just sent a 45-second TTS message via the mini media player card to a piCorePlayer. I pasted in a paragraph of text and it played fine. So something else is going on here. Try taking your text and playing it as I have.

senderbell commented 3 months ago

I found the cause. In my case, it is enough to turn off "Pre-announce TTS announcements" on the speaker. When it is turned on, the text is cut off; when it is turned off, the text plays to the end.

OzGav commented 3 months ago

Ok let me try that tomorrow.

OzGav commented 3 months ago

Just tried this and confirmed I have the pre-announcement chime turned on. Playback to slimproto and chromecast devices worked fine. Please supply exact details as to how you are initiating the TTS playback.

senderbell commented 3 months ago

I use Google Chromecast devices, like the Google Nest Mini and JBL Link Music. It doesn't matter whether they are in a group or not. TTS is triggered through Developer Tools, using the Text-to-Speech (TTS) service "Say a TTS message with google_cloud". Previously I also tried Google Translate TTS, with the same result. Perhaps the problem lies in the language. During my attempts, I noticed that commas shorten the TTS duration, and removing them extends it. I know it doesn't make sense, but that's how it was. Speeding up the speech in TTS also helps: the faster it speaks, the more of the text it gets through.

OzGav commented 3 months ago

It would help if you supplied the exact call so I can try to recreate it. I just did this with a large block of text that played successfully for about 15 seconds:

service: tts.cloud_say
data:
    entity_id: media_player.ma_kitchen_speaker
    message: >-
        This is a large test…..

I then did it again and scattered eight commas throughout and that just made it even longer due to the pauses. This was sent to a Google Home speaker.

senderbell commented 3 months ago

When I get home, I will try with a long text in English and let you know. I see that you are using tts.cloud_say, which is probably only available through the HA cloud. I am using Google's TTS; maybe that's the problem.

OzGav commented 3 months ago

tts.google_say also worked for me

senderbell commented 3 months ago

I tried again and have new conclusions. I enabled TTS announce and played TTS on the chromecast speaker. After calling the service, the speaker turns on, then there's the announce chime, and only then does it start playing the text. In this case, TTS would cut off after 28 seconds. The next time, I manually turned on the speaker first, then called the TTS service. This time, the text was played until the end, but after 28 seconds, the speaker volume decreased. It played the end of the text more quietly. I tried this several times, and it was the same every time. After disabling TTS announce, all the problems disappear. It doesn't matter whether the speaker is turned on before or not. The volume remains constant the whole time.

It looks as if the Music Assistant reserves a certain amount of time for playing TTS, and the additional wait for turning on the speaker and playing extra sounds disrupts this plan. Otherwise, how would it know when to decrease the volume, and why does it do it at all?
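
To make the hypothesis above concrete, here is a small Python model of it. This is only an illustration of the suspected failure mode, not Music Assistant's actual code: the function name, the idea of a timer equal to the speech length that starts when the service is called, and the delay values are all assumptions.

```python
def audible_speech_seconds(speech_len: float, startup_delay: float, chime_len: float) -> float:
    """Seconds of speech heard if a timer of length `speech_len` starts at call time.

    Hypothetical model: the timer is assumed to start when the service is
    called, so speaker power-on time and the chime eat into the budget.
    """
    budget_left = speech_len - startup_delay - chime_len
    # Clamp to [0, speech_len]: never negative, never more than the full message.
    return max(0.0, min(speech_len, budget_left))


# Illustrative values only (not measurements from this issue).
print(audible_speech_seconds(50.0, 0.0, 0.0))   # no extra delays: the full message is heard
print(audible_speech_seconds(50.0, 10.0, 5.0))  # power-on and chime shorten what is heard
```

If something like this were happening, disabling the pre-announce chime or turning the speaker on beforehand would lengthen the audible part, which is consistent with what I observed.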

service: tts.google_cloud_say
data:
  entity_id: media_player.mi_smart_speaker9239_2
  message: >-
    Long text (about 50 sec.)

OzGav commented 3 months ago

Announcements are subject to a number of settings for each player. The current queue is always paused. Then, by default, the announcement is played at a higher volume than the music that was playing. So the volume needs to be returned to its original value once the TTS has finished.

MA knows (or should know) when to reduce the volume as it is controlling the TTS playback.
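
The sequence described above can be sketched roughly as follows. This is a simplified illustration using a stand-in player class, not the real Music Assistant implementation; `FakePlayer`, `announce`, and all of their parameters are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class FakePlayer:
    """Stand-in for a media player; every name here is hypothetical."""
    volume: int = 10
    log: list = field(default_factory=list)

    def pause_queue(self) -> None:
        self.log.append("pause")

    def set_volume(self, level: int) -> None:
        self.volume = level
        self.log.append(f"volume={level}")

    def play(self, item: str) -> None:
        # A real player would block (or signal) until playback has finished.
        self.log.append(f"play:{item}")


def announce(player: FakePlayer, message: str, announce_volume: int, chime: bool = True) -> None:
    """Pause the queue, raise the volume, play the announcement, restore the volume."""
    previous = player.volume           # the volume from BEFORE the call
    player.pause_queue()
    player.set_volume(announce_volume)
    if chime:
        player.play("chime")
    player.play(message)
    player.set_volume(previous)        # restore only once playback has ended


player = FakePlayer(volume=11)
announce(player, "long TTS text", announce_volume=25)
print(player.volume)
print(player.log)
```

In this sketch the volume is restored when playback completes rather than after a fixed time, so a slow power-on or a chime should not cause the volume to drop while the text is still being spoken.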

I will have to try and recreate what you are describing again as so far it has worked flawlessly for me. I understand you are using Chromecast players and you are targeting the MA entity (not the HA discovered one) in the service call.

senderbell commented 3 months ago

Yes, I am running TTS on a Chromecast speaker, which is an MA entity. I also tried it on a Windows computer via UPnP/DLNA, but the effect was the same. The situation with the volume that I described occurred when the music was not playing at all, all the speakers were turned off, which is why I found it strange that MA reduced the volume even during TTS playback. I have "Absolute volume" set and the minimum TTS volume at 15, but the volume reduction went down to 10. Maybe it's a specific problem related only to my configuration, maybe in the future, it will be possible to find some reasonable point of reference. For now, I have turned off TTS announce and it is fine.

OzGav commented 3 months ago

The volume reduction should be going back to whatever the player was at BEFORE the TTS call started. So the settings you are talking about have no relevance to what the volume returns to. Again I’ll keep trying to recreate this.

OzGav commented 3 months ago

Still can't reproduce this. I am using tts.cloud_say and tts.google_say and playing to a Google Home and a piCorePlayer, and it works perfectly every time. I set both players up with absolute volume as you described and started with the player volume at 11. The text plays well past 28 seconds, all the way to the end, and the player volume adjusts as expected.

How have you setup your tts platforms? Mine is just the following in configuration.yaml:

tts:
  - platform: google_translate
    service_name: google_say

senderbell commented 3 months ago

tts:
  - platform: google_cloud
    key_file: 'tts.json'
    language: "pl-PL"
    voice: "pl-PL-Standard-B"
    encoding: "mp3"
    speed: "0.94"
    profiles:
      - 'telephony-class-application'

I tried different configurations and different languages, even on the standard TTS platform (Google Translate), but without effect. Today, the TTS played through DLNA on a computer with Windows Media Player was cut off the earliest, after about 10 seconds. The Chromecast speaker lasted the longest, up to about 40 seconds. Since no one else is reporting the problem, the cause probably lies somewhere else in my configuration, and it is possible that it is not a Music Assistant issue.

OzGav commented 3 months ago

Yeah, I am trying hard but I can't reproduce it. Can you try stripping the TTS setup back to what I have and give that a go? That is about my last idea at the moment. I think you said it didn't matter if it was English or Polish, so I am short of clues.

senderbell commented 3 months ago

tts:
  - platform: google_translate
    service_name: google_say

service: tts.google_say
data:
  entity_id: media_player.mi_smart_speaker9239_2
  message: >-
    blabla bla bla - long text
  language: pl

The TTS cut off the text after 40 seconds, just like on the previous configuration.

senderbell commented 3 months ago

The speaker simply turns off as if it were counting down time.

OzGav commented 3 months ago

I just tried the above and it worked fine, just with a Polish accent! This will stay open until Marcel has a chance to comment or someone can replicate it.

marcelveldt commented 3 months ago

Please re-test with the 2.0.6 update, I found some bugs in the announcements code.

OzGav commented 3 months ago

@senderbell We will close this soon assuming it is fixed.

senderbell commented 3 months ago

Sorry for the delay. Great! Everything is fine. In my case, the problem has been fixed. Thank you for your work.

rysm83 commented 3 days ago

I seem to be having this same issue.
I'm using Snapcast clients to play with Music Assistant using the following script. If the text passed to {{message}} is longer than 25ish characters, the message will not be played.

action: tts.cloud_say
metadata: {}
data:
  cache: true
  entity_id: media_player.tb310fu_3
  message: "{{message}}"
  language: en-GB
  options:
    voice: RyanNeural