music-assistant / hass-music-assistant

Turn your Home Assistant instance into a jukebox, hassle free streaming of your favorite media to Home Assistant media players.
Apache License 2.0

MassPlayMediaOnMediaPlayer intent fails to identify media player target when only area is specified #3139

Closed gubsy420 closed 2 weeks ago

gubsy420 commented 2 weeks ago

What version of Music Assistant has the issue?

2.3.2

What version of the Home Assistant Integration have you got installed?

2024.11.1

Have you tried everything in the Troubleshooting FAQ and reviewed the Open and Closed Issues and Discussions to resolve this yourself?

The problem

The voice assist pipeline fails to play music via Music Assistant when the prompt specifies only an area. Music plays successfully when the friendly name of the target speaker is specified. The assist debugging developer tool corroborates this: when only an area is specified, no targets are shown and match=false, whereas when a player is specified, a target is shown and match=true.

How to reproduce

Prompt voice assistant to "Play {Artist/Track/etc} in {Area Name}"

Music Providers

Music Assistant Spotify YouTube Music Filesystem (remote share)

Player Providers

Snapcast Playergroup

Full log output

No response

Additional information

Tested in a newly created area named "test" where the only entity associated with that area is a mass media player entity, and the only device is the one associated with that mass media player entity. Output of the assist dev tool is shown below.

Assist dev tool debug output when only area is specified:

intent:
  name: MassPlayMediaOnMediaPlayer
slots:
  query: 'the beatles '
  area: test
details:
  query:
    name: query
    value: 'the beatles '
    text: 'the beatles '
  area:
    name: area
    value: Test
    text: test
targets: {}
match: false
sentence_template: >-
  ((play|listen to) {query};(<on> <name>|<local_in> <area>|<on> [the ]{area}
  <player_devices>|<on> [the ]<player_devices> <local_in> <area>))
unmatched_slots: {}
source: custom
file: en/play_media_on_media_player.yaml

Assist dev tool debug output when the only media player in that area is specified:

intent:
  name: MassPlayMediaOnMediaPlayer
slots:
  query: 'the beatles '
  name: house
details:
  query:
    name: query
    value: 'the beatles '
    text: 'the beatles '
  name:
    name: name
    value: House
    text: house
targets:
  media_player.house:
    matched: true
match: true
sentence_template: >-
  ((play|listen to) {query};(<on> <name>|<local_in> <area>|<on> [the ]{area}
  <player_devices>|<on> [the ]<player_devices> <local_in> <area>))
unmatched_slots: {}
source: custom
file: en/play_media_on_media_player.yaml

What version of Home Assistant Core are you running?

2024.10.4

What type of installation are you running?

Home Assistant OS

On what type of hardware are you running?

Raspberry Pi

OzGav commented 2 weeks ago

For some reason no entities are being listed for your area. See the targets key is empty.

gubsy420 commented 2 weeks ago

> For some reason no entities are being listed for your area. See the targets key is empty.

Yes, but it is only empty when the prompt specifies an area. In the example where the prompt specifies a player (one that is in the same area as in the other prompt), the targets are populated as expected.

OzGav commented 2 weeks ago

Yes, I know, but that is the key: the question is why it happens in your setup. If I type “Play Dire Straits in the kitchen” it works fine and I see the full list of entities.

gubsy420 commented 2 weeks ago

> Yes, I know, but that is the key: the question is why it happens in your setup. If I type “Play Dire Straits in the kitchen” it works fine and I see the full list of entities.

I know that having multiple media players in the same area can cause problems, but these tests were run in an area where the media player in the second example was the only entity associated with the area at all. I'm kind of at a loss as to what to do. The docs don't really specify any unique config that is necessary for areas, other than ensuring there's only one media player.

OzGav commented 2 weeks ago

I don’t think this is an MA issue. For some reason the entities in your area aren’t being returned. Maybe @jozefKruszynski knows why?

gubsy420 commented 2 weeks ago

I just uninstalled the HACS integration, the add-on, and all of the associated data. I reinstalled everything and downgraded the HACS integration to 2024.10.0, and it is still exhibiting the same behavior. I am kind of at a loss as to what to do at this point. I am confident it has nothing to do with my specific voice pipeline configuration, because the intent handling in the debug menu happens completely independently of all of that, and I get identical output in that menu regardless of whether I have an MA-specific conversation agent specified. As to whether this is something that's gone wrong in HA or MA, I cannot say; however, there is a thread in the Discord from someone having the same issue.

gubsy420 commented 2 weeks ago

> I don’t think this is an MA issue. For some reason the entities in your area aren’t being returned. Maybe @jozefKruszynski knows why?

This is a shot in the dark because I'm grasping at straws at this point, but could it be related to having my areas categorized by floor? The note in the voice assist docs that says:

"You cannot ask to play to an area if there is more than one media player in that area."

got me thinking that somewhere in the backend it might be grouping together all of the entities from all of the areas assigned to the same floor in Home Assistant, thus making it think that any given area has multiple media players in it even if it doesn't.

OzGav commented 2 weeks ago

I am not sure. I don't group by floor. Let's wait for Jozef, as he wrote this code and will have a better idea.

jozefKruszynski commented 2 weeks ago

I plan to do another large refactor on the intent code this afternoon, as so much has changed in intent handling since the last time I or anyone else did anything there. Let's see what happens after that.

gubsy420 commented 2 weeks ago

@jozefKruszynski Posting the conversation agent logs below just in case they might help

2024-11-08 10:50:16.657 DEBUG (MainThread) [homeassistant.components.conversation.default_agent] Created slot lists in 0.02 seconds
2024-11-08 10:50:17.877 DEBUG (MainThread) [homeassistant.components.conversation.default_agent] Recognize done in 1.22 seconds
2024-11-08 10:51:31.099 DEBUG (MainThread) [homeassistant.components.conversation.agent_manager] Processing in en: Play The Beatles in the office
2024-11-08 10:51:32.188 DEBUG (MainThread) [homeassistant.components.conversation.default_agent] Recognize done in 1.09 seconds
2024-11-08 10:51:32.188 DEBUG (MainThread) [homeassistant.components.conversation.default_agent] Recognized intent 'MassPlayMediaOnMediaPlayer' for template '((play|listen to) {query};(<on> <name>|<local_in> <area>|<on> [the ]{area} <player_devices>|<on> [the ]<player_devices> <local_in> <area>))' but had unmatched: [UnmatchedTextEntity(name='domain', text='<missing>', is_open=False)]
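
The final log entry is the informative one: the intent was recognized, but a required `domain` value was reported as missing. Below is a small sketch for pulling such entries out of a debug log, assuming the `UnmatchedTextEntity(name='...', text='...', ...)` format shown in these logs. The helper name `unmatched_entities` is hypothetical, and the template in the sample line is shortened for readability.

```python
import re

# Captures the name and text fields of each UnmatchedTextEntity entry.
UNMATCHED = re.compile(r"UnmatchedTextEntity\(name='([^']*)', text='([^']*)'")


def unmatched_entities(log_line):
    """Return (name, text) pairs for each UnmatchedTextEntity in a log line."""
    return UNMATCHED.findall(log_line)


# Sample line modeled on the log above (template shortened to '...').
line = (
    "Recognized intent 'MassPlayMediaOnMediaPlayer' for template '...' "
    "but had unmatched: [UnmatchedTextEntity(name='domain', "
    "text='<missing>', is_open=False)]"
)
print(unmatched_entities(line))  # [('domain', '<missing>')]
```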

jozefKruszynski commented 2 weeks ago

I know exactly what this is now, and it is fixed in the new PR. Because our sentence has a mixed area/device context and also uses the requires_context domain "media_player", it simply cannot work for areas. The new version has area and device in separate sentences; it also introduces a bunch of new sentences in a separate file for purely Assist-based usage, with no need for an LLM or special agent.
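
A toy model of why the combined sentence fails for areas. It assumes, as a simplification and not as a copy of Home Assistant's or hassil's actual code, that each matched slot value carries a context dictionary and that `requires_context` must be satisfied by the merged context of the matched slots. Entity names bring a `domain` with them, while area names do not, which is consistent with the `UnmatchedTextEntity(name='domain', text='<missing>')` entry in the conversation agent log. The names `SlotValue` and `unmet_context` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SlotValue:
    """A matched slot value plus the context it contributes (hypothetical)."""
    slot: str
    text: str
    context: dict = field(default_factory=dict)


def unmet_context(matched, requires_context):
    """Return the required context keys that the matched slots do not satisfy."""
    merged = {}
    for value in matched:
        merged.update(value.context)
    return [
        key for key, wanted in requires_context.items()
        if merged.get(key) != wanted
    ]


requires = {"domain": "media_player"}

# "<on> <name>": the entity name brings its domain along.
by_name = [SlotValue("name", "house", {"domain": "media_player"})]
# "<local_in> <area>": an area has no domain, so the requirement cannot be met.
by_area = [SlotValue("area", "test")]

print(unmet_context(by_name, requires))  # []
print(unmet_context(by_area, requires))  # ['domain']
```

Under this model, moving the area sentence into its own block without the domain requirement is what lets the area match succeed.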

gubsy420 commented 2 weeks ago

> I know exactly what this is now, and it is fixed in the new PR. Because our sentence has a mixed area/device context and also uses the requires_context domain "media_player", it simply cannot work for areas. The new version has area and device in separate sentences; it also introduces a bunch of new sentences in a separate file for purely Assist-based usage, with no need for an LLM or special agent.

That's amazing!! I'll give it a test here in a sec. Appreciate it!

gubsy420 commented 2 weeks ago

@jozefKruszynski I pulled your most recent changes and tested them on my instance of HA. They do indeed resolve the issue with the intent not matching entities when the area is specified, but it seems that intent.py may not be handling the response as it should. I am going to attach three files: my HA voice debug logs, the YAML response from the assist dev tool, and the YAML response from the Home Assistant built-in voice assist debug workflow. One thing that stuck out to me was that it matched every entity in the office, not just media players, but maybe that is intended.

Attached: conversation.json, assist_dev_tool_response.txt, home-assistant.log

jozefKruszynski commented 2 weeks ago

I haven't checked the files as I am not at home right now; however, it should only be looking for media_player domain devices. Having said that, when matching areas it could well be looking at all devices. Thanks for checking and providing so much detail, it is highly appreciated.

jozefKruszynski commented 2 weeks ago

@gubsy420 Thanks for your report. I just checked the function signature of async_match_states in the HA core intent helpers, and we indeed do not pass a domain there, even though we can and absolutely should. I'll fix it as soon as I'm home.
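
A minimal sketch of the effect of the missing domain filter, using plain dictionaries rather than Home Assistant's state objects. `match_states` below is a hypothetical stand-in and not the real `async_match_states`; the entity IDs are made up for illustration.

```python
def match_states(states, area, domains=None):
    """Return entity IDs in `area`, optionally restricted to `domains`."""
    matched = []
    for entity_id, entity_area in states.items():
        if entity_area != area:
            continue
        # The domain is the part of the entity ID before the first dot.
        domain = entity_id.split(".", 1)[0]
        if domains is not None and domain not in domains:
            continue
        matched.append(entity_id)
    return matched


# Hypothetical entities and the area each one is assigned to.
states = {
    "media_player.office": "office",
    "light.office_lamp": "office",
    "switch.office_fan": "office",
    "media_player.house": "test",
}

# Without a domain filter, every entity in the area is matched.
print(match_states(states, "office"))
# With the filter, only media players remain.
print(match_states(states, "office", domains={"media_player"}))
```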

OzGav commented 2 weeks ago

Please check with integration 2024.11.2

gubsy420 commented 2 weeks ago

@OzGav @jozefKruszynski Alright, so I tested the new integration version with these updates and I can confirm that the issue has been fixed, but I think this should come with an update to the docs, as it was not completely straightforward:

I (wrongly) assumed that I should include both custom sentences in my custom_sentences/en directory, but when I did that, I would get Python errors in the logs as well as bizarre or empty responses from the HA conversation agent when I also had the MA-specific LLM selected in the integration. I took a look at both intents and saw they each applied to both area and name, with the difference being whether it accepted a query or not. That clued me in that I should only use one or the other, depending on whether I'm:

  1. Using just the HA conversation agent
  2. Using the HA conversation agent + MA specific LLM

Here, music_assistant_PlayMediaAssist.yaml is to be used for scenario 1 and play_media_on_media_player.yaml for scenario 2, since the LLM formats the request into a JSON query. After removing music_assistant_PlayMediaAssist.yaml from my custom_sentences folder, I was able to get music playing by specifying either the area or the media player name. Works great so far! I have tested both scenario 1 and scenario 2 and was able to get both to respond to both prompt types.

Side note related to the LLM/Voice Assist docs (and something that was part of why this didn't work immediately):

I have had little to no success with my MA-specific LLM set to "Assist". What I had found out prior to this particular issue is that it should be set to "No Control"; otherwise, in most cases it will attempt to play media by bypassing the custom intents, fail at doing so, and then respond that it couldn't play music instead of returning a JSON-formatted query. I had switched it to "Assist" as a troubleshooting step when the original issue happened, but forgot to switch it back. This nuance isn't captured in the docs, and I think it would be helpful for folks because it stumped me for quite a while. Also, for the sake of clarity, I am using a locally hosted instance of Ollama and not the OpenAI integration, so I cannot say with 100% certainty that everything will be the same with OpenAI, but I would assume so.

OzGav commented 2 weeks ago

Thanks. Yes the docs need to be updated, I am working on that now. I don’t use the full LLM option but I believe it should work out of the (black) box as it should see and understand the available intents and use them?

OzGav commented 2 weeks ago

Also, you can use both custom sentences; I am doing that now. The intents are slightly different.

gubsy420 commented 2 weeks ago

Well, I am unable to, for whatever reason. I just tried again with both in the folder and it didn't work; it could be a result of me using an LLM.

gubsy420 commented 2 weeks ago

Well, so now I am able to have both custom sentences loaded while having the MA-specific LLM assigned, so I guess the earlier Python errors were due to me not noticing that the LLM agent was set to "Assist". That being said, I have found that the new intent, music_assistant_PlayMediaAssist.yaml, requires a different prompt syntax. I have been having to specify "Play artist The Beatles in the Office", as opposed to just "Play the Beatles in the Office".

OzGav commented 2 weeks ago

Correct. This differentiates the two pathways for resolving the request.

gubsy420 commented 2 weeks ago

I'll keep both intents included for the time being and will report back if anything goes wrong, but if nothing does, I would call this one pretty much solved. Thanks, y'all!

OzGav commented 2 weeks ago

Thanks. Please close if you are happy.