music-assistant / hass-music-assistant

Turn your Home Assistant instance into a jukebox, hassle free streaming of your favorite media to Home Assistant media players.
Apache License 2.0

Make MassPlayMediaOnMediaPlayer work directly from the new LLMs in 2024.6 without the need of a separate conversation agent #2434

Closed. tronikos closed this 2 weeks ago

tronikos commented 3 weeks ago

The Google Generative AI and OpenAI conversation agents in 2024.6 can call registered intents. The LLMs can already infer artist/track/album from the command, so we can avoid setting up a separate conversation agent just to pass the query. This also saves an LLM request. I tried it with the examples in https://github.com/music-assistant/hass-music-assistant/blob/main/prompt/prompt.txt and the LLM correctly passed the artist/track/album, even for the "play the artist that composed the soundtrack of Inception" example. For "play a list of 5 classic 80's rock tracks" it called MassPlayMediaOnMediaPlayer with query="list of 5 classic 80's rock tracks".

While I was here I refactored the code a bit to make it more readable and improved error handling.

OzGav commented 3 weeks ago

With this change, what information is sent to the LLM for a request? Is it still just the prompt and the query?

jozefKruszynski commented 3 weeks ago

Yesterday I was trying to remove the need for a separate agent, and decided that it really needs the prompt to be able to return things as we expect them. However, I like the fact that you're supporting both use cases here.

I appreciate the clean up too, sometimes you simply can't see the wood for the trees, but it was definitely feeling messy overall.

I'll try to test the changes later, but I like the direction. I'm also considering whether it makes sense to call the exposed service rather than the API directly, but I haven't made a decision either way.

jozefKruszynski commented 3 weeks ago

Perhaps I'm too stupid to test this, but I can't get this to work reliably at all.

@tronikos Exactly how have you set things up for your testing?

tronikos commented 3 weeks ago

@OzGav With this change you only need to set up the separate agent for advanced commands, e.g. "play a list of 5 classic 80's rock tracks". In that case the same prompt and query are sent to the LLM of that agent. For most commands, though, the LLM of the regular agent (default prompt and settings) handles the request directly. For example, "play the artist that composed the soundtrack of Inception on family room display" results in: Tool call: MassPlayMediaOnMediaPlayer({'artist': 'Hans Zimmer', 'name': 'Family room display'})
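To make the two paths concrete, here is a minimal sketch of how a handler could treat the arguments of such a tool call. It is not the code in intent.py: the `PlayRequest` class, the `build_play_request` function, and the `needs_query_agent` flag are hypothetical names for illustration, and the slot names (`artist`, `album`, `track`, `query`, `name`) are only assumed from the examples in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlayRequest:
    """Hypothetical container for what a handler would act on."""

    player_name: Optional[str]  # value of the assumed 'name' slot, if any
    search: str                 # text to search the music library for
    needs_query_agent: bool     # True when only a free-form query was given


def build_play_request(slots: dict) -> PlayRequest:
    """Turn tool-call arguments into a PlayRequest.

    Assumption: structured slots (artist/album/track) can be used directly,
    while a bare 'query' is the case the thread says still needs the
    separate agent with its own prompt.
    """
    parts = [
        value.strip()
        for key in ("artist", "album", "track")
        if (value := slots.get(key)) and value.strip()
    ]
    if parts:
        # Structured path: no second LLM request is needed.
        return PlayRequest(slots.get("name"), " - ".join(parts), False)

    query = (slots.get("query") or "").strip()
    if not query:
        raise ValueError("no artist, album, track, or query supplied")
    # Free-form path: flag it so the caller can forward it to the query agent.
    return PlayRequest(slots.get("name"), query, True)
```

Calling `build_play_request({'artist': 'Hans Zimmer', 'name': 'Family room display'})` takes the structured path, whereas `build_play_request({'query': "list of 5 classic 80's rock tracks"})` is flagged for the separate agent.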

@jozefKruszynski for testing follow these steps:

  1. overwrite intent.py of your installation with the file here
  2. restart HA
  3. set up the Google Generative AI conversation agent with the default settings
  4. set up a voice assistant to use the above agent
  5. expose media players, either manually or by selecting the checkbox in the configure page of the mass integration
  6. open assist, select the voice assistant from step 4, and send a command e.g. "play the artist that composed the soundtrack of Inception on family room display"

jozefKruszynski commented 2 weeks ago

Got it working. I had to remove and re-add the integration and disable my OpenAI integration that has the prompt. I had already disabled the OpenAI integration earlier, but for some reason the prompt was clearly still being used somehow.

OzGav commented 2 weeks ago

But for most commands the LLM of the regular agent

I just want to confirm that if we make this change, it is still possible to send the minimal prompt+query to the LLM, as I am not interested in sending all my house details to it (nor paying for that). Based on what Jozef has said I think this is optional, but I want to confirm.

tronikos commented 2 weeks ago

correct

jozefKruszynski commented 2 weeks ago

I'll request some small changes later, but for the most part this looks good to me I think.

@tronikos are you on the Discord server?

tronikos commented 2 weeks ago

Yes I'm on the discord server. My username is tronikos.

jozefKruszynski commented 2 weeks ago

I'll test again when I get home this evening

tronikos commented 2 weeks ago

I fixed the linter but it's in an unrelated file

marcelveldt commented 2 weeks ago

I fixed the linter but it's in an unrelated file

Ah, probably from an earlier merge or a bump of ruff or whatever. Thanks!