Nixellion opened this issue 5 days ago (status: Open)
I've seen similar behavior with multi-turn tool use myself, and the logic you've described for doing better makes sense to me too. The main time I've found it useful is when I'm testing the model with multiple commands in one utterance. That isn't common, but it's a really useful lift in capability over other assistant devices (e.g. "Turn off the following lights: kitchen lights, living room lights, ..."). However, because it breaks when a tool isn't selected, it's basically unusable in a general setup.
I've been experimenting locally with adjusting HomeLLM to my needs, and one thing I realized is that I don't think multi-turn conversation should be a checkbox. It can and should be auto-detected: if "to_say" is empty after tools were processed, it means the LLM output only tool calls and they were stripped out of the response. If needed, this check can be made stricter by comparing the response before and after the tool calls are stripped.
If it's empty, then in all cases (in my opinion, anyway) the code should request confirmation speech from the LLM.
Because, unless it's used with a model specifically trained for HomeLLM, there's always a chance that, regardless of the instructions, the model outputs either some speech alongside the function calls or just the function calls with no speech text at all.
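To illustrate, here is a minimal sketch of the auto-detection logic described above. The `<tool_call>` tag format, and the names `strip_tool_calls`, `handle_response`, and `request_confirmation` are all placeholders I'm assuming for illustration, not HomeLLM's actual API:

```python
import re

# Assumed tool-call markup; HomeLLM's real format may differ.
TOOL_CALL_RE = re.compile(r"<tool_call>.*?</tool_call>", re.DOTALL)

def strip_tool_calls(raw: str) -> str:
    """Remove tool-call blocks, leaving only the speech text ("to_say")."""
    return TOOL_CALL_RE.sub("", raw).strip()

def handle_response(raw: str, request_confirmation) -> str:
    """Return the speech to send to the user.

    If the model emitted only tool calls (nothing remains after
    stripping), ask it for a confirmation utterance instead of
    relying on a manual Multi Turn checkbox. The stricter variant
    compares the response before and after stripping to confirm
    tool calls were actually present.
    """
    to_say = strip_tool_calls(raw)
    had_tool_calls = to_say != raw.strip()
    if not to_say and had_tool_calls:
        # Model output only tool calls: auto-trigger a second turn
        # to generate confirmation speech.
        return request_confirmation()
    return to_say
```

So a response like `<tool_call>{"name": "light.turn_off", ...}</tool_call>` with no surrounding text would automatically trigger the confirmation request, while a response that already contains speech is passed through unchanged.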
I ran experiments with and without the Multi Turn checkbox, and here is what I got:
All of these cases are solved (in my testing, at least) by removing the Multi Turn checkbox and replacing it with the logic described above.
I'm not opening a PR because my code is not in a state to be pushed and includes some other changes I made just for myself, but if needed I can recreate these changes from scratch and submit a PR.