CsabaConsulting / InspectorGadgetApp

Open Multi-Modal Personal Agent
MIT License

Introduce ReAct #5

Open MrCsabaToth opened 1 month ago

MrCsabaToth commented 1 month ago

Now that multiple function calling works (although with some quirks, such as https://github.com/google-gemini/generative-ai-dart/issues/194, or the model losing track of some functions if I stuff in more than a certain number of them), it'll be an interesting task to introduce ReAct in concert with it.

We need a prompt which keeps the ReAct loop, but encourages the native function execution.

The problem is that the way the original ReAct works overlaps with function calling capabilities. Say a question like "What's the weather today?" involves two function calls: one to determine the current location, and another to call the weather API with that location. Modern function-calling models can arrive at these two calls on their own, without ReAct. ReAct without function calling would explicitly list these steps. The catch is that even today a ReAct prompt results in explicit, textual function calling plans, which interfere with native function calling and parameter substitution.

I'd expect native function execution to happen in the action phase, possibly with multiple functions per step.
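To make the idea concrete, here is a minimal sketch of a ReAct-style loop in which the action phase carries native function calls instead of a textual plan. It is written in Python for brevity, not in the project's Dart, and everything in it is an assumption for illustration: the tool names `get_location` and `get_weather`, the prompt wording, the `ModelTurn` shape, and the `ScriptedModel` that stands in for a real function-calling LLM by replaying the turns such a model might produce for the weather question.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical tools; real ones would query a location service and a weather API.
def get_location() -> dict:
    return {"city": "Exampleville"}

def get_weather(city: str) -> dict:
    return {"city": city, "forecast": "sunny", "temp_c": 30}

TOOLS: dict[str, Callable[..., dict]] = {
    "get_location": get_location,
    "get_weather": get_weather,
}

# Illustrative prompt wording only (an assumption, not a tested prompt).
REACT_SYSTEM_PROMPT = (
    "Work in a loop of Thought, Action, Observation. "
    "In the Thought step, reason briefly in text. "
    "In the Action step, do not write a function call plan as text; "
    "call the provided functions natively, several in one turn if needed. "
    "When the observations are sufficient, give the final answer."
)

@dataclass
class FunctionCall:
    name: str
    args: dict = field(default_factory=dict)

@dataclass
class ModelTurn:
    thought: str
    calls: list[FunctionCall] = field(default_factory=list)
    answer: Optional[str] = None

class ScriptedModel:
    """Stand-in for a function-calling LLM; replays plausible turns."""

    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt  # a real model would receive this

    def next_turn(self, observations: list[dict]) -> ModelTurn:
        if not observations:
            return ModelTurn("I need the user's location first.",
                             [FunctionCall("get_location")])
        if len(observations) == 1:
            city = observations[0]["result"]["city"]
            return ModelTurn(f"Now fetch the weather for {city}.",
                             [FunctionCall("get_weather", {"city": city})])
        w = observations[-1]["result"]
        return ModelTurn(
            "I have what I need.",
            answer=f"It is {w['forecast']} in {w['city']}, {w['temp_c']} C.",
        )

def react_loop(model, max_steps: int = 5) -> tuple[str, list[dict]]:
    observations: list[dict] = []
    for _ in range(max_steps):
        turn = model.next_turn(observations)        # Thought (+ requested calls)
        if turn.answer is not None:
            return turn.answer, observations
        for call in turn.calls:                     # Action: native execution
            result = TOOLS[call.name](**call.args)
            observations.append({"name": call.name, "result": result})  # Observation
    raise RuntimeError("no final answer within max_steps")

answer, trace = react_loop(ScriptedModel(REACT_SYSTEM_PROMPT))
print(answer)
```

The point of the sketch is the split of responsibilities: the Thought stays free text, while the Action is a list of structured calls that the host executes and feeds back as observations, so parameter substitution (the city flowing from the first call into the second) is left to the model's native function calling.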

References:

Function calling without ReAct:

ReAct original (without function calling):

Forums, articles, repos; OpenAI, Pinecone:

Gemini related examples:

havkerboi123 commented 1 month ago

Can I work on this?

MrCsabaToth commented 1 month ago

Yes, it'd be great to have more people on board! It'd be good to talk; I'm on Discord and other places. I'll actively work on issues as well. With ReAct in particular, the finicky part is that pure ReAct interferes with the newer function calling capabilities. The best starting point could be the https://github.com/google-gemini/cookbook/blob/main/examples/Search_Wikipedia_using_ReAct.ipynb example; however, I have many more tools.

The main goal is to submit the project to https://ai.google.dev/competition. Right now the project is Gemini oriented (using Chirp for STT and Google TTS if someone configures it not to use the native Android STT / TTS); however, in the future it'd be good to be LLM agnostic.

The effort started along the lines of the Humane AI Pin and the Rabbit R1, but as we know those are essentially also Android apps on embedded devices. This project can be used on Android phones. I also purchased a FAW (Full Android Watch) on AliExpress; there are good enough ones for $50, and that size roughly competes with an AI pin. On top of that, this is Flutter, so it even has the potential to run on an iOS device.

So let's chat; let me know which chat platform is best for you. On Discord I'm present in both the Gemini Meetup and the Google Developer Community, and in numerous other Gen AI workspaces, like Meshy, Pika, Vectara, Flutter Dev, lablab.ai, Devpost. Also on Slack in Weaviate, ODSC Global, Flutter Community, Feats, Tecton, AICamp, ...

MrCsabaToth commented 1 month ago

I'm starting to work on the Vector DB and history side of things. There will be code churn. Test code coverage is neglected right now; I'll catch up later.

havkerboi123 commented 1 month ago

Seems fun, let's connect on Discord! @mehwz#4396

MrCsabaToth commented 1 month ago

I tried to send a friend request to mehwz#4396 but it didn't stick, then to mehwz, but that request says mhmd. My username is @mrcsabatoth, "Originally known as @MrCsabaToth#8416".