muellerberndt / mini-agi

MiniAGI is a simple general-purpose autonomous agent based on the OpenAI API.
MIT License
2.81k stars, 294 forks

Thoughts on memory #40

Closed: estiens closed this issue 1 year ago

estiens commented 1 year ago

I originally tried rewriting this in Ruby, just because I'm a Rubyist at heart, but of course it became something a little different (over here, BTW: https://github.com/estiens/ruby-coordinator-gpt; it currently doesn't use a coordinator and runs as a single process, as this one does).

I have been exploring two things. One is that embeddings aren't very accurate or useful until you have a ton of data (i.e., I could see them being useful in a data ingestion scenario, but I'm not sure they are necessarily needed for memories).

The other, which I have had some luck with so far: if we are going to send things off to be summarized anyway, tell the summarizer to offer advice if it sees the agent stuck in a loop or going down the wrong track, and to let the agent know whether it is succeeding or should choose another direction. I haven't seen any benefit, and have seen some downside, from using embeddings to retrieve the context rather than just continually providing the context, plus an overall evaluation of recent actions, always kept in the messages.

One of the interesting things about these projects for me so far is that all of them seem to show both a learned helplessness that the model doesn't have when queried directly (e.g., googling how to do basic programming tasks when it could just ask itself, because of the role it is playing) and a complete lack of awareness of its capabilities, thinking it can just hallucinate commands and that they worked even if nothing happened.

Course correcting with an evaluator seems to help, at least so far in limited testing.

I guess I would say that, in my experience so far, course correction and robust context for the current task seem more important than the best memory querying, though I suppose that would change if you were targeting a specific purpose (e.g., ingesting a bunch of source code for code review).

estiens commented 1 year ago

(This is not a bug, just a thought, FWIW.) Thanks for the inspiration to play with creating a toy model!

muellerberndt commented 1 year ago

One of the interesting things about these projects for me so far is that all of them seem to show both a learned helplessness that the model doesn't have when queried directly (e.g., googling how to do basic programming tasks when it could just ask itself, because of the role it is playing) and a complete lack of awareness of its capabilities, thinking it can just hallucinate commands and that they worked even if nothing happened.

Course correcting with an evaluator seems to help, at least so far in limited testing.

Yep, I noticed all those things too. Loving the idea with the evaluator.

We have a large memory refactor coming in with #37, which will do away with the vector database, although I believe the implementation still uses embeddings. I'd say let's wait for this PR to be merged and then have a shot at adding an evaluator, to see if it improves performance.

Btw @estiens, would you like to join our devs Telegram group? If yes, hit me up via DM on Twitter for the link.

muellerberndt commented 1 year ago

@estiens what prompt would you use for the evaluator?

muellerberndt commented 1 year ago

I added a critic and am getting pretty good results, e.g.:

 % python microgpt.py "Come up with an original idea for an online job that you can do autonomously. It is important that you can do this job with minimal support by the user. You must handle sourcing of clients, communication with clients, and payment by yourself. Proceed to find clients and perform the job. If you no need any support, such as creating an account, ask the user to help. All payments should go to the Paypal account bernhard.mueller11@gmail.com."
Working directory is /Users/bernhardmueller/microgpt
MicroGPT: Find information on how to create a service that can autonomously provide digital marketing services to clients.
Cmd: web_search, Arg: "autonomous digital marketing service"
Critic: Agent should not search the web, as it already has the necessary knowledge.
MicroGPT: Create an autonomous digital marketing service by developing a custom algorithm to handle client sourcing, communication, and payment processing.
Cmd: execute_python, Arg: "import random\nimport string\n\ndef generate_unique_id():\n    retur..."

Gonna merge this into main as an optional feature.
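The exchange in the log above (the agent proposes a command, the critic objects, the agent proposes again) can be sketched roughly as follows. This is a minimal illustration and not the code that was merged: `propose_with_critic`, `fake_propose`, and `fake_criticize` are hypothetical names, and the two stubs stand in for real model calls.

```python
from typing import Callable, Optional, Tuple

# (thought, command, argument), mirroring the "MicroGPT: / Cmd: / Arg:" log lines.
Proposal = Tuple[str, str, str]


def propose_with_critic(
    propose: Callable[[Optional[str]], Proposal],
    criticize: Callable[[Proposal], Optional[str]],
    max_revisions: int = 1,
) -> Proposal:
    """Get a proposal, show it to the critic, and allow a limited number of revisions."""
    proposal = propose(None)
    for _ in range(max_revisions):
        criticism = criticize(proposal)
        if not criticism:
            break  # The critic has no objection, so keep this proposal.
        proposal = propose(criticism)  # Ask again, passing the criticism along.
    return proposal


# Hypothetical stubs standing in for the agent and critic model calls.
def fake_propose(criticism: Optional[str]) -> Proposal:
    if criticism is None:
        return ("Find information.", "web_search", "autonomous digital marketing service")
    return ("Write the code directly.", "execute_python", "print('hello')")


def fake_criticize(proposal: Proposal) -> Optional[str]:
    if proposal[1] == "web_search":
        return "Agent should not search the web, as it already has the necessary knowledge."
    return None


final = propose_with_critic(fake_propose, fake_criticize)
print(final[1])
```

Capping the number of revisions keeps the critic from stalling the agent indefinitely; whether one revision is the right limit is an open tuning question.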

estiens commented 1 year ago

@muellerberndt I used a combined memory summarizer/evaluator. I gave it the agent's last action plus summaries of the last 5, had it summarize them, and then told it to offer advice as necessary, especially if it sees the agent looping, trying the same command repeatedly, etc. It will also remind the agent that it can look up its commands if it sees it try to access one that isn't there. I'm still playing with the proper balance of context, verbosity, etc.
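As a rough sketch of how such a combined summarizer/evaluator prompt might be assembled (the wording of the instructions and the names `ADVICE_INSTRUCTIONS` and `build_evaluator_prompt` are assumptions made for illustration, not the prompt actually used in either project):

```python
from typing import Sequence

# Hypothetical instruction text; the real prompt is not given in this thread.
ADVICE_INSTRUCTIONS = (
    "Summarize the steps below. Then offer advice if needed, especially if "
    "the agent appears to be looping, repeating the same command, or calling "
    "a command that does not exist (remind it that it can look up its commands)."
)


def build_evaluator_prompt(
    objective: str,
    last_action: str,
    summaries: Sequence[str],
    max_summaries: int = 5,
) -> str:
    """Combine the objective, the most recent summaries, and the last action."""
    # Keep only the newest `max_summaries` entries (none if the limit is 0).
    recent = list(summaries)[-max_summaries:] if max_summaries > 0 else []
    lines = [ADVICE_INSTRUCTIONS, "", f"Objective: {objective}", "", "Earlier step summaries:"]
    lines += [f"{i}. {s}" for i, s in enumerate(recent, start=1)]
    lines += ["", f"Last action: {last_action}"]
    return "\n".join(lines)


prompt = build_evaluator_prompt(
    objective="Write a report",
    last_action="web_search foo",
    summaries=[f"step {n}" for n in range(1, 8)],
)
print(prompt)
```

The resulting string would be sent to the model as the evaluator's input, and its reply kept in the agent's messages alongside the current task.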

muellerberndt commented 1 year ago

That sounds like a smart approach. I just added this as an optional evaluator, which does a so-so job: it's not very effective at detecting loops. It would be interesting to test your implementation as well.
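One possible complement to a model-based evaluator, offered here only as an assumption and not as something either project implements, is a cheap deterministic check that flags a command/argument pair repeated several times within a recent window. The name `detect_loop` and its default thresholds are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

Action = Tuple[str, str]  # (command, argument)


def detect_loop(
    history: Sequence[Action],
    window: int = 6,
    threshold: int = 3,
) -> Optional[Action]:
    """Return the action repeated at least `threshold` times in the last `window` steps."""
    recent = list(history)[-window:] if window > 0 else []
    if not recent:
        return None
    action, count = Counter(recent).most_common(1)[0]
    return action if count >= threshold else None


history = [
    ("web_search", "python sort list"),
    ("execute_python", "print(1)"),
    ("web_search", "python sort list"),
    ("web_search", "python sort list"),
]
print(detect_loop(history))
```

A hit from a check like this could be passed to the evaluator as a hint, so the model only has to decide what to advise and not whether a repeat occurred. It only catches exact repeats; near-duplicate arguments would slip through.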

muellerberndt commented 1 year ago

I haven't seen any benefit and some downside of using embeddings to retrieve the context rather than trying to just continually provide the context and an overall evaluation of recent actions always kept in the messages.

I ended up with the same conclusions. No longer using embeddings, but a ring buffer and a summary, and added an optional evaluator/critic.
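A ring buffer plus a running summary can be sketched in a few lines. This is an illustrative guess at the general shape, not the project's actual memory code: `RingBufferMemory` and `join_summary` are hypothetical names, and `join_summary` is a trivial stand-in for a model-generated summary.

```python
from collections import deque
from typing import Callable, Deque


class RingBufferMemory:
    """Keep the newest entries verbatim and fold evicted ones into a summary."""

    def __init__(self, summarize: Callable[[str, str], str], capacity: int = 4):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.recent: Deque[str] = deque(maxlen=capacity)
        self.summary = ""
        self._summarize = summarize

    def add(self, entry: str) -> None:
        # When the buffer is full, the oldest entry is about to be evicted,
        # so merge it into the summary before appending the new one.
        if len(self.recent) == self.recent.maxlen:
            self.summary = self._summarize(self.summary, self.recent[0])
        self.recent.append(entry)

    def context(self) -> str:
        """Return the summary (if any) followed by the recent entries."""
        parts = []
        if self.summary:
            parts.append("Summary of earlier steps: " + self.summary)
        parts.extend(self.recent)
        return "\n".join(parts)


def join_summary(summary: str, evicted: str) -> str:
    # Stand-in for a model call that would rewrite the summary.
    return (summary + " | " + evicted).strip(" |")


memory = RingBufferMemory(join_summary, capacity=2)
for step in ["a", "b", "c", "d"]:
    memory.add(step)
print(memory.context())
```

The context sent to the model stays bounded: a fixed number of verbatim recent steps plus one summary string, with no embedding lookup involved.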