ARCLab-MIT / kspdg

ARCLab-MIT participation in the KSPDG challenge
MIT License

Baseline: pure LLM based agent #10

Open vrodriguezf opened 11 months ago

vrodriguezf commented 11 months ago

Build an agent based purely on text, to serve as a baseline for the hybrid agents that we expect to actually work better here. I would expect anything that incorporates domain-specific knowledge to outperform this.

This can be done in many ways and with different LLMs. To start with, a GPT model (via the Python `openai` package) should be the easiest thing to connect to. Here, we would have:

I recommend watching, if you haven't already, Jeremy Howard's tutorial on how to interact with LLMs, to learn the basic usage of the OpenAI Python API (I don't remember whether it covers system prompts, though).
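To make the starting point concrete, here is a minimal sketch of a chat completion with a system prompt, assuming the v1 `openai` Python client; the model name and prompt text are illustrative placeholders, not the agent's actual prompt.

```python
# Minimal sketch: one chat completion with a system prompt.
# Assumes the v1 openai client (pip install openai) and an OPENAI_API_KEY
# in the environment. Prompt text and model name are placeholders.

def build_messages(system_prompt: str, observation: str) -> list[dict]:
    """Assemble the messages list that the chat completions API expects."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": observation},
    ]

def query_agent(observation: str) -> str:
    # Imported lazily so the sketch can be read without the package installed.
    from openai import OpenAI
    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=build_messages(
            "You control a spacecraft. Reply ONLY with a 4D action vector.",
            observation,
        ),
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(query_agent("pursuer position: ..., evader position: ..."))
```

The system prompt is where the output-format constraints discussed below would go.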

OhhTuRnz commented 11 months ago

At the start of the challenge I was playing around with a GPT interaction (I had some difficulty parsing the 4D vector; maybe I can pipeline another GPT chat for "decoding" the values).

I also checked the part of the tutorial where he uses "vv" in the system prompt to get a shortened answer, so that may work for parsing.

escharf320 commented 10 months ago

I started building an agent to do this. It's in my branch (called eli) under arclab_mit/agents/pure_text_agent. I still need to do a lot of work, but the framework is there. If you have any comments, feel free to let me know as I continue working!

vrodriguezf commented 10 months ago

> At the start of the challenge i was playing around with a GPT interaction (had some difficulty parsing the 4D vector, maybe i can pipeline another GPT chat for "decoding" the values).

Yeah, that's actually a good idea: using an LLM to parse the answer of another LLM, lol. It could be simpler, though; making the system prompt state clearly that the final answer has to be given as a 4D vector would probably do the trick (the same way the "vv" trick works in J. Howard's system prompt).
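The "constrain it in the system prompt" idea can be sketched like this: tell the model the exact output shape, parse strictly, and fall back to a do-nothing action on failure. The prompt wording and the fallback action are assumptions for illustration:

```python
# Sketch: constrain the reply format via the system prompt, then parse strictly.
# The prompt text and the null-action fallback are illustrative assumptions.
import ast

SYSTEM_PROMPT = (
    "You are a spacecraft control agent. "
    "Respond with ONLY a Python list of 4 numbers, e.g. [1.0, 0.0, 0.0, 5.0]. "
    "Do not add any explanation."
)

NULL_ACTION = [0.0, 0.0, 0.0, 0.1]  # hypothetical do-nothing fallback

def parse_strict(reply: str) -> list[float]:
    """Parse the reply as a 4-element list; fall back to NULL_ACTION otherwise."""
    try:
        action = ast.literal_eval(reply.strip())
        if isinstance(action, list) and len(action) == 4:
            return [float(x) for x in action]
    except (ValueError, SyntaxError):
        pass
    return NULL_ACTION
```

`ast.literal_eval` is safe on untrusted strings (unlike `eval`), and the fallback keeps the agent loop alive when the model ignores the format instruction.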

DumplingLife commented 10 months ago

Another approach we could take is to have the model call a function when it wants to return the action vector, using OpenAI's function-calling capabilities. I'll look into this and try to build an agent that does it.
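The function-calling approach might look like the following sketch, assuming the v1 client's `tools` interface; the function name, field names, and model are illustrative, not necessarily what the committed agent uses.

```python
# Sketch: return the action via OpenAI function calling (tools interface).
# Tool name, fields, and model are illustrative assumptions.
import json

ACTION_TOOL = {
    "type": "function",
    "function": {
        "name": "perform_action",
        "description": "Apply a throttle command to the spacecraft.",
        "parameters": {
            "type": "object",
            "properties": {
                "ft": {"type": "number", "description": "forward throttle in [-1, 1]"},
                "rt": {"type": "number", "description": "right throttle in [-1, 1]"},
                "dt": {"type": "number", "description": "down throttle in [-1, 1]"},
                "duration": {"type": "number", "description": "burn time in seconds"},
            },
            "required": ["ft", "rt", "dt", "duration"],
        },
    },
}

def action_from_tool_call(tool_call) -> list[float]:
    """Turn a tool call's JSON arguments string into the 4D action vector."""
    args = json.loads(tool_call.function.arguments)
    return [args["ft"], args["rt"], args["dt"], args["duration"]]

def get_action(observation: str) -> list[float]:
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": observation}],
        tools=[ACTION_TOOL],
        # Forcing this tool avoids the "calls the wrong function" failure mode.
        tool_choice={"type": "function", "function": {"name": "perform_action"}},
    )
    return action_from_tool_call(response.choices[0].message.tool_calls[0])
```

Forcing `tool_choice` to a single function is one way to address the wrong-function calls mentioned below, at the cost of the model never being able to answer in plain text.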

DumplingLife commented 10 months ago

I pushed an agent for this to main. It works end-to-end, so you can run it like a regular agent, but it often takes a very long time (30 seconds or more) and sometimes calls the wrong function. I'll try to mitigate both issues with better prompting.

We can also reuse the function-calling code to return the action vector in later agents, like the hybrid ones.

vrodriguezf commented 10 months ago

Very cool! Which file is it? If you push directly to the main branch without a PR, add the ID of the related issue to the commit message; that way it will be linked directly in this conversation.

DumplingLife commented 10 months ago

It's here: arclab_mit/agents/jason_function_calling_llm_agent.py. I think I'll make my own branch and do PRs from now on, so it's more organized.