AgentLab is a framework for developing and evaluating agents on a variety of benchmarks supported by BrowserGym.
The framework enables the design of rich hyperparameter spaces and the launch of parallel experiments using ablation studies or random searches. It also provides agent_xray, a visualization tool to inspect the results of the experiments using a custom Gradio interface.
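To make the two search strategies concrete, here is a minimal sketch in plain Python. It does not use AgentLab's own API: `AgentFlags`, `ablation_study`, and `random_search` are hypothetical names invented for this illustration. An ablation changes one setting at a time relative to a baseline, while a random search samples whole configurations from a space.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentFlags:
    """Hypothetical agent hyperparameters, invented for this example."""
    use_html: bool = True
    use_screenshot: bool = False
    max_steps: int = 10


def ablation_study(base, changes):
    """Return the baseline plus one variant per single-field change."""
    return [base] + [replace(base, **{name: value}) for name, value in changes]


def random_search(space, n_samples, seed=0):
    """Draw n_samples configurations uniformly at random from the space."""
    rng = random.Random(seed)
    return [
        AgentFlags(**{name: rng.choice(values) for name, values in space.items()})
        for _ in range(n_samples)
    ]


base = AgentFlags()
ablations = ablation_study(base, [("use_html", False), ("use_screenshot", True)])

space = {
    "use_html": [True, False],
    "use_screenshot": [True, False],
    "max_steps": [5, 10, 20],
}
samples = random_search(space, n_samples=4, seed=0)
```

Each resulting configuration would correspond to one agent variant to evaluate. Fixing the seed keeps a random search reproducible across runs.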
This repo is intended for testing and developing new agents, hence we clone it and install it in editable mode using the -e flag.
git clone git@github.com:ServiceNow/AgentLab.git
cd AgentLab
pip install -e .
export AGENTLAB_EXP_ROOT=<root directory of experiment results> # defaults to $HOME/agentlab_results
export OPENAI_API_KEY=<your openai api key> # if openai models are used
export HUGGINGFACEHUB_API_TOKEN=<your huggingfacehub api token> # if huggingface models are used
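Before launching anything, it can help to check what these variables resolve to. The sketch below is a hypothetical helper, not part of AgentLab: `resolve_exp_root` and `missing_keys` are names made up here. It mirrors the documented default of $HOME/agentlab_results and reports which API keys are unset.

```python
import os
from pathlib import Path


def resolve_exp_root(env=None):
    """Return AGENTLAB_EXP_ROOT if set, else the documented default."""
    env = os.environ if env is None else env
    root = env.get("AGENTLAB_EXP_ROOT")
    if not root:
        return Path.home() / "agentlab_results"
    return Path(root).expanduser()


def missing_keys(env=None, names=("OPENAI_API_KEY", "HUGGINGFACEHUB_API_TOKEN")):
    """List the credential variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


print("Results directory:", resolve_exp_root())
print("Unset credentials:", missing_keys())
```

Only the key for the model provider you actually use needs to be set, so an entry in the second list is not necessarily a problem.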
agentlab-assistant --start_url https://www.google.com
Depending on which benchmark you use, some prerequisites may apply.
Create your agent or import an existing one:
from agentlab.agents.generic_agent.agent_configs import AGENT_4o
Run the agent on a benchmark:
study_name, exp_args_list = run_agents_on_benchmark(AGENT_4o, benchmark)
study_dir = make_study_dir(RESULTS_DIR, study_name)
run_experiments(n_jobs, exp_args_list, study_dir)
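The three calls above follow a common pattern: build a list of experiment arguments, create a study directory, then run the list with a bounded number of parallel jobs. The sketch below illustrates that pattern with the standard library only. It is not how run_experiments is implemented, and `run_one`, `run_all`, and the dictionary-based experiment arguments are assumptions made for this example.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_one(exp_args, study_dir):
    """Stand-in for a single experiment: write a small summary file."""
    exp_dir = Path(study_dir) / exp_args["name"]
    exp_dir.mkdir(parents=True, exist_ok=True)
    summary = {"name": exp_args["name"], "seed": exp_args["seed"], "status": "done"}
    (exp_dir / "summary.json").write_text(json.dumps(summary))
    return exp_dir


def run_all(n_jobs, exp_args_list, study_dir):
    """Run every experiment with at most n_jobs workers, preserving input order."""
    Path(study_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(lambda args: run_one(args, study_dir), exp_args_list))


with tempfile.TemporaryDirectory() as tmp:
    exp_args_list = [{"name": f"task_{i}", "seed": i} for i in range(3)]
    exp_dirs = run_all(2, exp_args_list, Path(tmp) / "demo_study")
    statuses = [
        json.loads((d / "summary.json").read_text())["status"] for d in exp_dirs
    ]
```

Writing one directory per experiment keeps partial results readable while other jobs are still running.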
Use main.py to launch experiments with a variety of options. It works like a lazy CLI that is actually more convenient than a real one: just comment and uncomment the lines you need, or modify them at will (but don't push these changes to the repo).
While your experiments are running, you can inspect the results using:
agentlab-xray
You will be able to select the recent experiments in the directory AGENTLAB_EXP_ROOT and visualize the results in a Gradio interface.
In the following order, select:
Once this is selected, you can see the trace of your agent on the given task. Click on the profiling image to select a step and observe the action taken by the agent.
Get inspiration from the MostBasicAgent in agentlab/agents/most_basic_agent/most_basic_agent.py.
Create a new directory in agentlab/agents/ with the name of your agent.
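As a rough idea of the shape such an agent can take, here is a self-contained sketch. It deliberately does not import AgentLab or BrowserGym. The `ScriptedAgent` class, the `get_action(obs)` method returning an action string plus an info dictionary, and the action strings themselves are assumptions for illustration; check MostBasicAgent for the actual interface to implement.

```python
from dataclasses import dataclass


@dataclass
class ScriptedAgent:
    """Hypothetical agent that replays a fixed list of action strings."""
    script: list
    step: int = 0

    def get_action(self, obs):
        """Return the next scripted action, or a no-op once the script ends."""
        if self.step < len(self.script):
            action = self.script[self.step]
        else:
            action = "noop()"  # assumed no-op action string
        info = {"step": self.step, "goal": obs.get("goal", "")}
        self.step += 1
        return action, info


agent = ScriptedAgent(script=["click('12')"])
first_action, first_info = agent.get_action({"goal": "open the menu"})
second_action, second_info = agent.get_action({"goal": "open the menu"})
```

A real agent would replace the scripted lookup with a call to a model that reads the observation and decides on the next action.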
If you want to download HF models more quickly:
pip install hf-transfer
pip install torch
export HF_HUB_ENABLE_HF_TRANSFER=1
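A quick way to confirm the setup took effect is to check both conditions from Python: the variable is set to a truthy value and the hf_transfer package is importable. The helper below is a hypothetical convenience written for this example, not part of AgentLab or huggingface_hub, and the set of accepted truthy values is an assumption.

```python
import importlib.util
import os


def hf_transfer_ready(env=None):
    """Return True if the flag is enabled and hf_transfer can be imported."""
    env = os.environ if env is None else env
    flag = env.get("HF_HUB_ENABLE_HF_TRANSFER", "").strip().lower()
    flag_on = flag in {"1", "true", "yes", "on"}  # assumed truthy spellings
    installed = importlib.util.find_spec("hf_transfer") is not None
    return flag_on and installed


print("Fast HF downloads enabled:", hf_transfer_ready())
```

The variable should be exported in the shell before the Python process starts, so that libraries reading it at import time see the value.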