normal-computing/fuji-web
Fuji is an AI agent that lives in your browser's sidepanel. You can now get tasks done online with a single command!
Apache License 2.0
Evaluate WebWand on the WebArena dataset #154
Open
lynchee-owo opened 2 months ago
lynchee-owo commented 2 months ago
Use WebArena benchmark.
- Set up the standalone WebArena environment.
- Configure the URLs for each website.
- Generate a config file for each test example and obtain the auto-login cookies for all websites.
- Write a script that uses WebArena's environment, based on its `run.py`.
- Save the task execution results and evaluate them.
- Analyze the evaluation results.
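
A rough sketch of the URL configuration and results analysis steps, in case it helps scope the work. It assumes the site names from the WebArena README (`SHOPPING`, `SHOPPING_ADMIN`, `REDDIT`, `GITLAB`, `MAP`, `WIKIPEDIA`, `HOMEPAGE`) are exported as environment variables and that raw task configs mark each site with a `__SITE__` token. The helper names (`site_urls`, `fill_placeholders`, `success_rate_by_site`) and the result record shape (`sites`, `score`) are hypothetical, not part of WebArena or WebWand.

```python
import os
from collections import defaultdict

# Site names as listed in the WebArena README (assumed unchanged).
SITES = ["SHOPPING", "SHOPPING_ADMIN", "REDDIT", "GITLAB", "MAP", "WIKIPEDIA", "HOMEPAGE"]


def site_urls(env=None):
    """Read one base URL per site from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in SITES if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing site URLs: {', '.join(missing)}")
    return {name: env[name] for name in SITES}


def fill_placeholders(raw_config, urls):
    """Replace __SITE__ tokens in a raw task config string with real URLs."""
    for name, url in urls.items():
        raw_config = raw_config.replace(f"__{name}__", url)
    return raw_config


def success_rate_by_site(results):
    """Aggregate hypothetical result records into a pass rate per site."""
    counts = defaultdict(lambda: [0, 0])  # site -> [passed, total]
    for record in results:
        for site in record["sites"]:
            counts[site][0] += int(record["score"] == 1.0)
            counts[site][1] += 1
    return {site: passed / total for site, (passed, total) in counts.items()}
```

`site_urls()` fails early if any site variable is unset, which is easier to debug than a task that silently navigates to a literal `__GITLAB__` URL. The per-site breakdown is only one possible cut of the results; grouping by task template or failure type may turn out to be more useful once real runs are available.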