lavague-ai / LaVague

Large Action Model framework to develop AI Web Agents
https://docs.lavague.ai/en/latest/
Apache License 2.0
5.26k stars, 465 forks

Provide empirical tips on how to use LaVague #412

Open dhuynh95 opened 2 months ago

dhuynh95 commented 2 months ago

Someone tried to log in with LaVague using the following code:

from lavague.core import WorldModel, ActionEngine
from lavague.core.agents import WebAgent
from lavague.drivers.selenium import SeleniumDriver

url = ...

selenium_driver = SeleniumDriver(headless=False)
world_model = WorldModel()
action_engine = ActionEngine(selenium_driver)
agent = WebAgent(world_model, action_engine)
agent.get(url)
agent.run("Enter '123@gmail.com' as username ")
agent.run("Enter '123' as Password ")
agent.run("Click on Login button")

It did not work because the retriever did not fetch enough nodes, so the retrieved elements did not include everything needed to carry out the instructions.

Here is the corrected code:

from lavague.core import WorldModel, ActionEngine
from lavague.core.agents import WebAgent
from lavague.drivers.selenium import SeleniumDriver

url = ...

selenium_driver = SeleniumDriver(headless=False)
world_model = WorldModel()
action_engine = ActionEngine(selenium_driver)

# Increase the number of retrieved elements
action_engine.navigation_engine.retriever.top_k = 10

agent = WebAgent(world_model, action_engine)
agent.get(url)

objective = """
Enter '123@gmail.com' as username
Enter '123' as Password
Click on Login button"""

agent.run(objective)

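Two tips, both applied in the code above: increase the retriever's top_k so that more elements are retrieved, and pass all the steps as a single objective instead of calling agent.run once per step.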

@lyie28: Can you add this information somewhere in the docs? I think we should explain the tradeoff of a higher top_k: it can fix navigation errors (i.e., element not found), but at higher cost and latency, since we provide more context to the LLM.
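To make the tradeoff concrete, here is a toy sketch. It is not LaVague's retriever: it assumes the retriever produces a ranked list of HTML nodes and keeps only the first top_k. The page contents, the rank of the login button, and the helper names `retrieve` and `context_size` are all hypothetical, and character count is only a rough proxy for prompt cost.

```python
# Toy illustration only: NOT LaVague's retriever, just a ranked list cut at top_k.

def retrieve(ranked_nodes, top_k):
    """Return the top_k highest-ranked nodes (the list is assumed pre-sorted)."""
    return ranked_nodes[:top_k]


def context_size(nodes):
    """Rough proxy for prompt cost: total characters passed to the LLM."""
    return sum(len(node) for node in nodes)


# Hypothetical page where the login button happens to be ranked 7th.
target = '<button id="login">Login</button>'
ranked_nodes = [f'<div id="banner-{i}">promo</div>' for i in range(6)]
ranked_nodes.append(target)
ranked_nodes += [f'<a href="/footer/{i}">link</a>' for i in range(5)]

# Compare a small and a larger top_k (values chosen only for illustration).
for top_k in (5, 10):
    retrieved = retrieve(ranked_nodes, top_k)
    found = target in retrieved
    print(f"top_k={top_k} target_found={found} context_chars={context_size(retrieved)}")
```

With the smaller value the button never reaches the LLM, which is the "element not found" case. With the larger value it does, but every call carries more context.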

I guess we could add other tips.

paulpalmieri commented 1 month ago

@dhuynh95 How about a Tips page in the quickstart?

Could contain stuff like:

lyie28 commented 1 month ago

A lot of this information is already in the docs. I will add anything missing but mainly I am reorganizing the information so it will be easier to find.