lavague-ai / LaVague

Large Action Model framework to develop AI Web Agents
https://docs.lavague.ai/en/latest/
Apache License 2.0
4.96k stars 421 forks source link

Add custom file upload action #406

Open lyie28 opened 3 days ago

lyie28 commented 3 days ago

Since Selenium cannot interact with a file upload dialog, I believe we'll need to create a custom action to handle file uploads correctly.

If the site uses an element is an input element with type file, we can usually send it a file path to upload like this:

elem = driver.find_element(By.CSS_SELECTOR, "input[type='file']")
elem.send_keys(file_path)
driver.find_element(By.ID, "file-submit").click()

However this does not work on all sites, so we we need to think how we can come up with something that works across a maximum amount of sites.

lyie28 commented 3 days ago

For any potential contributors, just to add details on how to modify the driver source code to add custom actions. You can do so with the following steps:

  1. Open the base.py file for your driver, e.g lavague-integrations/drivers/lavague-drivers-selenium/lavague/drivers/selenium/base.py

  2. Define your action in the list of actions in the SELENIUM_PROMPT_TEMPLATE below the The actions available are: line.

For example, you might add a new clearValue action:

Name: clearValue
Description: Focus on and clear the text of an input element with a specific xpath
Arguments:
  - xpath (string)

3, Add an elif clause to the exec_code method to handle your new action.

elif action_name == "clearValue":
                self.clear_value(
                    item["action"]["args"]["xpath"]
                )
  1. Finally add your new method for handling this action:
def clear_value(self, xpath: str,):
        elem = self.page.locator(f"xpath={xpath}").first
        elem.clear()
  1. Test by installing your driver from your modified local package:
    pip install -e lavague-integrations/drivers/lavague-drivers-selenium
adeprez commented 2 days ago

Your solution with send_keys(file_path) seems to work. Do you have examples of websites where it fails so we can explore alternatives?

lyie28 commented 2 days ago

I think I tried 5 sites - and it worked on 3/5. It didn't work for WeTransfer and the use case I shared with you that I've been working on this week