Skyvern-AI / skyvern

Automate browser-based workflows with LLMs and Computer Vision
https://www.skyvern.com
GNU Affero General Public License v3.0

Geico Test not passing step 1 #128

Closed jamador47 closed 3 months ago

jamador47 commented 3 months ago

Hello team,

I have followed the installation instructions and have both ./run_skyvern.sh and ./run_ui.sh running.

However, when trying to run ANY of the tests, it never gets past step 1:

image

As you can see, it only creates the first step and then nothing happens. Can you please help me debug this issue?

jamador47 commented 3 months ago

Confirming this is occurring with ALL test cases, not only Geico, unfortunately :(

Also, there is no useful information in the terminal output of either the UI or the backend launch:

$ ./run_ui.sh

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8503
  Network URL: http://10.150.93.142:8503

gio: http://localhost:8503: Operation not supported
2024-03-27 23:21:07 [info     ] Registering LLM config         llm_key=OPENAI_GPT4_TURBO
2024-03-27 23:21:07 [info     ] Registering LLM config         llm_key=OPENAI_GPT4V
2024-03-27T23:21:08.094782Z [info     ] Initializing ForgeAgent        browser_action_timeout_ms=5000 browser_type=chromium-headful debug_mode=False env=local execute_all_steps=True long_running_task_warning_ratio=0.95 max_scraping_retries=0 max_steps_per_run=50 video_path=./videos
2024-03-27T23:21:08.132909Z [info     ] Starting the skyvern scheduler.
2024-03-27T23:21:31.045859Z [info     ] Created new task               data_goal=Extract all quote information in JSON format including the premium amount, the timeframe for the quote. nav_goal=Navigate through the website until you generate an auto insurance quote. Do not generate a home insurance quote. If this page contains an auto insurance quote, consider the goal achieved proxy_location=NONE task_id=tsk_240082191796772254 title=None url=https://www.geico.com
2024-03-27T23:21:31.045980Z [info     ] Executing task using background task executor task_id=tsk_240082191796772254
2024-03-27T23:21:31.066310Z [info     ] Creating browser state for task task_id=tsk_240082191796772254

ykeremy commented 3 months ago

🤔 Can you try running playwright install and see if it helps?

jamador47 commented 3 months ago

Hello there @ykeremy, I have tested this on a completely new server. FYI, I'm using Ubuntu as the base OS for the installation (not sure if that has anything to do with this), and I got the exact same result. After the reinstallation, I tried running as is (right after running ./setup.sh, I ran ./run_skyvern.sh and ./run_ui.sh) and unfortunately received the same results:

image

After realizing I was in exactly the same position, I proceeded to do the playwright install using the command: npx playwright install

The installation was successful; unfortunately, the issue remained.

After the installation, I re-ran the UI and backend .sh scripts and retried sending the Geico task from the UI. A task was created, but once again only with step 0:

image

Is there any other advice I could follow? I'm running out of ideas; I was really expecting the fresh server install to fix this.

jamador47 commented 3 months ago

UPDATE:

Rewatching the sample video on the main GitHub page, I noticed that a browser window opens when clicking Execute. In my case, no browser is opened. What could cause this? Are there any log files I can refer to?

jamador47 commented 3 months ago

Update:

GOT IT RUNNING! Important lesson here: you definitely need an OS with a GUI; without one, the Chromium task does not seem to start. I was running the other tests on servers with no GUI, which explains the results I posted previously. After my last comment about the browser window opening, this made total sense: the task needs to run inside a browser, and if the OS cannot open a browser because it has no GUI, it cannot run the tests.

So I tested by installing Ubuntu 22.04 with a GUI, and... IT WORKED! :)

ykeremy commented 3 months ago

OH PERFECT! I was just looking into this. If you want to run it without a GUI, you can run a headless browser or simulate a display adapter.
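As a rough sketch of those two options, the Python snippet below checks whether the session appears to have a display and, if not, either selects a headless browser type or wraps the launch script in `xvfb-run` (a virtual X server wrapper shipped in the Ubuntu `xvfb` package). The helper names `has_display` and `plan_launch` are hypothetical and not part of Skyvern. The `chromium-headless` value and `./run_skyvern.sh` come from this thread, and the sketch assumes `BROWSER_TYPE` is also picked up from the process environment.

```python
from typing import Dict, List, Mapping, Tuple


def has_display(env: Mapping[str, str]) -> bool:
    """Return True if an X11 or Wayland display appears to be available."""
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


def plan_launch(
    env: Mapping[str, str], use_virtual_display: bool = False
) -> Tuple[List[str], Dict[str, str]]:
    """Decide how to start the backend; this only plans, it runs nothing.

    Hypothetical helper returning (command, extra environment variables).
    """
    command = ["./run_skyvern.sh"]
    extra_env: Dict[str, str] = {}
    if has_display(env):
        # A GUI session exists, so the default headful browser can open.
        return command, extra_env
    if use_virtual_display:
        # Assumes xvfb-run is installed (for example via `apt install xvfb`).
        # The -a flag asks it to pick a free display number automatically.
        return ["xvfb-run", "-a", *command], extra_env
    # No display and no virtual display requested: fall back to headless.
    extra_env["BROWSER_TYPE"] = "chromium-headless"
    return command, extra_env
```

Calling `plan_launch(os.environ)` on a server without a GUI would suggest running the script with `BROWSER_TYPE=chromium-headless`, while passing `use_virtual_display=True` would suggest prefixing it with `xvfb-run -a` instead.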

jamador47 commented 3 months ago

@ykeremy thank you very much for your assistance! This makes total sense now. Just to confirm: the headless browser configuration is done in the config.py file by changing BROWSER_TYPE: str = "chromium-headful" to BROWSER_TYPE: str = "chromium-headless", is this correct?

ykeremy commented 3 months ago

Yes, that would work, but the best way to do it is to update/add a line in your .env file with BROWSER_TYPE=chromium-headless.
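For anyone who prefers to script the .env change, here is a minimal sketch in Python. The helpers `upsert_env_line` and `update_env_file` are hypothetical names introduced for illustration, and the sketch assumes a simple `KEY=value` .env file in the current directory. Editing the file by hand works just as well.

```python
from pathlib import Path


def upsert_env_line(text: str, key: str, value: str) -> str:
    """Return .env content with `key` set to `value`, replacing or appending."""
    new_line = f"{key}={value}"
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # leave comments and non-assignment lines untouched
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


def update_env_file(path: str = ".env") -> None:
    """Set BROWSER_TYPE=chromium-headless in the given .env file."""
    env_path = Path(path)
    current = env_path.read_text() if env_path.exists() else ""
    env_path.write_text(
        upsert_env_line(current, "BROWSER_TYPE", "chromium-headless")
    )
```

After running `update_env_file()`, restart ./run_skyvern.sh so the new value is read at startup.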

I'm closing the issue since it's resolved. Feel free to ask if you have any other questions; I'll still be following the thread.