garyfeng / LaVague

Large Action Model framework to develop AI Web Agents
https://docs.lavague.ai/en/latest/
Apache License 2.0

FEA: Create a docker solution #1

Open garyfeng opened 1 month ago

garyfeng commented 1 month ago

Is your feature request related to a problem? Please describe.

As a LaVague user, I want to containerize the solution so that I can do headless testing at scale.

Specifically:

Describe the solution you'd like

Describe alternatives you've considered

An alternative to VNC would be to use X11, but that's complicated.

Additional context

LaVague used to have a Docker option, but they seem to have prioritized pip over Docker. Also see the Skyvern project, which has a similar but different goal and uses the X11 approach.

garyfeng commented 1 month ago

Follow https://docs.lavague.ai/en/latest/docs/contributing/general/#dev-environment to set up the local dev environment.

garyfeng commented 1 month ago

You can use a web-based VNC viewer that doesn't require any installation. This is achievable by using a noVNC server, which allows you to access the VNC session via your browser. The official Selenium Docker images support noVNC, which makes this setup quite straightforward.

Here's how you can use noVNC with the Selenium Docker container:

Steps to Use noVNC with Selenium Docker

  1. Run the Selenium container with noVNC:

    The Selenium Docker images come with noVNC pre-installed and enabled. You just need to expose the noVNC port (typically port 7900) when you start the container.

    Run the following command to start a Selenium container with noVNC enabled:

    docker run -d -p 4444:4444 -p 5900:5900 -p 7900:7900 --shm-size="2g" selenium/standalone-chrome

    This command exposes:

    • Port 4444 for WebDriver (to run your Selenium tests),
    • Port 5900 for traditional VNC access,
    • Port 7900 for noVNC (web-based access).
  2. Access the browser via noVNC:

    Once the container is running, you can access the Selenium browser through noVNC in your web browser by visiting:

    http://localhost:7900

    When prompted for a password, enter the default one, "secret", unless you've changed it.

  3. Running your Selenium tests:

    After setting up the container, you can run your Selenium test as usual, and watch the browser in action via the noVNC session running in your web browser.
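
As a minimal sketch of step 3, the Python below waits for the container started above to report ready and then drives its Chrome through the WebDriver port. It assumes the `selenium` Python package (version 4) is installed on the host and that port 4444 is published as in the `docker run` command. The helper names `is_grid_ready`, `wait_for_grid`, and `run_demo` are hypothetical, chosen only for this illustration, and the target URL is just the one used in the sample instructions later in this thread.

```python
import http.client
import json
import time
import urllib.request

# WebDriver endpoint published by the docker run command above (assumption: default port mapping).
GRID_URL = "http://localhost:4444"


def is_grid_ready(status: dict) -> bool:
    """Return True when a Selenium Grid 4 /status payload reports ready."""
    value = status.get("value") or {}
    return bool(value.get("ready", False))


def wait_for_grid(url: str = GRID_URL, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll <url>/status until the server is ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{url}/status", timeout=5) as resp:
                if is_grid_ready(json.load(resp)):
                    return True
        except (OSError, ValueError, http.client.HTTPException):
            pass  # container still starting, or the response is not valid JSON yet
        time.sleep(interval)
    return False


def run_demo(url: str = GRID_URL) -> str:
    """Open a page in the containerized Chrome and return its title."""
    # Imported lazily so the helpers above work without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    driver = webdriver.Remote(command_executor=url, options=options)
    try:
        driver.get("https://huggingface.co/")
        return driver.title
    finally:
        driver.quit()  # always release the browser session in the container
```

Calling `wait_for_grid()` and then `run_demo()` from a script should make the page load visible in the noVNC tab at http://localhost:7900 while the session is open.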

Conclusion

Using noVNC, you can view the browser in real-time directly from your web browser without any installation of a dedicated VNC viewer. This is the simplest way to monitor Selenium tests visually inside a Docker container.

garyfeng commented 1 month ago

Trying to recover LaVague's previous work on Docker, which seems to have been deleted since March 2024:

garyfeng commented 1 month ago

Content of the Dockerfile from the above. It covers most of what I wanted.

FROM nvidia/cuda:12.3.2-devel-ubuntu22.04

ARG USERNAME=vscode
ARG USER_UID=1000
ARG USER_GID=$USER_UID

# Create the user
RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME \
    && apt-get update \
    && apt-get install -y sudo \
    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
    && chmod 0440 /etc/sudoers.d/$USERNAME

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y ca-certificates fonts-liberation unzip \
    libappindicator3-1 libasound2 libatk-bridge2.0-0 libatk1.0-0 libc6 \
    libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgbm1 \
    libgcc1 libglib2.0-0 libgtk-3-0 libnspr4 libnss3 libpango-1.0-0 \
    libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 \
    libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 \
    libxrandr2 libxrender1 libxss1 libxtst6 lsb-release wget xdg-utils \
    software-properties-common git zsh curl

# Install chrome and chromedriver for ubuntu22
RUN wget https://storage.googleapis.com/chrome-for-testing-public/122.0.6261.94/linux64/chrome-linux64.zip \
&& wget https://storage.googleapis.com/chrome-for-testing-public/122.0.6261.94/linux64/chromedriver-linux64.zip \
&& unzip chromedriver-linux64.zip -d /home/$USERNAME \
&& unzip chrome-linux64.zip -d /home/$USERNAME \
&& rm chrome-linux64.zip chromedriver-linux64.zip

# We need python3.10 to build the project
RUN add-apt-repository ppa:deadsnakes/ppa && apt-get update && apt-get install -y python3.10

# We need git to clone the repository, ensure it's installed
RUN apt-get update && apt-get install -y git

USER $USERNAME

# Clone the lavague repository
RUN git clone https://github.com/lavague-ai/LaVague /home/$USERNAME/LaVague

# Make localhost accessible externally
RUN sed -i "s/demo.launch(server_port=server_port, share=True, debug=True)/demo.launch(server_name='0.0.0.0', server_port=server_port, share=True, debug=True)/g" /home/$USERNAME/LaVague/src/lavague/command_center.py

RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10 \
&& python3.10 -m pip install virtualenv \
&& python3.10 -m pip install /home/$USERNAME/LaVague --force-reinstall \
&& python3.10 -m pip install llama-index-llms-azure-openai

WORKDIR /home/$USERNAME

# Copy the necessary files into the Docker image
COPY config.py /home/$USERNAME/config.py
COPY instructions.txt /home/$USERNAME/instructions.txt
COPY exec.sh /home/$USERNAME/exec.sh

RUN sudo chmod 755 /home/$USERNAME/exec.sh

# pip installs console scripts such as lavague-launch into ~/.local/bin for a non-root user; add it to PATH
ENV PATH="/home/vscode/.local/bin:${PATH}"

# Modify the ENTRYPOINT to execute the lavague-launch command and keep the container running
COPY entrypoint.sh /home/$USERNAME/entrypoint.sh
RUN sudo chmod +x /home/$USERNAME/entrypoint.sh
ENTRYPOINT ["/home/vscode/entrypoint.sh"]

garyfeng commented 1 month ago

config.py:


import os

from llama_index.llms.openai import OpenAI


class LLM(OpenAI):
    """OpenAI LLM configured from the OPENAI_API_KEY environment variable."""

    def __init__(self):
        max_new_tokens = 512
        api_key = os.getenv("OPENAI_API_KEY")
        # Fail fast with a clear message instead of a later authentication error
        if api_key is None:
            raise ValueError("OPENAI_API_KEY environment variable is not set")
        super().__init__(api_key=api_key, max_tokens=max_new_tokens, temperature=0.0)

entrypoint.sh

#!/usr/bin/bash

# Execute your lavague-launch command
lavague-launch --file_path instructions.txt --config_path config.py

# Keep the container running
tail -f /dev/null

instructions.txt

https://huggingface.co/
Click on the Datasets item on the menu, between Models and Spaces
Click on the search bar 'Filter by name', type 'The Stack', and press 'Enter'
Scroll by 500 pixels
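
Judging only from the sample above, the file appears to hold a start URL on the first line followed by one natural-language instruction per line. The sketch below parses that layout so it can be checked before building the image. This reading of the format is an assumption inferred from the example, not something confirmed by LaVague's documentation, and `InstructionFile` and `parse_instructions` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstructionFile:
    url: str          # page the agent starts from (assumed: first non-empty line)
    steps: List[str]  # one natural-language instruction per remaining line


def parse_instructions(text: str) -> InstructionFile:
    """Split an instructions.txt body into a start URL and a list of steps."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("instructions file is empty")
    url, *steps = lines
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"first line should be a URL, got: {url!r}")
    return InstructionFile(url=url, steps=steps)
```

Running `parse_instructions(open("instructions.txt").read())` on the host would catch an empty file or a missing URL before it is copied into the image.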

garyfeng commented 1 month ago

There is also a folder called docker-scripts: https://github.com/lavague-ai/LaVague/tree/a47d5f1643c21b5179040dc36073c3d21870592c/examples/docker-scripts