HENNGE / arsenic

Async WebDriver implementation for asyncio and asyncio-compatible frameworks

How to setup chrome browser and chromedriver with Docker? #107

Open gavrilka opened 3 years ago

gavrilka commented 3 years ago

I'm using docker-compose to launch my app on Amazon Ubuntu 20.04. I tested it on Windows in PyCharm and everything works great, because I have Chrome installed on Windows. Now I have to build my app on an Amazon Ubuntu 20.04 server using docker-compose. I've tried lots of different ways but still can't make it work...

Here are my Docker files and part of my Python code:

My Dockerfile:

FROM python:latest

WORKDIR /src
COPY requirements.txt /src
RUN pip install -r requirements.txt
COPY . /src

My docker-compose:

version: '3.1'

services:

  tgbot:
    container_name: bot
    build:
      context: .
    command: python app.py
    restart: always
    environment:
      WEBAPP_PORT: 3001
    env_file:
      - ".env"
    ports:
      - 8443:3001
    networks:
      - botnet
    volumes:
      - ./:/src

networks:
      botnet:
        driver: bridge

requirements.txt includes arsenic~=20.9

from arsenic import get_session, keys, browsers, services


async def arsenic_scraper(url):
    # Chromedriver service: with no arguments it should find 'chromedriver' on PATH;
    # otherwise pass the path to the driver.
    service = services.Chromedriver()
    browser = browsers.Chrome()
    async with get_session(service, browser) as session:
        await session.get(url)

FileNotFoundError: [Errno 2] No such file or directory: 'chromedriver'
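
The traceback suggests that the chromedriver binary is not on PATH inside the container. Here is a minimal sketch, using only the standard library, that checks for the binary before a session is started. `locate_driver` is a hypothetical helper name, not part of arsenic.

```python
import shutil


def locate_driver(name="chromedriver"):
    """Return the absolute path of the driver binary, or raise a clear error."""
    path = shutil.which(name)  # searches the directories listed in PATH
    if path is None:
        raise FileNotFoundError(
            f"{name!r} was not found on PATH; install it inside the image "
            "or pass an explicit path to the service"
        )
    return path
```

The returned path could then be handed to the service, for example `services.Chromedriver(binary=locate_driver())`. This assumes the `binary` keyword is how arsenic accepts the "path to driver" mentioned in the snippet's comment.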

dimaqq commented 3 years ago

@gavrilka as far as I can tell you are running a Python process and thus arsenic inside docker. Thus, chromedriver too must be installed inside Docker.

python:latest most likely resolves to this: https://github.com/docker-library/python/blob/a308725bfb9d588317cc8c1786f66368323d6581/3.9/buster/Dockerfile (assuming the docker server runs Linux)

Thus, my guess would be to add RUN apt-get install chromium-driver to your Dockerfile. Please test it locally first, because it's possible that $PATH may need to be adjusted.

gavrilka commented 3 years ago

@dimaqq unfortunately, any command like:

RUN apt-get install -y google-chrome-stable
RUN apt-get install chromium-driver

returns an error:

The command '/bin/sh -c apt-get install -y google-chrome-stable' returned a non-zero code: 100

I'm trying to fix this. If I try RUN sudo .. I get code: 127

dimaqq commented 3 years ago

You may need to run apt-get update first to force the package manager to refresh its download sources...

gavrilka commented 3 years ago

@dimaqq I tried this Dockerfile:

FROM ubuntu:20.04
RUN apt-get update
RUN apt-get install -y google-chrome-stable

FROM python:latest
WORKDIR /src
COPY requirements.txt /src
RUN pip install -r requirements.txt
COPY . /src

Result:

Creating network "bots_botnet" with driver "bridge"
Building tgbot
Step 1/8 : FROM ubuntu:20.04
20.04: Pulling from library/ubuntu
da7391352a9b: Pull complete
14428a6d4bcd: Pull complete
2c2d948710f2: Pull complete
Digest: sha256:c95a8e48bf88e9849f3e0f723d9f49fa12c5a00cfc6e60d2bc99d87555295e4c
Status: Downloaded newer image for ubuntu:20.04
 ---> f643c72bc252
Step 2/8 : RUN apt-get update
 ---> Running in c8528d29128c
Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [109 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
Get:3 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [495 kB]
Get:4 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [645 kB]
Get:5 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [1167 B]
Get:6 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [103 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal-backports InRelease [101 kB]
Get:9 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB]
Get:11 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1275 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [885 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [881 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [30.4 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [136 kB]
Get:17 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [4250 B]
Fetched 16.6 MB in 3s (6026 kB/s)
Reading package lists...
Removing intermediate container c8528d29128c
 ---> c24c5d5bee59
Step 3/8 : RUN apt-get install -y google-chrome-stable
 ---> Running in 703c8add1235
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package google-chrome-stable
ERROR: Service 'tgbot' failed to build: The command '/bin/sh -c apt-get install -y google-chrome-stable' returned a non-zero code: 100
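
Two separate things appear to go wrong with this Dockerfile. The log shows that google-chrome-stable is not available from Ubuntu's default repositories. Independently of that, every FROM line starts a new build stage, so anything installed under FROM ubuntu:20.04 would not end up in the final python:latest image unless a later stage pulls it in with COPY --from. Below is a rough sketch of a check for the second problem. `unused_stages` is a hypothetical helper, and the parsing is deliberately simplistic: it ignores line continuations and other ways of referencing a stage.

```python
def unused_stages(dockerfile_text):
    """Return indexes of non-final stages that no later stage refers to."""
    names = []          # names[i] is the alias of stage i, or None if unnamed
    referenced = set()  # indexes of stages used by FROM <alias> or COPY --from
    for raw in dockerfile_text.splitlines():
        parts = raw.strip().split()
        if not parts or parts[0].startswith("#"):
            continue
        instruction = parts[0].upper()
        if instruction == "FROM":
            # Shape: FROM [--platform=...] image [AS alias]
            alias = None
            if len(parts) >= 3 and parts[-2].upper() == "AS":
                alias = parts[-1].lower()
            image = next((p for p in parts[1:] if not p.startswith("--")), "")
            if image.lower() in names:
                referenced.add(names.index(image.lower()))
            names.append(alias)
        elif instruction == "COPY":
            for p in parts[1:]:
                if p.startswith("--from="):
                    ref = p[len("--from="):].lower()
                    if ref.isdigit():
                        referenced.add(int(ref))
                    elif ref in names:
                        referenced.add(names.index(ref))
    last = len(names) - 1
    return [i for i in range(last) if i not in referenced]
```

Run against the Dockerfile above, this would flag stage 0 (the ubuntu:20.04 stage) as unused, which suggests installing the browser and driver in the same stage that runs the Python app.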

gavrilka commented 3 years ago

A small update for today. Dockerfile:

FROM ubuntu:20.04
RUN apt-get update; apt-get clean
# Add a user for running applications.
RUN useradd apps
RUN mkdir -p /home/apps && chown apps:apps /home/apps

# Install x11vnc.
RUN apt-get install -y x11vnc

# Install xvfb.
RUN apt-get install -y xvfb

# Install fluxbox.
RUN apt-get install -y fluxbox

# Install wget.
RUN apt-get install -y wget

# Install wmctrl.
RUN apt-get install -y wmctrl

RUN apt-get update && \
    apt-get install -y software-properties-common && \
    rm -rf /var/lib/apt/lists/*
# Set the Chrome repo.
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
# Install Chrome.
RUN apt-get update && apt-get -y install google-chrome-unstable && rm -rf /var/lib/apt/lists/*

FROM python:latest
WORKDIR /src
COPY requirements.txt /src
RUN pip install -r requirements.txt
COPY . /src

At first I also tried the stable google-chrome and got this error:

chromedriver: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory

dimaqq commented 3 years ago

I think this is quickly becoming a "how to install chromedriver in Docker" issue and not an arsenic one... One thing you can try to do:

dimaqq commented 3 years ago

Selenium has some of these images and some documentation at https://github.com/SeleniumHQ/docker-selenium (seems a bit convoluted to me, and you don't need selenium, but good for inspiration maybe?)

dimaqq commented 3 years ago

Specifically for libnss, see e.g. https://qiita.com/mh4gf/items/e6e4551bcae0fb745ee8

TL;DR: apt-get install -y google-chrome-stable libnss3 libgconf-2-4
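
The libnss3.so error means the dynamic linker could not find a shared library that the binary depends on. Here is a small sketch for listing all such libraries at once, assuming `ldd` is available in the image. `missing_libs` and `check_binary` are hypothetical helper names, and the parser only looks for the "not found" marker that `ldd` prints.

```python
import subprocess


def missing_libs(ldd_output):
    """Extract the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on a binary and return its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libs(result.stdout)
```

Calling `check_binary` with the path to chromedriver, wherever it ends up in the image, would return names such as libnss3.so, which can then be matched to packages like libnss3.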