ArchiveTeam / grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Add simplistic Dockerfile #149

Open Fusl opened 5 years ago

localleon commented 5 years ago

After I built the container, everything ran as expected, except that gs-server is not started. Maybe we could start gs-server at container runtime.

Fusl commented 5 years ago

@localleon The gs-server runs with the same image but in another container, so:

To start the gs-server:

docker container run -d -p 29000:29000 --restart=unless-stopped --entrypoint gs-server IMAGENAME

And each grab-site job runs in its own Docker container:

docker container run --rm -d -e GRAB_SITE_HOST=172.17.0.1 --name "grab-site_$(cat /proc/sys/kernel/random/uuid)" -v /data:/data:rw IMAGENAME --igon --import-ignores /data/ignores URL
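
The two commands above can be wrapped in a small helper script. This is a minimal sketch, not part of the pull request: the function names `start_server` and `start_job` and the `IMAGE` and `DOCKER` variables are hypothetical, and `IMAGENAME` remains the placeholder used above. The `GRAB_SITE_HOST=172.17.0.1` value is kept as given; it is presumably the Docker default bridge gateway, through which job containers reach the gs-server port published on the host.

```shell
#!/bin/sh
# Hypothetical wrapper around the two docker commands from the comment above.
# IMAGE defaults to the IMAGENAME placeholder; set it to your built image tag.
IMAGE="${IMAGE:-IMAGENAME}"
# Set DOCKER=echo to print the commands instead of running them (dry run).
DOCKER="${DOCKER:-docker}"

# Start the gs-server dashboard in its own long-lived container.
start_server() {
    "$DOCKER" container run -d -p 29000:29000 --restart=unless-stopped \
        --entrypoint gs-server "$IMAGE"
}

# Start one grab-site job for the given URL in its own throwaway container.
start_job() {
    url="$1"
    "$DOCKER" container run --rm -d -e GRAB_SITE_HOST=172.17.0.1 \
        --name "grab-site_$(cat /proc/sys/kernel/random/uuid)" \
        -v /data:/data:rw "$IMAGE" --igon --import-ignores /data/ignores "$url"
}
```

After sourcing the script, call `start_server` once, then `start_job URL` for each site to crawl.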

localleon commented 5 years ago

@Fusl Now I understand, it works great. This pull request should also update the README to explain how to use the Dockerfile.

bknowles commented 3 years ago

For me, this bombs out when I try to do a docker build with:

      gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DTHREAD_STACK_SIZE=0x100000 -fPIC -DUSE__THREAD -DHAVE_SYNC_SYNCHRONIZE -I/usr/include/ffi -I/usr/include/libffi -I/usr/local/include/python3.9 -c c/_cffi_backend.c -o build/temp.linux-x86_64-3.9/c/_cffi_backend.o
      c/_cffi_backend.c:15:10: fatal error: ffi.h: No such file or directory
         15 | #include <ffi.h>
            |          ^~~~~~~
      compilation terminated.
      error: command '/usr/bin/gcc' failed with exit code 1
      ----------------------------------------
  ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-6vliy31x/cffi_dc034e57ef524379affce4b3fdda0b3c/setup.py'"'"'; __file__='"'"'/tmp/pip-install-6vliy31x/cffi_dc034e57ef524379affce4b3fdda0b3c/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-f5drvudy/install-record.txt --single-version-externally-managed --prefix /tmp/pip-build-env-u9n2qtmo/overlay --compile --install-headers /tmp/pip-build-env-u9n2qtmo/overlay/include/python3.9/cffi Check the logs for full command output.
  ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python /usr/local/lib/python3.9/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-u9n2qtmo/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=40.6.0' wheel 'cffi>=1.12; platform_python_implementation != '"'"'PyPy'"'"'' Check the logs for full command output.
The command '/bin/sh -c apk add --no-cache git gcc libxml2-dev musl-dev libxslt-dev g++ re2-dev  && ln -s /usr/include/libxml2/libxml /usr/include/libxml  && pip3 install git+https://github.com/ludios/grab-site.git' returned a non-zero code: 1
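
The `ffi.h: No such file or directory` error means the libffi headers are missing while pip compiles cffi from source. Assuming the base image is Alpine (the `apk` call suggests it is), those headers come from the `libffi-dev` package. A possible workaround, untested against this Dockerfile, is to add that package to the install step; other build dependencies may still be missing, since the log only shows this first failure.

```shell
# Same command as in the failing build step, with libffi-dev added (assumed fix).
apk add --no-cache git gcc libxml2-dev musl-dev libxslt-dev g++ re2-dev libffi-dev \
  && ln -s /usr/include/libxml2/libxml /usr/include/libxml \
  && pip3 install git+https://github.com/ludios/grab-site.git
```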