Closed goforbroke1006 closed 1 year ago
Update Bose by executing the following command: python -m pip install bose --upgrade
In the browser configuration, utilize the "close on crash" and "undetected" options:
class Task(BaseTask):
    task_config = TaskConfig(
        close_on_crash=True,
        use_undetected_driver=True,
    )
Also, Selenium is crashing in Docker because of low memory; increase the container's memory and it should work. Could you share your Dockerfile?
No, I have 2.0.8, which is not too old a version. Yeah, I found this option (close_on_crash), thanks! But I think the guides should mention somewhere that the webdriver requires shared memory (shm), and that if you run it with Docker you have to specify --shm-size, because insufficient shm was the reason for the InvalidSessionIdException.
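For anyone who wants to confirm the shm size from inside the container before starting the browser, here is a minimal sketch using only the Python standard library. The function names and the 512 MiB threshold are assumptions chosen for illustration (512Mb is simply the value that worked in my setup); none of this is part of Bose or Selenium.

```python
import os
import shutil
from typing import Optional

# Hypothetical threshold for illustration; not a documented requirement.
MIN_SHM_BYTES = 512 * 1024 * 1024


def shm_total_bytes(path: str = "/dev/shm") -> Optional[int]:
    """Return the total size of the shared-memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).total


def shm_is_large_enough(min_bytes: int = MIN_SHM_BYTES, path: str = "/dev/shm") -> bool:
    """Return True if the mount has at least min_bytes, or if its size is unknown."""
    total = shm_total_bytes(path)
    # A missing mount (e.g. on a non-Linux host) is treated as "unknown", not as a failure.
    return total is None or total >= min_bytes


if __name__ == "__main__":
    total = shm_total_bytes()
    if total is None:
        print("/dev/shm not found; skipping check")
    elif total < MIN_SHM_BYTES:
        print(f"/dev/shm is only {total // (1024 * 1024)} MiB; consider a larger --shm-size")
    else:
        print(f"/dev/shm is {total // (1024 * 1024)} MiB")
```

Running this at container start makes a too-small /dev/shm visible immediately instead of surfacing later as a crashed browser session.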
@goforbroke1006 I would be interested to see a Selenium Dockerfile for learning purposes. Could you share it?
Yeah, sure! This config fits my purposes:
FROM debian:bookworm-slim
RUN apt update && apt upgrade -y
RUN apt install -y curl unzip
RUN apt-get install python3.11 python3-pip python3.11-venv -y
RUN python3 --version
# https://packages.debian.org/sid/chromium
ARG CHROME_VERSION='114.0.5735.198-1'
ARG CHROMIUM_DEB_VERSION="${CHROME_VERSION}~deb12u1"
# http://chromedriver.storage.googleapis.com/
ARG CHROMEDRIVER_VERSION='114.0.5735.90'
RUN apt install -y \
    chromium-common=$CHROMIUM_DEB_VERSION \
    chromium-sandbox=$CHROMIUM_DEB_VERSION \
    chromium=$CHROMIUM_DEB_VERSION
RUN mkdir -p /code/build/
RUN curl -O -L http://chromedriver.storage.googleapis.com/${CHROMEDRIVER_VERSION}/chromedriver_linux64.zip
RUN unzip ./chromedriver_linux64.zip -d /code/build/
RUN chmod -R 0777 /code/build/
ENV PYTHONUNBUFFERED=1
ENV PYTHONIOENCODING=utf-8
ENV PYTHONLEGACYWINDOWSSTDIO=utf-8
ENV ENV=production
WORKDIR /code/
ADD requirements.txt /code/requirements.txt
RUN python3 -m venv ./venv && . venv/bin/activate && pip3 install -r requirements.txt
COPY ./src /code/src
COPY ./launcher.py /code/launcher.py
COPY ./main.py /code/main.py
RUN echo '#!/bin/bash \n\
\n\
args=$* \n\
\n\
. venv/bin/activate \n\
python3 main.py ${args} \n\
' > /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT [ "/entrypoint.sh" ]
And compose like this:
version: "3.9"
services:
  .base-task: &base-task
    image: docker.io/goforbroke1006/my-awesome-project:latest
    volumes:
      - ./output:/code/output:rw
      - ./profiles:/code/profiles:rw
      - ./tasks:/code/tasks:rw
      - ./local_storage.json:/code/local_storage.json:rw
      - ./profiles.json:/code/profiles.json:rw
    shm_size: "512Mb"
  task1-scan-someting:
    <<: *base-task
    command: someting-one
  task2-scan-someting:
    <<: *base-task
    command: someting-two
Thanks
Description
Each selenium.common.exceptions.InvalidSessionIdException error breaks execution of the bose.launch_tasks.launch_tasks function.
Steps to Reproduce
InvalidSessionIdException inside task.run(self, driver: BoseDriver, data: any)
TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'str'
Expected behavior: the broken task finishes normally
Actual behavior: the broken task stops the whole process, and the next tasks are not executed
Reproduces how often: for sites with bot detection, in 99% of cases
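To illustrate the expected behavior, here is a minimal sketch of running tasks so that one failure does not stop the rest. It is not how Bose implements launch_tasks; the function name run_tasks_isolated and the task callables are hypothetical, and a plain RuntimeError stands in for InvalidSessionIdException so the snippet runs without Selenium installed.

```python
import traceback
from typing import Callable, Dict, List, Tuple


def run_tasks_isolated(tasks: List[Tuple[str, Callable[[], object]]]) -> Dict[str, str]:
    """Run each task in turn, recording failures instead of aborting the whole run."""
    results: Dict[str, str] = {}
    for name, run in tasks:
        try:
            run()
            results[name] = "ok"
        except Exception as exc:
            # Log the stack trace, mark the task as failed, and continue with the next one.
            traceback.print_exc()
            results[name] = f"failed: {type(exc).__name__}"
    return results


def crashing_task() -> None:
    # Stand-in for a task whose browser session died mid-run.
    raise RuntimeError("invalid session id")


def healthy_task() -> None:
    pass


if __name__ == "__main__":
    summary = run_tasks_isolated([
        ("task1", crashing_task),
        ("task2", healthy_task),
    ])
    print(summary)
```

With this pattern the second task still runs after the first one crashes, which is the behavior I expected from the launcher.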
Additional context
Can't reproduce on the host machine, only inside a Docker container.
Full stack-trace: