Open sogand145 opened 4 months ago
I am experiencing the same issue: the process is killed during bounding box detection. I am using the following Dockerfile:
FROM bitnami/pytorch
USER root
# Update container
RUN apt-get update
RUN apt-get upgrade -y
# Open GL
RUN apt-get install -y \
    libgl1-mesa-glx \
    libglib2.0-0
RUN rm -rf /var/lib/apt/lists/*
USER 1001
# Marker
RUN pip install marker-pdf
and the following Docker Compose YAML:
services:
  pdf-service:
    build:
      context: .
      dockerfile: build/PdfService.dockerfile
    tty: true
    ports:
      - "80:8484"
    volumes:
      - <pwd>/cache:/app/cache
      - <pwd>/src:/app/src
      - <pwd>/out:/app/out
      - ${HOME}/Downloads:/app/storage
    environment:
      - HF_HOME=/app/cache
      - HOME=/app/cache
and attempting a single file parse in the container.
marker_single ./storage/<file> ./out
@sogand145, I have managed to get past this being killed business. This is a RAM-intensive tool, so the solution is to significantly increase the resources available to Docker. I cracked it open to 13.5GB RAM in the Docker Desktop settings, and added the following to my compose YAML. Now it successfully detects a few bounding boxes before encountering a new and exciting error.
deploy:
  resources:
    limits:
      memory: 12G
      cpus: '6'
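One way to confirm that a raised limit actually reaches the container is to read the cgroup memory files from inside it. The sketch below assumes a Linux container that exposes the usual cgroup v2 or v1 paths. `read_memory_limit` is a hypothetical helper name, not part of marker or Docker.

```python
from pathlib import Path
from typing import Iterable, Optional

# Usual cgroup locations inside a Linux container (v2 first, then v1).
# Assumption: your container exposes one of these paths.
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
)


def read_memory_limit(paths: Iterable[str] = CGROUP_LIMIT_FILES) -> Optional[int]:
    """Return the memory limit in bytes, or None if no limit is found."""
    for path in paths:
        try:
            raw = Path(path).read_text().strip()
        except OSError:
            continue  # file missing or unreadable; try the next location
        if raw == "max":
            return None  # cgroup v2 spelling of "unlimited"
        if raw.isdigit():
            value = int(raw)
            # cgroup v1 reports "unlimited" as a huge number close to 2**63
            return None if value >= 2**60 else value
    return None


if __name__ == "__main__":
    limit = read_memory_limit()
    if limit is None:
        print("no cgroup memory limit detected")
    else:
        print(f"memory limit: {limit / 2**30:.1f} GiB")
```

If the printed value is lower than the limit set in the compose file, the Docker Desktop (or WSL2) VM allocation is probably the tighter bound.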
Same problem here. I'm running Windows Server 2022 with a WSL2 Ubuntu environment. Memory limits should not be an issue, because it's limited to 64GB per machine...
marker_single blows WSL up to more than 32GB of used memory on a 7.6MB PDF file with 500 pages, and then it's killed.
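To put a number on reports like this one, a small wrapper can run the command and record the peak resident memory of the child process. This is only a diagnostic sketch: it assumes Linux (including WSL2), where `ru_maxrss` is reported in kilobytes, and `run_and_report_peak` is a hypothetical name.

```python
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run cmd to completion; return (exit code, peak RSS in MiB)."""
    proc = subprocess.run(cmd)
    # RUSAGE_CHILDREN covers child processes that have been waited for.
    # On Linux, ru_maxrss is the largest peak among them, in kilobytes.
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, peak_kib / 1024


if __name__ == "__main__" and len(sys.argv) > 1:
    code, peak_mib = run_and_report_peak(sys.argv[1:])
    # A return code of -9 means the child was ended by SIGKILL,
    # the signal the kernel OOM killer sends.
    print(f"exit code: {code}, peak RSS: {peak_mib:.0f} MiB")
```

Saved under a name of your choice (for example `peak_mem.py`), it could be invoked as `python peak_mem.py marker_single ./storage/<file> ./out`.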
Hi, I'm using docker-compose to use marker in the container, but I get this error:
And this is my Dockerfile: `FROM python:3.9-bullseye
RUN apt-get update && apt-get upgrade -y
RUN apt install build-essential libpoppler-cpp-dev pkg-config python3-dev openjdk-11-jdk ghostscript ocrmypdf -y
ENV JAVA_HOME /usr/lib/jvm/java-11-openjdk-amd64
ENV OCR_ENGINE=ocrmypdf
ENV TORCH_DEVICE=cpu
RUN pip install marker-pdf ocrmypdf
WORKDIR /app
COPY ./by_marker-pdf /app/
CMD ["/bin/bash"]`
I should add that there are no files in the "marker" folder, and I don't know how to change the default values in settings.py. I want to use the ocrmypdf OCR engine and run on the CPU instead of the GPU. How can I change the default values?
Thanks in advance
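A possible approach, sketched under one assumption: marker's settings can be overridden through environment variables with the same names as the fields in settings.py (`TORCH_DEVICE` and `OCR_ENGINE` are the names already used in the Dockerfile above). If that holds for the installed version, the overrides can be passed per invocation without editing settings.py. `marker_env` is a hypothetical helper, and the input and output paths are placeholders.

```python
import os
import subprocess


def marker_env(overrides, base=None):
    """Return a copy of the environment with setting overrides applied."""
    env = dict(os.environ if base is None else base)
    env.update({key: str(value) for key, value in overrides.items()})
    return env


if __name__ == "__main__":
    env = marker_env({"TORCH_DEVICE": "cpu", "OCR_ENGINE": "ocrmypdf"})
    # Placeholder paths; replace with a real input file and output folder.
    # subprocess.run(["marker_single", "input.pdf", "out"], env=env, check=True)
    print(env["TORCH_DEVICE"], env["OCR_ENGINE"])
```

The same variables set with `ENV` in the Dockerfile or under `environment:` in a compose file would have an equivalent effect, under the same assumption.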