Closed schwabenheinz closed 2 years ago
Hi @schwabenheinz, you may use the docker-compose.yaml below as a reference for what you want to do.
The problem is that you mount your files at the wrong location. As you can see in the yaml file below, it expects you to mount them at /in and /out.
I guess you are trying to pass environment variables to the ocrmypdf base image, but I don't know whether the ocrmypdf binary actually parses environment variables on startup. My guess is that it expects command line arguments instead, which have to be passed to the watchdog application via the OCRMYPDF_PARAMETER environment variable.
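To illustrate what "passing command line arguments via an environment variable" can look like, here is a minimal Python sketch. It is an assumption about how such a wrapper might work, not the watchdog's actual code: the function name `build_ocrmypdf_command` and the file paths are hypothetical, and only the parameter string is taken from the compose file in this thread.

```python
import shlex


def build_ocrmypdf_command(binary, parameter, input_file, output_file):
    """Build an argv list: binary, the split parameter string, then input and output."""
    # shlex.split follows shell-style quoting, so a quoted value stays one argument.
    return [binary, *shlex.split(parameter), input_file, output_file]


# Hypothetical example paths; the parameter string mirrors OCRMYPDF_PARAMETER.
cmd = build_ocrmypdf_command(
    "ocrmypdf",
    "-l eng+fra+deu --rotate-pages --deskew --jobs 4 --output-type pdfa",
    "/in/scan.pdf",
    "/out/scan.pdf",
)
print(cmd)
```

The point is that options such as --deskew end up as ordinary arguments of the ocrmypdf call, which is why setting them as separate environment variables on the container would have no effect in this sketch.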
Create a file called docker-compose.yaml:
version: '3'
services:
  ocrmypdf-watchdog:
    container_name: OCRmyPDF
    network_mode: none
    image: darthbermel/ocrmypdf-watchdog:latest
    #build: . # this can be used in the directory with the Dockerfile in order to build the image locally and not to fetch it from dockerhub
    restart: always
    environment:
      OCRMYPDF_IN: /in
      OCRMYPDF_OUT: /out
      WATCHDOG_FREQUENCY: 1
      WATCHDOG_EXTENSIONS: pdf,jpg,jpeg,tif,tiff,png,gif
      OCRMYPDF_BINARY: ocrmypdf
      OCRMYPDF_PARAMETER: -l eng+fra+deu --rotate-pages --deskew --jobs 4 --output-type pdfa
    volumes:
      - /home/riedocker/hidrive/public/scans/input:/in
      - /home/riedocker/hidrive/public/scans/fertig:/out
Then execute the following command in the directory that contains the file:
docker compose up -d
I also provide a watchdog application that builds on this application's idea but takes a different approach: instead of frequently checking whether there are new files in the folder, it gets notified by the file system when new files appear (GitHub.com/jxsl13/ocrmypdf-watchdog)
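For comparison, the polling approach can be sketched roughly as follows. This is only an illustration under assumptions, not code from either project: `find_new_files` is a hypothetical helper, and the extension list is written in the same comma-separated form as WATCHDOG_EXTENSIONS.

```python
import tempfile
from pathlib import Path


def find_new_files(directory, extensions, seen):
    """Return watched files in `directory` that were not reported before."""
    # "pdf,jpg" becomes {".pdf", ".jpg"}; matching ignores upper/lower case.
    wanted = {"." + ext.strip().lower() for ext in extensions.split(",")}
    new_files = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in wanted and path not in seen:
            seen.add(path)  # remember the file so the next scan skips it
            new_files.append(path)
    return new_files


# Small demonstration in a temporary folder standing in for /in.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.pdf", "b.txt", "c.PNG"):
        (Path(folder) / name).touch()
    seen = set()
    first = [p.name for p in find_new_files(folder, "pdf,jpg,png", seen)]
    second = [p.name for p in find_new_files(folder, "pdf,jpg,png", seen)]

print(first, second)
```

A polling watcher has to repeat such a scan on a timer, whereas a notification-based watcher is told about each new file by the file system and does not need to list the folder again.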
The container image you are trying to run is not this project but a completely different one: https://ocrmypdf.readthedocs.io/en/latest/batch.html#watched-folders-with-docker
Hello @jxsl13, yes, you are right. I tried a lot of different ways and also packages, and in between I mixed them up. I will follow your suggestion and report back afterwards.
Hello @jxsl13, after I did it right, everything works as expected. Thank you very much for your help, and sorry for my confusion!
One additional question: your documentation lists the parameter --frequency <in seconds> along with the environment variable WATCHDOG_FREQUENCY. Because of the parameter description, I assumed the value of WATCHDOG_FREQUENCY is also in seconds. My intention was to scan the folder for new files once per hour, i.e. every 3600 seconds. Did I understand that correctly? Thank you again in advance! Greetings from Germany, Schwabenheinz
I do not know exactly which documentation you are referring to when you mention the --frequency parameter.
In this docker image/watchdog application, you set the environment variable WATCHDOG_FREQUENCY: 3600 and the value is in seconds.
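As a rough illustration of what a frequency in seconds means for a polling watcher, here is a small Python sketch. It is an assumption about the general pattern, not the watchdog's real implementation: the `poll` function and its `rounds` and `sleep` parameters are hypothetical and exist only so the example terminates and can be run without waiting.

```python
import time


def poll(scan, frequency_seconds, rounds, sleep=time.sleep):
    """Call `scan` once per round and wait `frequency_seconds` between rounds."""
    for i in range(rounds):
        scan()
        if i < rounds - 1:
            sleep(frequency_seconds)


# Record the waits instead of really sleeping for an hour each time.
scans = []
waits = []
poll(lambda: scans.append("scan"), 3600, rounds=3, sleep=waits.append)
print(len(scans), waits)
```

With a value of 3600, a file dropped into the input folder right after a scan would wait up to one hour before it is picked up in this sketch.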
The base for my question was the README.md document in this project
I see.
can be closed.
I am starting to use OCRmyPDF on Ubuntu Server 20.04. For this I installed the docker container with the following parameters:
docker run \
  -v /home/riedocker/hidrive/public/scans/input:/input \
  -v /home/riedocker/hidrive/public/scans/fertig:/output \
  -e OCR_OUTPUT_DIRECTORY_YEAR_MONTH=0 \
  -e OCR_ON_SUCCESS_DELETE=1 \
  -e OCR_DESKEW=1 \
  -e ROTATE-PAGES=1 \
  -e OUTPUT-TYPE=pdfa \
  -e PYTHONUNBUFFERED=1 \
  -u root:root \
  -it --entrypoint python3 \
  jbarlow83/ocrmypdf \
  watcher.py
I think it is important to know that both -v mounts are linked/mounted to an external hidrive provider. I first tried to start without -u root:root; the second try was with it. In both cases all files are available. The container runs fine, and the log files contain only a few lines like:
Starting OCRmyPDF watcher with config:
Input Directory: /input
Output Directory: /output
Output Directory Year & Month: False
--> Inside the container all files are available and everything seems fine.
But nothing happens: no scan and no OCR process is started. Then I added WATCHDOG_FREQUENCY=3600 to the config, but also without success. I could not find the reason for it, so every hint would be welcome. Thanks in advance.