Open peterpan192 opened 5 months ago
I think for a bare metal installation you can ignore the docker-compose.yml step. All it does is tell docker to allow paperless-ngx to access the folder on the docker host where the postprocessor script lives, and you shouldn't need to do that in a bare metal installation. Just make sure that in paperless.conf, the variable PAPERLESS_POST_CONSUME_SCRIPT points to the [post_consume_script.sh](https://github.com/jgillula/paperless-ngx-postprocessor/blob/main/post_consume_script.sh) file in the postprocessor git repo, and that the paperless user can read that directory (and execute the post-consume script).
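The permission requirement above can be checked before wiring anything into paperless-ngx. The following is only a minimal sketch: the function name check_post_consume_access is made up for illustration, and it tests access for whichever user runs it, so it would need to be run as the paperless user to be meaningful.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of paperless-ngx or the postprocessor):
# reports whether the *current* user can reach and execute a post-consume script.
check_post_consume_access() {
    local script="$1"
    local dir
    dir="$(dirname -- "$script")"
    local status=0

    # The directory must be readable and searchable (the x bit on a directory).
    if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
        echo "cannot read or enter directory: $dir" >&2
        status=1
    fi

    # The script itself must exist and be executable.
    if [ ! -x "$script" ]; then
        echo "cannot execute script: $script" >&2
        status=1
    fi

    return "$status"
}
```

To use it, save the function to a file, add a final line that calls it with the path from PAPERLESS_POST_CONSUME_SCRIPT, and run that file with sudo -Hu paperless bash. No output and exit status 0 mean both checks passed.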
For the one-time setup script (setup_venv.sh), you can just run it as the paperless user in the directory where you checked out the postprocessor, i.e. something like:
cd /whichever/directory/you-checked-the/paperless-ngx-postprocessor/repo/out/
sudo -Hu paperless ./setup_venv.sh
Let me know if that works.
I think it kind of worked (at least I could run the sh script), but when performing a dry run with
sudo -Hu paperless /bin/bash -c 'source venv/bin/activate && ./paperlessngx_postprocessor.py --dry-run'
I get this error:
[2024-06-09 17:51:31,412] [INFO] [paperlessngx_postprocessor] Doing a dry run. No changes will be made.
Traceback (most recent call last):
File "/opt/paperless/.local/lib/python3.11/site-packages/asgiref/local.py", line 89, in _lock_storage
asyncio.get_running_loop()
RuntimeError: no running event loop
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/paperless/.local/lib/python3.11/site-packages/django/utils/connection.py", line 58, in __getitem__
return getattr(self._connections, alias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/asgiref/local.py", line 118, in __getattr__
return getattr(storage, key)
^^^^^^^^^^^^^^^^^^^^^
AttributeError: '_thread._local' object has no attribute 'default'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/paperless/paperless-ngx-postprocessor/paperlessngx_postprocessor/get_auth_token.py", line 14, in get_auth_token
cursor = connection.cursor()
^^^^^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/django/utils/connection.py", line 15, in __getattr__
return getattr(self._connections[self._alias], item)
File "/opt/paperless/.local/lib/python3.11/site-packages/django/utils/connection.py", line 60, in __getitem__
if alias not in self.settings:
^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/django/utils/functional.py", line 47, in __get__
res = instance.__dict__[self.name] = self.func(instance)
^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/django/utils/connection.py", line 45, in settings
self._settings = self.configure_settings(self._settings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/django/db/utils.py", line 148, in configure_settings
databases = super().configure_settings(databases)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/django/utils/connection.py", line 50, in configure_settings
settings = getattr(django_settings, self.settings_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/django/conf/__init__.py", line 89, in __getattr__
self._setup(name)
File "/opt/paperless/.local/lib/python3.11/site-packages/django/conf/__init__.py", line 76, in _setup
self._wrapped = Settings(settings_module)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/.local/lib/python3.11/site-packages/django/conf/__init__.py", line 190, in __init__
mod = importlib.import_module(self.SETTINGS_MODULE)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
File "<frozen importlib._bootstrap>", line 1128, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
File "<frozen importlib._bootstrap>", line 1142, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'paperless'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/paperless/paperless-ngx-postprocessor/./paperlessngx_postprocessor.py", line 65, in <module>
api = PaperlessAPI(config["paperless_api_url"],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/paperless-ngx-postprocessor/paperlessngx_postprocessor/paperless_api.py", line 22, in __init__
auth_token = get_auth_token(paperless_src_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/paperless/paperless-ngx-postprocessor/paperlessngx_postprocessor/get_auth_token.py", line 27, in get_auth_token
raise RuntimeError(f"Couldn't find paperless-ngx's source code in {paperless_src_dir}")
RuntimeError: Couldn't find paperless-ngx's source code in /usr/src/paperless/src
Very helpful logs, thank you!
That last error message is the key: to interact with paperless-ngx over its REST API, the postprocessor needs an auth token. By default the postprocessor tries to get that auth token by pretending to be paperless itself and extracting it from paperless-ngx's auth database. But to do that, it needs to know where the source code for paperless-ngx is.
There may be two ways to solve this:
1. Set a PNGX_POSTPROCESSOR_AUTH_TOKEN=<token> value. Normally this would go in docker-compose.env, but for a bare metal install try putting it in paperless.conf. You can get the auth token from paperless-ngx's django admin (e.g. http://localhost:8000/admin/authtoken/tokenproxy/).
2. Set a PNGX_POSTPROCESSOR_PAPERLESS_SRC_DIR=<directory> value (also in paperless.conf). I think the value you want is probably /opt/paperless/src, but it will depend on where you put the source code for paperless-ngx.

You can also provide the --auth-token option when doing a dry-run.
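As a rough sketch of the two options above, the paperless.conf entries might look like the following. The token value is a placeholder to be replaced, /opt/paperless/src is only the guess mentioned above, and whether the postprocessor picks these up from paperless.conf on a bare metal install is an assumption worth verifying with a dry run.

```shell
# Option 1: hand the postprocessor an auth token directly.
# "<token>" is a placeholder; copy the real value from the Django admin page.
PNGX_POSTPROCESSOR_AUTH_TOKEN="<token>"

# Option 2: tell the postprocessor where the paperless-ngx source lives so it
# can look the token up itself. The path is a guess and depends on your install.
PNGX_POSTPROCESSOR_PAPERLESS_SRC_DIR="/opt/paperless/src"
```

Either entry on its own should be enough; both are shown only for comparison.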
Alright, I got one step closer. post_consume_script.sh gets started when I upload a new document to my paperless instance. In order for the script to run, I had to add the user paperless to the sudoers file so that it is not prompted for a password. Only this way does the Python script actually run from the sh script. I edited post_consume_script.sh like this:
RUN_DIR=$( dirname -- "$( readlink -f -- "$0"; )" )
sudo -Hu paperless /bin/bash -c "source $RUN_DIR/venv/bin/activate && python $RUN_DIR/paperlessngx_postprocessor.py --auth-token xyz --rulesets-dir /opt/paperless/paperless-ngx-postprocessor/rulesets.d --process all"
However, this way the post-processing script processes all files in my library every time I upload a new document. How can I make it process only the newly uploaded document?
Interesting. So does that mean paperless-ngx wasn't already running the post-consume script as the paperless user? It's weird to me that you had to sudo as paperless in the post-consume script. But if that's what works, that's what works. 🙂
For processing only the new document: when paperless-ngx calls post_consume_script.sh, one of the environment variables paperless-ngx sets should be DOCUMENT_ID, which refers to the ID of the new document. That's how post_consume_script.sh knows which document to process.
I would try changing that last line to:
sudo -HEu paperless /bin/bash -c "source $RUN_DIR/venv/bin/activate && python $RUN_DIR/paperlessngx_postprocessor.py --auth-token xyz --rulesets-dir /opt/paperless/paperless-ngx-postprocessor/rulesets.d process --document-id $DOCUMENT_ID"
(Note the added -E being passed to sudo: this tells sudo to preserve the environment variables, which you probably need for $DOCUMENT_ID to get passed through.)
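If DOCUMENT_ID ever arrives empty (for example when the script is started by hand rather than by paperless-ngx), the command above would end in a bare --document-id. A small guard can catch that. This is only a sketch: the function name postprocessor_args is invented for illustration, and only the process --document-id arguments come from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the postprocessor arguments for the document
# paperless-ngx just consumed, or fail if DOCUMENT_ID was not passed through.
postprocessor_args() {
    if [ -z "${DOCUMENT_ID:-}" ]; then
        echo "DOCUMENT_ID is not set; refusing to run" >&2
        return 1
    fi
    printf 'process --document-id %s' "$DOCUMENT_ID"
}
```

In post_consume_script.sh, the output of postprocessor_args could replace the hard-coded process --document-id $DOCUMENT_ID part, with the script exiting early when the function fails.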
I'm not sure. The script was triggered by paperless-ngx, so it must have been executed as user paperless. However, nothing happened. It does not make sense to me either, but it's working. ;-) Your suggestion with process --document-id $DOCUMENT_ID works flawlessly. Thank you so much!
Hey, I'm struggling with the installation on my bare metal setup on Raspberry Pi OS (aarch64). There is no docker-compose.env, but I am pretty sure the equivalent on my system is paperless.conf. However, I have not found out which file corresponds to docker-compose.yml, or how to get the one-time setup script going. Any help is appreciated.