the-paperless-project / paperless

Scan, index, and archive all of your paper documents
GNU General Public License v3.0

Importer and then can't access original file #680

Closed (clinis closed this issue 4 years ago)

clinis commented 4 years ago

I'm trying to restore my paperless setup, which I deleted by accident. Luckily, I regularly used the Exporter to keep a backup. I am using the Docker setup.

I followed the instructions for the Importer:

  1. run docker-compose up to set up all the volumes and containers;
  2. run docker-compose run --rm webserver createsuperuser;
  3. run docker-compose run --rm consumer document_importer /export.

Here are the commands and results:

in ~/Documents/GitHub/paperless on master
$ docker-compose up
in ~/Documents/GitHub/paperless on master
$ docker-compose run --rm webserver createsuperuser
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, documents, reminders, sessions
Running migrations:
  No migrations to apply.
Username (leave blank to use 'paperless'): clinis
Email address: 
Password: 
Password (again): 
Superuser created successfully.
in ~/Documents/GitHub/paperless on master
$ docker-compose run --rm consumer document_importer /export
Starting paperless_webserver_1 ... done
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, documents, reminders, sessions
Running migrations:
  No migrations to apply.
fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/community/x86_64/APKINDEX.tar.gz
(1/1) Installing tesseract-ocr-data-por (4.1.0-r0)
OK: 304 MiB in 128 packages
Installed 11 object(s) from 1 fixture(s)
Encrypting 20170524154451-my-document.pdf and saving it to /usr/src/paperless/src/../media/documents/originals/0000001.pdf
Encrypting 20170524154451-my-document.pdf-thumbnail.png and saving it to /usr/src/paperless/src/../media/documents/thumbnails/0000001.png
in ~/Documents/GitHub/paperless on master
$ docker-compose up
paperless_webserver_1 is up-to-date
Starting paperless_consumer_1 ... done
Attaching to paperless_webserver_1, paperless_consumer_1
webserver_1  | Operations to perform:
webserver_1  |   Apply all migrations: admin, auth, contenttypes, documents, reminders, sessions
webserver_1  | Running migrations:
webserver_1  |   No migrations to apply.
webserver_1  | [2020-06-15 11:38:14 +0000] [1] [INFO] Starting gunicorn 19.9.0
[... normal webserver stuff ...]
webserver_1  | 127.0.0.1 - - [15/Jun/2020:11:40:59 +0000] "GET / HTTP/1.1" 301 0 "-" "curl/7.66.0"
consumer_1   | Traceback (most recent call last):
consumer_1   |   File "/usr/src/paperless/src/manage.py", line 11, in <module>
consumer_1   |     execute_from_command_line(sys.argv)
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/management/__init__.py", line 371, in execute_from_command_line
consumer_1   |     utility.execute()
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/management/__init__.py", line 365, in execute
consumer_1   |     self.fetch_command(subcommand).run_from_argv(self.argv)
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/management/base.py", line 288, in run_from_argv
consumer_1   |     self.execute(*args, **cmd_options)
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/management/base.py", line 332, in execute
consumer_1   |     self.check()
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/management/base.py", line 361, in check
consumer_1   |     all_issues = self._run_checks(
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/management/commands/migrate.py", line 58, in _run_checks
consumer_1   |     issues.extend(super()._run_checks(**kwargs))
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/management/base.py", line 351, in _run_checks
consumer_1   |     return checks.run_checks(**kwargs)
consumer_1   |   File "/usr/lib/python3.8/site-packages/django/core/checks/registry.py", line 73, in run_checks
consumer_1   |     new_errors = check(app_configs=app_configs)
consumer_1   |   File "/usr/src/paperless/src/documents/checks.py", line 28, in changed_password_check
consumer_1   |     if not GnuPG.decrypted(encrypted_doc.source_file):
consumer_1   |   File "/usr/src/paperless/src/documents/models.py", line 402, in source_file
consumer_1   |     return open(self.source_path, "rb")
consumer_1   | FileNotFoundError: [Errno 2] No such file or directory: '/usr/src/paperless/src/../media/documents/originals/0000001.pdf.gpg'
paperless_consumer_1 exited with code 0

If it helps, in the manifest.json file the document has "storage_type": "unencrypted". I looked through the existing issues and it seems to be related to this one, but the error is slightly different and I am following the instructions.
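For anyone who wants to check the same thing in their own export, here is a minimal Python sketch that counts the "storage_type" values in a manifest. The manifest layout is an assumption on my part (a JSON list of records, with the value either under a "fields" key or at the top level of each record), and `storage_types` and the sample data are made up for illustration.

```python
import json
from collections import Counter


def storage_types(manifest_text):
    """Count the "storage_type" values found in an export manifest.

    Assumption: the manifest is a JSON list of records, and a record's
    storage type sits either under a "fields" key or at the top level.
    """
    counts = Counter()
    for record in json.loads(manifest_text):
        fields = record.get("fields", {})
        value = fields.get("storage_type", record.get("storage_type"))
        if value is not None:
            counts[value] += 1
    return dict(counts)


# Hypothetical one-document manifest, shaped like the assumption above.
sample = json.dumps([
    {"model": "documents.document", "pk": 1,
     "fields": {"title": "my-document", "storage_type": "unencrypted"}},
])
print(storage_types(sample))
```

With a real export, the text would come from reading manifest.json in the export directory instead of the inline sample.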

What am I missing, or what's wrong?

BTW: should I be using the Importer or the Restoring instructions? I tried both, but neither worked.

Tooa commented 4 years ago

From my understanding, the Importer and Exporter are the right utils for backing up and restoring your documents.

$ docker-compose run --rm consumer document_importer /export

Do you have the PASSPHRASE variable set when running the command with docker-compose? Because it looks like when you start the container something gets encrypted:

Encrypting 20170524154451-my-document.pdf and saving it to /usr/src/paperless/src/../media/documents/originals/0000001.pdf
Encrypting 20170524154451-my-document.pdf-thumbnail.png and saving it to /usr/src/paperless/src/../media/documents/thumbnails/0000001.png

and later in the stack trace, the application complains about these exact files:

consumer_1   | FileNotFoundError: [Errno 2] No such file or directory: '/usr/src/paperless/src/../media/documents/originals/0000001.pdf.gpg'
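One way to read the two log excerpts side by side is that the file was written under one name and later looked up under another. The Python sketch below only illustrates that mismatch. `source_path` is a hypothetical helper, not paperless code, and the rule that "gpg" documents get an extra ".gpg" suffix is an assumption inferred from the two paths in the logs.

```python
import posixpath

# Directory taken from the paths in the log output above.
ORIGINALS_DIR = "/usr/src/paperless/src/../media/documents/originals"


def source_path(pk, file_type, storage_type):
    """Hypothetical helper: where a document's original would be looked up.

    Assumption: documents stored as "gpg" carry an extra ".gpg" suffix,
    which is what the two paths in the logs suggest.
    """
    name = "{:07d}.{}".format(pk, file_type)
    if storage_type == "gpg":
        name += ".gpg"
    return posixpath.join(ORIGINALS_DIR, name)


written = source_path(1, "pdf", "unencrypted")  # matches the importer output
looked_up = source_path(1, "pdf", "gpg")        # matches the FileNotFoundError
print(written)
print(looked_up)
print(written == looked_up)
```

If that reading is right, the document record ends up marked as encrypted while the file on disk still has the unencrypted name, which would explain why the lookup fails.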

Because you run the commands with docker-compose, maybe your environment gets in your way. Can you test the process in an environment where encryption is disabled using docker run? Such as:

$ docker run --rm \
    --volume /mnt/user/appdata/paperless:/usr/src/paperless/data \
    --volume /mnt/user/scans/media:/usr/src/paperless/media \
    --volume /path/to/backup/place:/export \
    -e USERMAP_UID=99 -e USERMAP_GID=100 \
    paperless-consumer document_exporter /export
$ docker run --rm \
    --volume /mnt/user/appdata/paperless:/usr/src/paperless/data \
    --volume /mnt/user/scans/media:/usr/src/paperless/media \
    --volume /path/to/backup/place:/export \
    -e USERMAP_UID=99 -e USERMAP_GID=100 \
    paperless-consumer document_importer /export

clinis commented 4 years ago

Yep, it was getting PAPERLESS_PASSPHRASE from docker-compose.env. I commented it out, and the Importer worked! Thanks.
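To check quickly whether the variable is still active in an env file, something like the following Python sketch could help. It assumes the usual env-file conventions (one KEY=VALUE per line, lines starting with "#" are comments). `active_env_value` is a made-up helper, and the passphrase value in the sample is a placeholder.

```python
def active_env_value(env_text, name):
    """Return the value assigned to `name` in env-file text, or None.

    Lines starting with "#" are treated as comments, so a commented-out
    assignment does not count as set. The last assignment wins.
    """
    value = None
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            value = val.strip()
    return value


before = "PAPERLESS_PASSPHRASE=secret\n"   # placeholder value
after = "#PAPERLESS_PASSPHRASE=secret\n"   # same line, commented out
print(active_env_value(before, "PAPERLESS_PASSPHRASE"))
print(active_env_value(after, "PAPERLESS_PASSPHRASE"))
```

For a real check, the text would be read from docker-compose.env. A result of None means the variable is not set by that file, though it could still come from elsewhere in the environment.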