the-paperless-project / paperless

Scan, index, and archive all of your paper documents
GNU General Public License v3.0

Docker install from scratch doesn't run migrate #299

Open TeraHz opened 6 years ago

TeraHz commented 6 years ago

Just installed paperless for the first time from scratch in docker, and the createsuperuser step was failing due to missing tables. There was a warning about migrations that needed to be done. This was also causing the consumer container to fail to start. I ran the migrations and everything started up nicely.

Off to feed it some PDFs now to see how it does!

danielquinn commented 6 years ago

This is odd because there's a line in the Dockerfile to run migrations. I'm not much of a Docker pro, but I just did the following and it worked for me:

  1. Cloned this repo
  2. Added two files: docker-compose.env:

    PAPERLESS_PASSPHRASE=CHANGE_ME
    PAPERLESS_OCR_THREADS=16
    PAPERLESS_OCR_LANGUAGES=eng nld

    docker-compose.yml:

    version: '2'
    
    services:
        webserver:
            build: ./
            ports:
                - "8000:8000"
            volumes:
                - data:/usr/src/paperless/data
                - media:/usr/src/paperless/media
            env_file: docker-compose.env
            environment:
                - PAPERLESS_OCR_LANGUAGES=
            command: ["runserver", "--insecure", "0.0.0.0:8000"]
    
        consumer:
            build: ./
            volumes:
                - data:/usr/src/paperless/data
                - media:/usr/src/paperless/media
                - /tmp/paperless/consume:/consume
                - /tmp/paperless/export:/export
            env_file: docker-compose.env
            command: ["document_consumer"]
    
    volumes:
        data:
        media:
  3. Ran docker-compose up

This resulted in a lot of output, but importantly it included this section:

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, documents, reminders, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying documents.0001_initial... OK
  Applying documents.0002_auto_20151226_1316... OK
  Applying documents.0003_sender... OK
  Applying documents.0004_auto_20160114_1844... OK
  Applying documents.0005_auto_20160123_0313... OK
  Applying documents.0006_auto_20160123_0430... OK
  Applying documents.0007_auto_20160126_2114... OK
  Applying documents.0008_document_file_type... OK
  Applying documents.0009_auto_20160214_0040... OK
  Applying documents.0010_log... OK
  Applying documents.0011_auto_20160303_1929... OK
  Applying documents.0012_auto_20160305_0040... OK
  Applying documents.0013_auto_20160325_2111... OK
  Applying documents.0014_document_checksum... OK
  Applying documents.0015_add_insensitive_to_match... OK
  Applying documents.0016_auto_20170325_1558... OK
  Applying documents.0017_auto_20170512_0507... OK
  Applying documents.0018_auto_20170715_1712... OK
  Applying reminders.0001_initial... OK
  Applying sessions.0001_initial... OK

...which is what creates the database. If your methodology was different, we can see what went wrong where, but at this point, I can't reproduce the problem 😞.

TeraHz commented 6 years ago

Right, and after that can you run docker-compose run --rm webserver createsuperuser?

I'm following the steps listed here.

Here is my almost full session and configs:

[root@scans paperless]# docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

[root@scans paperless]# docker images -a
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

[root@scans paperless]# cat docker-compose.env
# Environment variables to set for Paperless
# Commented out variables will be replaced by a default within Paperless.

# Passphrase Paperless uses to encrypt and decrypt your documents
PAPERLESS_PASSPHRASE=<MASKED>
PAPERLESS_OCR_THREADS=4
PAPERLESS_OCR_LANGUAGES=esp bul

[root@scans paperless]# cat docker-compose.yml
version: '2'

services:
    webserver:
        build: ./
        ports:
            - "8000:8000"
        volumes:
            - /nfs/paperless/data:/usr/src/paperless/data
            - /nfs/paperless/media:/usr/src/paperless/media
        env_file: docker-compose.env
        environment:
            - PAPERLESS_OCR_LANGUAGES=
        command: ["runserver", "--insecure", "0.0.0.0:8000"]

    consumer:
        build: ./
        volumes:
            - /nfs/paperless/data:/usr/src/paperless/data
            - /nfs/paperless/media:/usr/src/paperless/media
            - /nfs/paperless/consume:/consume
        env_file: docker-compose.env
        command: ["document_consumer"]

volumes:
    data:
    media:

[root@scans paperless]# docker-compose up -d
<snip build log>
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, documents, reminders, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying documents.0001_initial... OK
  Applying documents.0002_auto_20151226_1316... OK
  Applying documents.0003_sender... OK
  Applying documents.0004_auto_20160114_1844... OK
  Applying documents.0005_auto_20160123_0313... OK
  Applying documents.0006_auto_20160123_0430... OK
  Applying documents.0007_auto_20160126_2114... OK
  Applying documents.0008_document_file_type... OK
  Applying documents.0009_auto_20160214_0040... OK
  Applying documents.0010_log... OK
  Applying documents.0011_auto_20160303_1929... OK
  Applying documents.0012_auto_20160305_0040... OK
  Applying documents.0013_auto_20160325_2111... OK
  Applying documents.0014_document_checksum... OK
  Applying documents.0015_add_insensitive_to_match... OK
  Applying documents.0016_auto_20170325_1558... OK
  Applying documents.0017_auto_20170512_0507... OK
  Applying documents.0018_auto_20170715_1712... OK
  Applying reminders.0001_initial... OK
  Applying sessions.0001_initial... OK
Removing intermediate container 5b34134a2d89
 ---> 2fe87bf94a2c
Step 10/13 : WORKDIR /usr/src/paperless/src
Removing intermediate container a8cde7e97538
 ---> 76fed0c00539
Step 11/13 : VOLUME ["/usr/src/paperless/data", "/usr/src/paperless/media", "/consume", "/export"]
 ---> Running in 12470e2aafaf
Removing intermediate container 12470e2aafaf
 ---> 4770a5bb9dec
Step 12/13 : ENTRYPOINT ["/sbin/docker-entrypoint.sh"]
 ---> Running in 2098d09cbfe1
Removing intermediate container 2098d09cbfe1
 ---> 664cd977b210
Step 13/13 : CMD ["--help"]
 ---> Running in 8b258446b0fb
Removing intermediate container 8b258446b0fb
 ---> 0a7a6fcffc48
Successfully built 0a7a6fcffc48
Successfully tagged paperless_webserver:latest
WARNING: Image for service webserver was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
<snip build log>
Successfully built 0a7a6fcffc48
Successfully tagged paperless_consumer:latest
WARNING: Image for service consumer was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
Creating paperless_webserver_1 ... done
Creating paperless_consumer_1  ... done

[root@scans paperless]# docker logs paperless_consumer_1
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/community/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/community/x86_64/APKINDEX.tar.gz
(1/1) Installing tesseract-ocr-data-bul (3.05.01-r2)
OK: 288 MiB in 117 packages
Starting document consumer at /consume
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: no such table: documents_log

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/paperless/src/manage.py", line 18, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/lib/python3.6/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python3.6/site-packages/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python3.6/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python3.6/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/usr/src/paperless/src/documents/management/commands/document_consumer.py", line 52, in handle
    "Starting document consumer at {}".format(settings.CONSUMPTION_DIR)
  File "/usr/lib/python3.6/logging/__init__.py", line 1306, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/lib/python3.6/logging/__init__.py", line 1442, in _log
    self.handle(record)
  File "/usr/lib/python3.6/logging/__init__.py", line 1452, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.6/logging/__init__.py", line 1514, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.6/logging/__init__.py", line 863, in handle
    self.emit(record)
  File "/usr/src/paperless/src/documents/loggers.py", line 23, in emit
    Log.objects.create(**kwargs)
  File "/usr/lib/python3.6/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 394, in create
    obj.save(force_insert=True, using=self.db)
  File "/usr/src/paperless/src/documents/models.py", line 319, in save
    models.Model.save(self, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 808, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 838, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 924, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 963, in _do_insert
    using=using, raw=raw)
  File "/usr/lib/python3.6/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 1076, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1112, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 79, in execute
    return super(CursorDebugWrapper, self).execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/lib/python3.6/site-packages/django/utils/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: no such table: documents_log

[root@scans paperless]# docker logs paperless_webserver_1

[root@scans paperless]# docker-compose run --rm webserver createsuperuser

You have 32 unapplied migration(s). Your project may not work properly until you apply the migrations for app(s): admin, auth, contenttypes, documents, reminders, sessions.
Run 'python manage.py migrate' to apply them.

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: no such table: auth_user

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/paperless/src/manage.py", line 18, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/lib/python3.6/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python3.6/site-packages/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python3.6/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python3.6/site-packages/django/contrib/auth/management/commands/createsuperuser.py", line 63, in execute
    return super(Command, self).execute(*args, **options)
  File "/usr/lib/python3.6/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python3.6/site-packages/django/contrib/auth/management/commands/createsuperuser.py", line 96, in handle
    default_username = get_default_username()
  File "/usr/lib/python3.6/site-packages/django/contrib/auth/management/__init__.py", line 148, in get_default_username
    auth_app.User._default_manager.get(username=default_username)
  File "/usr/lib/python3.6/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 374, in get
    num = len(clone)
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 232, in __len__
    self._fetch_all()
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 1118, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 53, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch)
  File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 899, in execute_sql
    raise original_exception
  File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 889, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 79, in execute
    return super(CursorDebugWrapper, self).execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/lib/python3.6/site-packages/django/utils/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: no such table: auth_user

[root@scans paperless]# docker-compose run --rm webserver migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, documents, reminders, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying documents.0001_initial... OK
  Applying documents.0002_auto_20151226_1316... OK
  Applying documents.0003_sender... OK
  Applying documents.0004_auto_20160114_1844... OK
  Applying documents.0005_auto_20160123_0313... OK
  Applying documents.0006_auto_20160123_0430... OK
  Applying documents.0007_auto_20160126_2114... OK
  Applying documents.0008_document_file_type... OK
  Applying documents.0009_auto_20160214_0040... OK
  Applying documents.0010_log... OK
  Applying documents.0011_auto_20160303_1929... OK
  Applying documents.0012_auto_20160305_0040... OK
  Applying documents.0013_auto_20160325_2111... OK
  Applying documents.0014_document_checksum... OK
  Applying documents.0015_add_insensitive_to_match... OK
  Applying documents.0016_auto_20170325_1558... OK
  Applying documents.0017_auto_20170512_0507... OK
  Applying documents.0018_auto_20170715_1712... OK
  Applying reminders.0001_initial... OK
  Applying sessions.0001_initial... OK

[root@scans paperless]# docker-compose run --rm webserver createsuperuser
<snip user creation>

[root@scans paperless]# docker ps -a
CONTAINER ID        IMAGE                 COMMAND                  CREATED             STATUS                     PORTS                    NAMES
68125e674ab7        paperless_webserver   "/sbin/docker-entryp…"   6 minutes ago       Up 6 minutes               0.0.0.0:8000->8000/tcp   paperless_webserver_1
adaae3156d35        paperless_consumer    "/sbin/docker-entryp…"   6 minutes ago       Exited (1) 6 minutes ago                            paperless_consumer_1

[root@scans paperless]# docker start paperless_consumer_1
paperless_consumer_1

[root@scans paperless]#  git status
# On branch master
nothing to commit, working directory clean

[root@scans paperless]# git pull
Already up-to-date.

[root@scans paperless]# 

Not sure why it happens. I'm on CentOS 7.4 and docker:

Client:
 Version:   17.12.0-ce
 API version:   1.35
 Go version:    go1.9.2
 Git commit:    c97c6d6
 Built: Wed Dec 27 20:10:14 2017
 OS/Arch:   linux/amd64

Server:
 Engine:
  Version:  17.12.0-ce
  API version:  1.35 (minimum version 1.12)
  Go version:   go1.9.2
  Git commit:   c97c6d6
  Built:    Wed Dec 27 20:12:46 2017
  OS/Arch:  linux/amd64
  Experimental: false

danielquinn commented 6 years ago

I just tried this and it's working right for me. The consumer starts as it's supposed to and adding a user runs without issue. Afterward, I checked to make sure everything was still running and it was.

This is very odd. Looking at your output, the migration step runs every time the consumer starts, but it's creating the tables every time. Lines like Applying reminders.0001_initial... OK should only ever appear once, and yet it's in your output twice. I can only conclude that there's a problem somewhere in actually creating your database.

I see in your .yml file, you've got the data directory mounted via /nfs/paperless/data. Can you confirm that there's a file called db.sqlite3 in that directory? If yes, what are its permissions, and can you open it with sqlite3 db.sqlite3? Maybe run .schema whilst in there?

Have you tried running this with data:/usr/src/paperless/data in there instead of /nfs/paperless/data:/usr/src/paperless/data? In this case the db.sqlite3 file should be created in <Paperless project root>/data/.

TeraHz commented 6 years ago

Very odd indeed. When I run docker-compose up for the first time, the db.sqlite3 file is created with the right permissions and ownership but it has only this:

sqlite> .schema
CREATE TABLE "django_migrations" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "app" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "applied" datetime NOT NULL);

And that table is empty.
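
For anyone who wants to script the same check, here is a minimal sketch using Python's standard sqlite3 module. The function name describe_db is made up for illustration and is not part of Paperless; pass it whatever path your data directory is mounted from.

```python
import sqlite3


def describe_db(path):
    """Return (user table names, row count of django_migrations) for a SQLite file.

    Hypothetical helper for illustration; not part of Paperless.
    """
    conn = sqlite3.connect(path)
    try:
        # List user tables, skipping SQLite's internal sqlite_* bookkeeping tables.
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        applied = 0
        if "django_migrations" in tables:
            # Django records one row here per applied migration.
            applied = conn.execute(
                "SELECT COUNT(*) FROM django_migrations"
            ).fetchone()[0]
        return tables, applied
    finally:
        conn.close()
```

Calling describe_db on the db.sqlite3 file in the mounted data directory should return only django_migrations with a count of 0 in the broken state described above, and the full table list with a non-zero count once migrate has run.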

When I run docker-compose run --rm webserver migrate it creates all tables and indexes properly.

I'm also not a Docker expert, but as far as I can tell, the migration in the Dockerfile happens at image build time, and the data path is later overridden by the volume mounted over it when the container is created. I think the migration needs to happen in the entrypoint, not in the Dockerfile. The entrypoint file will need some more logic, since it is shared between the two containers, plus some logic for the consumer container to wait for the webserver container to finish migrations. I think docker-compose file format 2.1 added healthcheck options that might be used for that.
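
A rough sketch of what the waiting part could look like, in Python since that is what Paperless is written in. This is only an illustration under assumptions, not how the entrypoint works today: the helper names table_exists and wait_for_table are made up, and polling for a single table (documents_log, the one the consumer traceback complains about) is a simplification, because one table existing does not prove that every migration has finished.

```python
import os
import sqlite3
import time


def table_exists(db_path, table):
    """Return True if `table` exists in the SQLite file at `db_path`."""
    # Avoid sqlite3.connect() creating an empty database file as a side effect.
    if not os.path.exists(db_path):
        return False
    try:
        conn = sqlite3.connect(db_path)
    except sqlite3.Error:
        return False
    try:
        row = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        return row is not None
    except sqlite3.Error:
        # For example, the file is locked while another container migrates.
        return False
    finally:
        conn.close()


def wait_for_table(db_path, table, timeout=60.0, interval=1.0):
    """Poll until `table` exists or `timeout` seconds pass; return success."""
    deadline = time.monotonic() + timeout
    while True:
        if table_exists(db_path, table):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The consumer could call something like wait_for_table("/usr/src/paperless/data/db.sqlite3", "documents_log") before starting and exit with an error if it returns False. A compose healthcheck on the webserver would be the more declarative way to get the same ordering.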

danielquinn commented 6 years ago

Alright, I think I'm following, but I'm going to pull in @pitkley and @addadi as they've been tinkering with the Docker stuff of late. Perhaps this is a bug that we should address, or just an edge case that needs documenting, but I think they should know what's up.

pitkley commented 6 years ago

Yeah, I think @TeraHz is correct in that the created database gets overridden by the mount.

I'll try to take a look into what a potential solution might look like, but it is probably going to take me until the weekend. Maybe @addadi or somebody else gets to it earlier. ☺️

addadi commented 6 years ago

I'll check that, but unfortunately I won't be able to look at it until the weekend...