ciur / papermerge

Open Source Document Management System for Digital Archives (Scanned Documents)
https://papermerge.com
Apache License 2.0
2.5k stars 264 forks source link

Packaging for Archlinux User Repository #145

Closed amo13 closed 9 months ago

amo13 commented 4 years ago

As a follow-up to #135, I created this new issue in order to discuss a few things concerning packaging.

While building upon the efforts of @Lucki, a few questions came up for me:

  1. What exactly is this admin package thing and how is it going to be solved? It was brought up in #135 too.

  2. Because all the python dependencies are installed with pacman (archlinux system package manager), they are owned by root:root. This forces me to run the database migration (at least the first one after initial installation) as the privileged root user. Otherwise the migration fails with the message PermissionError: [Errno 13] Permission denied: '/usr/lib/python3.8/site-packages/queue'. The latter folder did not exist yet at the time of the initial migration attempt with the papermerge user. Is there anything that can be done to avoid running the database migration as root? Maybe @Lucki is more fond of Archlinux best practices than I am and can give a advise?

  3. I added a post-upgrade hook to the package that is supposed to automatically run the database migration upon upgrade of the papermerge package. With the upgrades from 1.4.3 to 1.4.4 and to 1.4.5 I noticed the message:

    Running migrations:
    No migrations to apply.
    Your models have changes that are not yet reflected in a migration, and so won't be applied.
    Run 'manage.py makemigrations' to make new migrations, and then re-run 'manage.py migrate' to apply them.

    Do we have to intervene here or can this be ignored? I am not quite sure what the model changes mean and if I need to run the suggested commands.

Lucki commented 4 years ago

2. This forces me to run the database migration (at least the first one after initial installation) as the privileged root user.

You want to do that with the papermerge user and in /var/lib/papermerge. Maybe add it as ExecStartPre= in the systemd service. Don't do anything as root or in /usr/… after installation.

Edit: Actually that should already be set in the /etc/papermerge.conf.py… See the prepare()

ciur commented 4 years ago

What exactly is this admin package thing and how is it going to be solved? It was brought up in #135 too.

I keep that ticket open mostly to provide visibility that there is a Package for Archlinux AUR. Given this ticket, I think #135 is good to be closed. What @Lucki thinks about this?

Regarding last point, the one with migrations:

Do we have to intervene here or can this be ignored? I am not quite sure what the model changes mean and if I need to run the suggested commands.

That warning Your models have changes that are not yet reflected in a migration is because 3rd parties (python dependency modules, like dynamic_preferences, mptt etc) added changes to their model internals. In case of those warnings it is good to run makemigrations command before migrate command:

 $ ./manage.py makemigrations
 $ ./manage.py migrate

(actually I know that warning messages says that, but I repeated it here anyway)

so - yes, it is good to intervene in that case :+1:

So, guys @Lucki, @amo13 - shall I close ticket #135 ?

Lucki commented 4 years ago

shall I close ticket #135 ?

You can close it when setup.py includes the needed admin package, haven't seen a commit regarding that yet.

amo13 commented 4 years ago

Thank both of you for the help and advise!

@Lucki :

You want to do that with the papermerge user and in /var/lib/papermerge.

I did try to run it as the papermerge user and that is when the error occured: PermissionError: [Errno 13] Permission denied: '/usr/lib/python3.8/site-packages/queue' Could this have happened only because I did not cd into /var/lib/papermerge before?

Maybe add it as ExecStartPre= in the systemd service.

I tried to use a papermerge.install file (with an install=papermerge.install statement in the PKGBUILD) with /bin/sh -c 'sudo -u papermerge papermerge-manage migrate' in post_install() and post_upgrade(). Do you think the systemd service file would be a better place for that? It would run each time the service starts and not only after package installation or upgrade.

@ciur :

so - yes, it is good to intervene in that case

Since I try to automate things, do you think it is sane to run both makemigrations and migrate after installation and after each upgrade (without knowing beforehand if that message would arise or not)?

amo13 commented 4 years ago

My current PKGBUILD and files: papermerge-pkgbuild.zip

Lucki commented 4 years ago

I don't think the papermerge user is already created when the post_install hook is triggered.

Please make sure your configuration file is correctly picked up. Papermerge tries to access the wrong directories while the correct ones should be already set in /etc/papermerge.conf.py (see prepare())

amo13 commented 4 years ago

I don't think the papermerge user is already created when the post_install hook is triggered.

Damn... you are absolutely right about that, I missed it! So that would be a good reason to run makemigrations, migrate and collectstatic as ExecStartPre= in the systemd service. @ciur just needs to confirm that it would be sane to run these before each start of gunicorn.

Still, when I got that error message, the papermerge user did already exist on my system (already installed and removed the package before).

Please make sure your configuration file is correctly picked up. Papermerge tries to access the wrong directories while the correct ones should be already set in /etc/papermerge.conf.py (see prepare())

I see the prepare() part of the PKGBUILD, but I don't understand what you mean by correctly picking up the config file and which directories papermerge would wrongly try to access...

prepare() {
  {
      echo 'DBDIR = "/var/lib/papermerge/database"';
      echo 'MEDIA_DIR = "/var/lib/papermerge/media"';
      echo 'STATIC_DIR = "/var/lib/papermerge/static"';
      echo 'TASK_QUEUE_DIR = "/var/tmp/papermerge/queue"'
  } >> "$pkgname-$pkgver/$pkgname.conf.py.example"

  # https://papermerge.readthedocs.io/en/latest/setup/server_configurations.html#step-1-install-gunicorn
  {
      echo 'from .base import *  # noqa';
      echo 'DEBUG = False';
      echo "ALLOWED_HOSTS = ['*']"
  } > "$pkgname-$pkgver/config/settings/production.py"

  # The admin package is missing somehow so let's move it plain stupid directly in place
  mkdir -p "$pkgname-$pkgver/build/lib/$pkgname/"
  cp -dpr --no-preserve=ownership "$pkgname-$pkgver/$pkgname/contrib" "$pkgname-$pkgver/build/lib/$pkgname/"
}
Lucki commented 4 years ago

"/var/tmp/papermerge/queue" != '/usr/lib/python3.8/site-packages/queue', your configuration file is either altered or not recognized by papermerge. https://papermerge.readthedocs.io/en/latest/settings.html

amo13 commented 4 years ago

"/var/tmp/papermerge/queue" != '/usr/lib/python3.8/site-packages/queue', your configuration file is either altered or not recognized by papermerge. https://papermerge.readthedocs.io/en/latest/settings.html

Ah, I understand! Thank you, I will try to investigate that!

amo13 commented 4 years ago

It looks like the issue with the altered or unrecognized config file just vanished... Everything looks fine now, except for running makemigrations before the migrate command:

That warning Your models have changes that are not yet reflected in a migration is because 3rd parties (python dependency modules, like dynamic_preferences, mptt etc) added changes to their model internals. In case of those warnings it is good to run makemigrations command before migrate command

@Lucki, maybe you know more about python than I do and can give another advise: Running papermerge-manage makemigrations as the papermerge user fails with the message PermissionError: [Errno 13] Permission denied: '/usr/lib/python3.8/site-packages/papermerge/core/migrations/0020_auto_20201002_1628.py'.

Full stack trace ``` Migrations for 'core': /usr/lib/python3.8/site-packages/papermerge/core/migrations/0020_auto_20201002_1703.py - Alter field first_name on user Traceback (most recent call last): File "/usr/bin/django-admin", line 33, in sys.exit(load_entry_point('Django==3.1.1', 'console_scripts', 'django-admin')()) File "/usr/lib/python3.8/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line utility.execute() File "/usr/lib/python3.8/site-packages/django/core/management/__init__.py", line 395, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "/usr/lib/python3.8/site-packages/django/core/management/base.py", line 330, in run_from_argv self.execute(*args, **cmd_options) File "/usr/lib/python3.8/site-packages/django/core/management/base.py", line 371, in execute output = self.handle(*args, **options) File "/usr/lib/python3.8/site-packages/django/core/management/base.py", line 85, in wrapped res = handle_func(*args, **kwargs) File "/usr/lib/python3.8/site-packages/django/core/management/commands/makemigrations.py", line 182, in handle self.write_migration_files(changes) File "/usr/lib/python3.8/site-packages/django/core/management/commands/makemigrations.py", line 220, in write_migration_files with open(writer.path, "w", encoding='utf-8') as fh: PermissionError: [Errno 13] Permission denied: '/usr/lib/python3.8/site-packages/papermerge/core/migrations/0020_auto_20201002_1703.py' ```

I don't know why django attempts to write there and I am not quite sure how to go about this.

Lucki commented 4 years ago

you know more about python

Sry, I don't know anything meaningful about python, I hate languages that think it's important how you place non-seeable characters. My guess is that's that a path which isn't set in the configuration file. I don't know django either, maybe you can set the path with --migration-path=/path/sdsad or an environment variable DJANGO_MIGRATION_PATH=/path/sdsad. This is more a question for @ciur

ciur commented 4 years ago

Command makemigrations implies the fact that papermerge application must be able to write to 'site-packages' folder.

What Papermerge version you are running ? In what context does makemigrations run ? Is it within upgrade from 1.4.4 -> 1.4.5 ? I am asking as I want to reproduce on my dev environment to understand exactly where that 'new migrations file' is coming from.

amo13 commented 4 years ago

All the python dependencies are installed using the package manager and therefore the site-packages folder is owned by root. On a clean archlinux virtual machine (actually, I use the manjaro live iso for convenience) I install all the dependencies followed by papermerge 1.4.5. No upgrade whatsoever is done, but an initial installation, including the creation of a "papermerge" user. This user owns the directories /var/lib/papermerge/database, /var/lib/papermerge/media, /var/lib/papermerge/static and /var/tmp/papermerge/queue that are all configured to be used in /etc/papermerge.conf.py The goal on my side is to be able to run the migrations with the papermerge user. The error occurs when trying to run makemigrations as the papermerge user (before attempting to migrate, which works just fine as the papermerge user)

ciur commented 4 years ago

@amo13, the problem is that although papermerge user owns /var/lib/papermerge (project dir) - it uses python's system wide site-packages directory - /usr/lib/python3.8/site-packages/ - for which papermerge user has no write access.

Why don't you create a python virtual environment in /var/lib/papermerge/.venv with command like ?:

cd /var/lib/papermerge && virtualenv .venv -p python3.7 

In that case you will:

  1. use python3.7 version for papermerge
  2. all deps will be installed inside /var/lib/papermerge/.venv
  3. papermerge user will have write permissions on /var/lib/papermerge/.venv/lib/python3.7/site-packages which means ./manage.py makemigrations will be able to run.
amo13 commented 4 years ago

Thanks for your reply. I guess using the virtual env goes against the point of packaging because it is a different approach. Packaging and using packages is nice for modular management of software and its dependencies and for easy updates...

Do you know if there is some option/setting/environment variable like --migration-path=/path/sdsad or DJANGO_MIGRATION_PATH=/path/sdsad that could be set somewhere like @Lucki suggested?

ciur commented 4 years ago

@amo13, I am afraid there is no such thing as DJANGO_MIGRAION_PATH, as migrations must live in same dir as app (XYZ/papermerge/core/migrations/). Allow me some time to think over the problem and I will come back again to this issue (hopefully with a fresh look/a solution).

amo13 commented 3 years ago

I just came back to this packaging project for archlinux. My idea was for now to simply set the folder /usr/lib/python3.8/site-packages/papermerge/core/migrations to be recursively owned by the papermerge user, which would allow migrations to run correctly.

So far it seems to work fine, but when attempting to upgrade from 1.4.5 to 1.5.0, I encounter an issue with the migrations:

CommandError: Conflicting migrations detected; multiple leaf nodes in the migration graph: (0025_auto_20201002_1814, 0020_auto_20201029_1129 in core).
CommandError: Conflicting migrations detected; multiple leaf nodes in the migration graph: (0020_auto_20201029_1129, 0025_auto_20201002_1814 in core).
To fix them run 'python manage.py makemigrations --merge

Running the suggested command does fix the problem.

Now, I wonder if (for the sake of having usable and easily upgradable packages) you would deem it safe to just programatically run makemigrations --merge --noinput after each upgrade?

amo13 commented 3 years ago

I finally managed to package papermerge for archlinux, with some help from the arch community. The only thing I hope is not going to make any trouble is doing

makemigrations --merge --noinput
migrate --noinput
collectstatic --noinput

programmatically after installation and every package update. Would a user skipping an update get into trouble because of this?

I will test everything for a while and see if the next 2 or 3 updates install smoothly before submitting it to the arch user repository.

ciur commented 3 years ago

migrate --noinput collectstatic --noinput

migrate command is important for updates. Skipping migrate command means that update is incomplete.

migrate command is responsible for run updates on database schema i.e. table changes; without it update will apply changes only on application level. Let me give you an example to understand better. Imagine that in next version 1.6 document.file_name is changed to document.filename. I will change all code references from file_name to filename. However, to change database column (in a table called core_document) file_name to filename, a new migration will be created. Running that migration during update will assure that database table column core_document.file_name was renamed to core_document.filename and latest code will operate with latest database changes.

collectstatic is responsable for gathering static files (javascript, images, fonts, css) into a single place on file system. Usually it is used on server side deployments when application runs under real webserver (e.g. apache). The impact of not running collectstatic command depends on the server application runs from. How do you start application (i.e. what webserver is used)? with runserver command (built-in web server is used)?

amo13 commented 3 years ago

Thank you for your answer and the explanations above! What I would like to know though is if any harm or problem should be expected by running them unattended and non-interactively (--noinput) by the package manager after every update. Also, would it be a problem to run them multiple times for any reason? Finally, I also have to run makemigrations --merge --noinput as indicated above. I don't really know what this does, but is it ok to run this (first) after every update and with the --merge flag?

ciur commented 3 years ago

What I would like to know though is if any harm or problem should be expected by running them unattended and non-interactively (--noinput) by the package manager after every update.

No, it shouldn't be a problem.

Also, would it be a problem to run them multiple times for any reason?

Not at all. Actually running subsequently migrate or collectstatic will have no effect at all.

Finally, I also have to run makemigrations --merge --noinput as indicated above. I don't really know what this does, but is it ok to run this (first) after every update and with the --merge flag?

It is hard to answer this question, because I don't any experience with makemigrations --merge. Django documentation says:

makemigrations
....
--merge

Enables fixing of migration conflicts.
...

Thus... it will enable fixing of migration conflicts :grin:

amo13 commented 3 years ago

Awesome!

Closing this for now since I think I succeeded in packaging and you answered all my questions. Thank you!

amo13 commented 3 years ago

Just submitted the package to the AUR. Installing papermerge on Arch Linux is now as easy as yay -S papermerge. All dependencies are installed, migrations run, gunicorn is setup to serve papermerge and the suggested systemd services are created. I hope this can be helpful not only for users, but also for the project as a whole!

amo13 commented 3 years ago

I am running into new packaging issues since the release of 2.0.0. When I run ./run_tests.sh, I get the following error:

Traceback (most recent call last):
  File "/home/amo/PKGBUILDS/test-papermerge2/papermerge/./manage.py", line 24, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/lib/python3.9/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python3.9/site-packages/django/core/management/__init__.py", line 377, in execute
    django.setup()
  File "/usr/lib/python3.9/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/usr/lib/python3.9/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/usr/lib/python3.9/site-packages/django/apps/config.py", line 116, in create
    mod = import_module(mod_path)
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'papermerge.test.parts'

Any idea what I can do? Is this just because I do not use a pipenv? But if so, why did it work before 2.0.0?

amo13 commented 3 years ago

Retestet after 2.0.1 release, but the issue persists...

ciur commented 9 months ago

I am not sure what to do with this ticket. It is definitely not me to provide help regarding packaging with AUR. I was just wondering, why would you do that ? Why not just use available docker images?

Anyway, I am closing old/obsolete tickets and this ticket happens to be one of those :)