Closed agateblue closed 4 years ago
Hm, you're right, if we expect people using Django to use procrastinate, we'll need to ship a way to transform PG migrations SQL files into Django migrations. Your approach seems nice, but I'd tend to try and generate one Django migration per SQL migration file (whereas if I understand your code, you have all SQL migrations in a single Django migration). But that's a detail, and doesn't help your problem.
Can I ask why you used a RunPython and not a RunSQL ?
I can try to have a look, but I don't have a clue at this point why this fails. Or you can have a look on your side. Or we can do this in pair programming if you want, like in the good ol' days :)
I'm perfectly fine adding a Django submodule for simplifying the usage of procrastinate with django, if it's deemed preferable to a django-procrastinate dedicated package.
Hm, you're right, if we expect people using Django to use procrastinate, we'll need to ship a way to transform PG migrations SQL files into Django migrations. Your approach seems nice, but I'd tend to try and generate one Django migration per SQL migration file (whereas if I understand your code, you have all SQL migrations in a single Django migration). But that's a detail, and doesn't help your problem.
I'm not sure about the "1 procrastinate migration <-> 1 django migration" approach. As far as I can tell, it doesn't bring any additional value and make integration with the app way harder (a dozen files to add, instead of only one).
Can I ask why you used a RunPython and not a RunSQL ?
I can try that, I wanted to avoid file I/O at import time, which RunSQL would require.
IMHO, the easiest option would be as you suggested, have a contrib.django
package which would be a django app, and contain the proper migrations. Integrating procrastinate
in a django project would then only require INSTALLED_APPS = [..., "procrastinate.contrib.django", ...]
and python manage.py migrate
.
However, it would increase the maintenance burden on procrastinate side, possibly by having duplicated SQL code (though it could probably be automated).
If I use RunSQL
, I end up with the same error. This is the code I used:
# Generated by Django 3.0.8 on 2020-07-18 08:39
import os
from django.db import migrations
def get_sql(name):
import procrastinate
sql_path = os.path.join(
os.path.dirname(procrastinate.__file__),
'sql',
'migrations',
)
full_path = os.path.join(sql_path, name)
with open(full_path, 'rb') as f:
return f.read().decode()
class Migration(migrations.Migration):
dependencies = [
('common', '0008_auto_20200701_1317'),
]
operations = [
migrations.RunSQL(get_sql("baseline-0.5.0.sql")),
migrations.RunSQL(get_sql("delta_0.5.0_001_drop_started_at_column.sql")),
migrations.RunSQL(get_sql("delta_0.5.0_002_drop_started_at_column.sql")),
migrations.RunSQL(get_sql("delta_0.5.0_003_drop_procrastinate_version_table.sql")),
migrations.RunSQL(get_sql("delta_0.6.0_001_fix_procrastinate_fetch_job.sql")),
migrations.RunSQL(get_sql("delta_0.7.1_001_fix_trigger_status_events_insert.sql")),
migrations.RunSQL(get_sql("delta_0.8.1_001_add_queueing_lock_column.sql")),
migrations.RunSQL(get_sql("delta_0.10.0_001_close_fetch_job_race_condition.sql")),
migrations.RunSQL(get_sql("delta_0.10.0_002_add_defer_job_function.sql")),
migrations.RunSQL(get_sql("delta_0.11.0_003_add_procrastinate_periodic_defers.sql")),
migrations.RunSQL(get_sql("delta_0.12.0_001_add_foreign_key_index.sql")),
]
Maybe it's related to my postgresql installation, though it's a pretty basic postgres:11
docker installation. I'll try piping the SQL files directly to postres.
So, piping the migrations directly to postgresql raise the same bug:
docker-compose -f dev.yml exec -T postgres psql -U postgres -P < ../../procrastinate/procrastinate/sql/migrations/delta_0.11.0_003_add_procrastinate_periodic_defers.sql
CREATE TABLE
ERROR: syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_job;
^
CREATE FUNCTION
ERROR: syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_periodic_job;
^
CREATE FUNCTION
The error is still there, but because I piped a whole file, it doesn't crash immediatly.
As far as I know, it's because the DROP FUNCTION
statement requires you to include the arguments of the function, like this: DROP FUNCTION sqrt(integer);
It's probably failing in other setups but silently ignored.
Hm, nice find, thanks !
Hm, the doc says:
If the function name is unique in its schema, it can be referred to without an argument list:
DROP FUNCTION update_employee_salaries;
So, I still don't understand how this is a syntax error.
On my machine:
$ psql
psql (12.3, server 10.10 (Debian 10.10-1.pgdg90+1))
Type "help" for help.
procrastinate=# DROP FUNCTION IF EXISTS procrastinate_defer_job;
DROP FUNCTION
procrastinate=# DROP FUNCTION IF EXISTS procrastinate_defer_job;
NOTICE: function procrastinate_defer_job() does not exist, skipping
DROP FUNCTION
procrastinate=# DROP FUNCTION IF EXISTS procrastinate_defer_periodic_job;
DROP FUNCTION
I thought it may be related to an inconsistent state in my setup, so I'v tried with a clean postgres 9.6 container. Launch a container:
docker run -e POSTGRES_HOST_AUTH_METHOD=trust --name test-procrastinate postgres:9.6 postgres -c log_min_duration_statement=0
Apply the migrations:
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/baseline-0.5.0.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.5.0_001_drop_started_at_column.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.5.0_002_drop_started_at_column.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.5.0_003_drop_procrastinate_version_table.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.6.0_001_fix_procrastinate_fetch_job.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.7.1_001_fix_trigger_status_events_insert.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.8.1_001_add_queueing_lock_column.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.10.0_001_close_fetch_job_race_condition.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.10.0_002_add_defer_job_function.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.11.0_003_add_procrastinate_periodic_defers.sql
docker exec -i test-procrastinate psql -Upostgres < ../../procrastinate/procrastinate/sql/migrations/delta_0.12.0_001_add_foreign_key_index.sql
Which gives the following output:
CREATE FUNCTION
ALTER TABLE
DROP TABLE
CREATE FUNCTION
DROP TRIGGER
CREATE TRIGGER
ALTER TABLE
CREATE INDEX
DROP TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE TABLE
ERROR: syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_job;
^
CREATE FUNCTION
ERROR: syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_periodic_job;
^
CREATE FUNCTION
CREATE INDEX
Now, trying with a postgres 10 container:
NOTICE: extension "plpgsql" already exists, skipping
CREATE EXTENSION
CREATE TABLE
CREATE TYPE
CREATE TYPE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE INDEX
CREATE TRIGGER
CREATE TRIGGER
CREATE TRIGGER
CREATE TRIGGER
CREATE INDEX
CREATE FUNCTION
ALTER TABLE
DROP TABLE
CREATE FUNCTION
DROP TRIGGER
CREATE TRIGGER
ALTER TABLE
CREATE INDEX
DROP TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE TABLE
DROP FUNCTION
CREATE FUNCTION
DROP FUNCTION
NOTICE: function procrastinate_defer_periodic_job() does not exist, skipping
CREATE FUNCTION
CREATE INDEX
So we can reasonnably say this is related to my Postgres version, and the behaviour change starting in Postgres 10.
Now, back to the initial discussion, regarding the opportunity to package procrastinate as a django app, would you accept a contribution based on https://github.com/peopledoc/procrastinate/issues/280#issuecomment-660498268?
would you accept a contribution based on #280 (comment)?
Yes! Definitely! As I said:
I'm perfectly fine adding a Django submodule for simplifying the usage of procrastinate with django, if it's deemed preferable to a django-procrastinate dedicated package.
Also:
Your approach seems nice, but I'd tend to try and generate one Django migration per SQL migration file (whereas if I understand your code, you have all SQL migrations in a single Django migration).
And I stand by this if it's not too hard to do. This will make it much easier when we issue releases in the future: people will update procrastinate, then manage.py migrate
and it should "just work".
Oh, also, we might try to explicit supported PG versions, and test them in the CI. In theory, this would be a good thing. In practice, I'm afraid testing will make the CI much much longer, and much more complicated. I'll create a ticket. (EDIT: https://github.com/peopledoc/procrastinate/issues/282)
If it's not possible to do that cleanly within Django (and reading the code, I'm doubting it is), it could also be acceptable to generate migration files in setup.py (so they'd end up in the release) and have them be gitignored.
I'd probably hardcode the migrations, because generating migrations on the fly during release comes with two potential pitfalls:
Would it be acceptable to add a manual step when a new migration is created, basically copy-pasting 10 lines of code in the django app as a new migration?
I understand those points. I'll just note:
It's really hard to test that the migrations are actually working if files aren't committed to the repo
Hm, not sure about that. If migrations are generated every time we launch setup.py
, then they will be OK in the CI, and then can be regenerated in the dev env by re-launching pip install -e
, same as when a metadata value changes.
If the migration-generating code changes slightly, we could face hard-to-debug inconsistencies
With the current implementation, it's the same: if the SQL migration code changes, then the django migration will change. And I can't really see what we may change in the Django migration part if the only thing it does is a RunSQL. But your approach is leaning on the safe side, and I can't blame that.
That being said, I'm definitely not against your proposed changes. I'd rather we solve problems and iterate. We'll see how it goes.
Hm, not sure about that. If migrations are generated every time we launch setup.py, then they will be OK in the CI, and then can be regenerated in the dev env by re-launching pip install -e, same as when a metadata value changes.
Oh, I didn't thought about that. I'm not sure how to trigger a command when pip install -e .
is called though, but I can definitely write something to generate the migration files once we figure that out!
@ewjoachim I've made an attempt at generating migrations on the fly (see my last commit) when running pip install -e .
. I had to tweak the order, since the default order of SQL files isn't correct, and it work. However, it doesn't work with pip install .
(without the -e
flag), I need to find out why
Continuing the discussion in the PR !
This isn't a bug or feature request but rather a discussion to understand the best way to integrate
procrastinate
within a Django project, in particular, at the SQL level, as Django as its own migration system.My first attempt was to use raw SQL and execute each
sql/migrations/delta*
delta file, like this:However, this crashes, with the following output:
Execution of
delta_0.11.0_003_add_procrastinate_periodic_defers.sql
fails with a syntax error, which I don't understand unfortunately. If you have some ideas, I can definitely try these, and possibly document Django integration when I have a working setup :)