Closed dhedlund closed 10 months ago
Looks like someone added a note about this to the ecto_sql docs for Ecto.Migration.modify/3:
If you want to modify a column without changing its type, such as adding or dropping a null constraints, consider using the execute/2 command with the relevant SQL command instead of modify/3, if supported by your database. This may avoid redundant type updates and be more efficient, as an unnecessary type update can lock the table, even if the type actually doesn't change.
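A minimal sketch of what that suggestion could look like in a migration, assuming PostgreSQL. The module name is hypothetical, and the table "foo" and column "a" are borrowed from the reproduction steps in this issue. It passes both an up and a down statement to execute/2 so the migration stays reversible:

```elixir
defmodule MyApp.Repo.Migrations.SetFooANotNull do
  use Ecto.Migration

  def change do
    # execute/2 takes the SQL to run on migrate and the SQL to run on rollback.
    # No ALTER COLUMN ... TYPE clause is emitted, unlike modify/3.
    execute(
      "ALTER TABLE foo ALTER COLUMN a SET NOT NULL",
      "ALTER TABLE foo ALTER COLUMN a DROP NOT NULL"
    )
  end
end
```

On its own, SET NOT NULL still scans the table unless a validated CHECK constraint already proves the column has no NULLs, so this is only the cheap path when combined with the rest of the recipe.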
The following Google Group thread from 2020 has more info about why the ecto_sql team hasn't added support for modify/2:
https://groups.google.com/g/elixir-ecto/c/rBwdS0bXl4U/m/r2Ohs5eEBgAJ
For the recipe to set NOT NULL on an existing column, I think there's a bug with one of the statements that will cause locking for extended periods of time in PG12. I have not tried to reproduce it on any newer versions of postgres.
When it gets to the following step in the docs:
Ecto generates the following SQL under the hood:
Even though the type is the same as the existing column type, which should in itself be a no-op or metadata-only change, I believe it stops postgres from optimizing the ALTER: it sees that additional work might be needed at the same time and decides not to take the constraint-check shortcut, thus triggering the table scan again while the table is locked. At least that was my experience on PG 12.14.
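For comparison, a sketch of the same step written with hand-crafted SQL so that no type clause is included. It assumes a validated CHECK (a IS NOT NULL) constraint already exists on "foo" from the earlier steps of the recipe; the constraint name a_not_null and the module name are hypothetical:

```elixir
defmodule MyApp.Repo.Migrations.FooASetNotNull do
  use Ecto.Migration

  def up do
    # Assumes a validated CHECK (a IS NOT NULL) constraint named "a_not_null"
    # (hypothetical name) is already in place, so PG 12+ can skip the table scan.
    execute("ALTER TABLE foo ALTER COLUMN a SET NOT NULL")

    # The CHECK constraint is redundant once the column itself is NOT NULL.
    drop constraint("foo", :a_not_null)
  end

  def down do
    # Recreate the constraint without validating existing rows, then relax the column.
    create constraint("foo", :a_not_null, check: "a IS NOT NULL", validate: false)
    execute("ALTER TABLE foo ALTER COLUMN a DROP NOT NULL")
  end
end
```

Explicit up/down functions are used here instead of change because the rollback has to recreate the constraint before dropping the column's NOT NULL.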
Until ecto is smart enough to know that the column type is the same and strips out the ALTER COLUMN ... TYPE ... clause, it might be dangerous to run an ecto-generated ALTER TABLE to perform this step vs. handcrafted SQL. Even then, you'd have to be on a certain version of ecto.
Steps to Reproduce
Assuming a table called "foo" with a lot of data and a column called "a" of type text with a UNIQUE constraint:
(I'm using 50 million rows, but locally on a fast NVMe drive w/ enough RAM to fully cache the data set, so single-digit second responses will be larger on an active production environment.)
I'm running PG 12.2 for these timings, but we also experienced the issue on PG 12.14.
Creating and validating the constraint. Note the 3.3 seconds to do a full table scan:
Converting the column to non-null using SQL produced by mix ecto.migrate --log-migrations-sql. Note the inclusion of ALTER COLUMN "a" TYPE text and the time it takes to run being similar to the full table scan:
Setting the NOT NULL without the type. Note response times in the single-digit milliseconds:
Example Table Setup
The following table setup was used to reproduce the issue: