Open mgibowski opened 7 months ago
It would be nice if people could post time mix test
before and after the steps above. @mgibowski feel free to post yours, I will tweet this PR and get some attention. Feel free to share on places like ElixirForum too.
# Before adding the config to test.exs
mix test 10.26s user 3.88s system 352% cpu 4.007 total
mix test 9.94s user 3.61s system 381% cpu 3.551 total
mix test 10.07s user 3.66s system 382% cpu 3.592 total
# After adding the config to text.exs
mix test 9.99s user 3.82s system 391% cpu 3.524 total
mix test 10.05s user 3.67s system 386% cpu 3.549 total
mix test 9.93s user 3.66s system 384% cpu 3.533 total
mix test 10.01s user 3.65s system 388% cpu 3.519 total
# After reverting back to original
mix test 9.97s user 3.64s system 382% cpu 3.560 total
mix test 9.99s user 3.62s system 374% cpu 3.635 total
mix test 10.14s user 3.74s system 398% cpu 3.485 total
All test runs logged this. Some had 2.5s
instead of 2.4s
.
Finished in 2.4 seconds (2.4s async, 0.00s sync)
582 tests, 0 failures, 1 excluded
My project is maybe too small to properly measure these effects. It has 11 tables, 2 of which are oban_jobs
and oban_peers
.
It would be nice if people could post
time mix test
before and after the steps above. @mgibowski feel free to post yours, I will tweet this PR and get some attention. Feel free to share on places like ElixirForum too.
I tried using unlogged tables some time ago on a project I no longer have access to. It had around 400 tests, and the improvement seemed to be about 10% / 1.5 seconds. I tried this branch yesterday on changelog.org and the hexpm codebases, and I cannot tell there is any improvement. Test execution times are sometimes faster and sometimes slower.
This wont work for those of us who use mix ecto.dump
and mix ecto.load
. I really wish it could. I was able to run it locally from zero (ran migrations from scratch) but some of our tests fail (some tables were created outside of our application's lifecycle) so it's not apples to apples for run time.
Before
Finished in 80.4 seconds (64.5s async, 15.9s sync)
32 doctests, 53 properties, 9189 tests, 0 failures, 3 skipped
Randomized with seed 804265
[os_mon] memory supervisor port (memsup): Erlang has closed
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
________________________________________________________
Executed in 82.29 secs fish external
usr time 478.11 secs 493.00 micros 478.11 secs
sys time 115.22 secs 78.00 micros 115.22 secs
After
Finished in 83.7 seconds (66.8s async, 16.8s sync)
32 doctests, 53 properties, 9189 tests, 32 failures, 3 skipped
Randomized with seed 129402
[os_mon] memory supervisor port (memsup): Erlang has closed
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
________________________________________________________
Executed in 85.57 secs fish external
usr time 500.81 secs 65.29 millis 500.74 secs
sys time 121.15 secs 16.20 millis 121.14 secs
Here are the results on Felt's codebase, running on my M1 MacBook Air. I've included our tests tagged :slow
, which we normally only run in CI, since those are the most database intensive:
Condition | Total runtime (reported by mix ) |
Seconds async | Seconds sync | Real | User | Sys |
---|---|---|---|---|---|---|
Before | 134.1 | 58.8 | 75.3 | 2m16.029s | 2m16.029s | 0m40.376s |
Before | 125.9 | 50.6 | 75.2 | 2m7.883s | 3m9.408s | 0m37.810s |
Before | 144.6 | 54.6 | 89.9 | 2m26.629s | 3m11.503s | 0m38.874s |
After | 128.4 | 51.4 | 77.0 | 2m10.544s | 3m9.499s | 0m36.942s |
After | 125.3 | 51.3 | 73.9 | 2m7.377s | 3m10.357s | 0m39.277s |
After | 124.6 | 49.6 | 74.9 | 2m6.678s | 3m11.164s | 0m37.738s |
Mean total runtime: 134.9 before, 125.9 seconds after, for a speedup in the range of 7%.
If there's interest, I'd be happy to run these overnight to get something closer to a statistically significant sample. (The variance on the "before" number in particular is a little weird to me.) I could also run it on an Intel Mac.
UPDATE: Here are the results of running time
over 10 consecutive iterations, both with and without our :slow
(DB-intensive) tests:
Condition | Real | User | Sys |
---|---|---|---|
Before, excluding :slow |
10m24.750s | 24m55.157s | 5m41.607s |
Before, including :slow |
24m32.712s | 36m15.315s | 7m8.092s |
After, excluding :slow |
10m19.023s | 25m20.022s | 5m41.499s |
After, including :slow |
23m24.205s | 36m9.770s | 6m55.615s |
Thus, there was very little difference on our "fast" tests, and a savings of about 5% on the full suite.
for me, tldr, ~8% faster async runs, ~3% slower sync runs, but overall 2% improvement if you are looking at the clock (real time).
Setup: MIX_ENV=test mix do deps.get, compile, ecto.drop, ecto.create, ecto.load, ecto.migrate
Test: repeat 5 { time mix test }
(in retrospect, specifying the test seed would net a more consistent representation)
In order to do this, since I'm using ecto.load
and squashed migrations in the structure.sql
, I had to replace all its CREATE TABLE
statements with CREATE UNLOGGED TABLE
, otherwise I ran into a postgres error constraints on permanent tables may reference only permanent tables
(unlogged tables are not considered "permanent" tables). This would diverge from the actual structure in deployed environments, but perhaps that would be ok; may make some folks uncomfortable.
As mentioned in some previous comments, getting
** (Postgrex.Error) ERROR 42P16 (invalid_table_definition) constraints on permanent tables may reference only permanent tables
(ecto_sql 3.11.1) lib/ecto/adapters/sql.ex:1057: Ecto.Adapters.SQL.raise_sql_call_error/1
(elixir 1.15.7) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
(ecto_sql 3.11.1) lib/ecto/adapters/sql.ex:1164: Ecto.Adapters.SQL.execute_ddl/4
(ecto_sql 3.11.1) lib/ecto/migration/runner.ex:348: Ecto.Migration.Runner.log_and_execute_ddl/3
(elixir 1.15.7) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
(ecto_sql 3.11.1) lib/ecto/migration/runner.ex:311: Ecto.Migration.Runner.perform_operation/3
(stdlib 5.1.1) timer.erl:270: :timer.tc/2
(ecto_sql 3.11.1) lib/ecto/migration/runner.ex:25: Ecto.Migration.Runner.run/8
I didn’t have the time to run the test multiple times, so I executed it one time before and one time after the change.
I have 7951 tests, many of which are LiveView tests.
From 362 to 353, about a 2.3% improvement in the value informed by ExUnit.
Before:
Finished in 362.2 seconds (356.1s async, 6.1s sync)
________________________________________________________
Executed in 363.80 secs fish external
usr time 505.12 secs 0.19 millis 505.12 secs
sys time 85.99 secs 1.44 millis 85.99 secs
After:
Finished in 353.8 seconds (347.8s async, 6.0s sync)
________________________________________________________
Executed in 355.49 secs fish external
usr time 500.29 secs 0.16 millis 500.29 secs
sys time 86.22 secs 1.26 millis 86.22 secs
This seems useful to me even if it makes tests slower. It is a feature of Postres so they had something in mind when they created it. And other people might discover their own uses outside of that. But like Jose said it's better if it's generalized to a string because all these modifiers exist:
CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE
edit: oh I see you want to make a repo configuration too. then the test performance makes sense
I agree we should support it as an option and as a string. I am not convinced on the repo configuration, given there is no meaningful gain, but perhaps we could introduce an option called :migration_table_options
, which is more generic.
If I understand correctly, this would mean adding a new option to Ecto.Migration.table/2
, right?
In that case, how should it be called?
For the reference - in Postgres docs it appears as Parameters, however Mysql and SQLlite don't give any name to that particular part of the create statement, they just call those options "keywords". Example from Sqllite:
If the "TEMP" or "TEMPORARY" keyword occurs between the "CREATE" and "TABLE" then the new table is created in the temp database.
The current options in Ecto.Migration.table/2
are:
* `:primary_key` - when `false`, a primary key field is not generated on table
creation. Alternatively, a keyword list in the same style of the
`:migration_primary_key` repository configuration can be supplied
to control the generation of the primary key field. The keyword list
must include `:name` and `:type`. See `add/3` for further options.
* `:engine` - customizes the table storage for supported databases. For MySQL,
the default is InnoDB.
* `:prefix` - the prefix for the table. This prefix will automatically be used
for all constraints and references defined for this table unless explicitly
overridden in said constraints/references.
* `:comment` - adds a comment to the table.
* `:options` - provide custom options that will be appended after the generated
statement. For example, "WITH", "INHERITS", or "ON COMMIT" clauses. "PARTITION BY"
can be provided for databases that support table partitioning.
We already have :options
... How about calling it :parameters
like in the Postgres docs?
* `:parameters` - provide custom options that will be appended after the "CREATE" and before the "TABLE"
statement. For example, "TEMPORARY", "UNLOGGED", "GLOBAL", or "LOCAL".
What do you think?
:parameter
is fine by me. :) Thanks for investigating!
I'm curious to hear from folks about the impact and whether it's worth adding such a feature at all.
If you want to try it on your codebase, follow these steps:
1) Update your dependencies to:
2) In the
test.exs
file, add the following configuration:3) And reset your database in the test environment: