elixir-ecto / ecto_sql

SQL-based adapters for Ecto and database migrations
https://hexdocs.pm/ecto_sql
Apache License 2.0
571 stars 312 forks source link

Migration `execute` behavior possibly undocumented #588

Closed spiffytech closed 9 months ago

spiffytech commented 9 months ago

Elixir version

1.15.7

Database and Version

SQLite

Ecto Versions

3.11.1

Database Adapter and Versions (postgrex, myxql, etc)

ecto_sqlite3 0.15.0

Current behavior

In migrations, I see calls to execute run after calls to query!, regardless of the order of the two appear in the migration file.

Here's a simple migration file that reproduces the issue:

defmodule EctoTest.Repo.Migrations.TestQueryOrder do
  use Ecto.Migration

  def change do
    execute "select 1"
    Ecto.Adapters.SQL.query!(EctoTest.Repo, "select 2", [])
  end

  def down do
  end
end

Output:

19:38:03.740 [info] == Running 20240118185606 EctoTest.Repo.Migrations.TestQueryOrder.change/0 forward

19:38:03.742 [debug] QUERY OK db=0.3ms
select 2 []

19:38:03.743 [info] execute "select 1"

19:38:03.744 [info] == Migrated 20240118185606 in 0.0s

You can see that select 2 gets run before of select 1. This is unexpected, and dangerous if one isn't expecting it. Adding flush() results in the expected execution order.

I don't know whether the behavior is intentional.

Seems like it's at least a gap in the docs; I didn't find anything to help me understand the execution order.

Expected behavior

I would expect all queries in a migrations file to be executed in order, line-by-line.

Otherwise, I would expect a prominent warning in the docs (especially near execute/1) describing the actual behavior.

If the behavior is intentional, and if someone can help me understand it, I can take a crack at submitting a docs PR.

greg-rychlewski commented 9 months ago

Hi,

I believe the documentation in your link explains this exact use case and provides an example. Could you maybe explain a bit what you feel is missing?

Supplying an anonymous function does allow for arbitrary code to execute as part of the migration. This is most often used in combination with repo/0 by library authors who want to create high-level migration helpers.

execute(fn -> repo().query!("SELECT $1::integer + $2", [40, 2], [log: :info]) end)

greg-rychlewski commented 9 months ago

There is also this section of the docs: https://hexdocs.pm/ecto_sql/Ecto.Migration.html#module-executing-and-flushing

Instructions inside of migrations are not executed immediately. Instead they are performed after the relevant up, change, or down callback terminates.

However, in some situations you may want to guarantee that all of the previous steps have been executed before continuing. This is useful when you need to apply a set of changes to the table before continuing with the migration. This can be done with flush/0:

However flush/0 will raise if it would be called from change function when doing a rollback. To avoid that we recommend to use execute/2 with anonymous functions instead. For more information and example usage please take a look at execute/2 function.

So basically anything that is a migration command will be executed after your function terminates. But all the other code is called normally. Since you didn't wrap your query in execute it was ran immediately.

If you feel the combination of this explanation plus the explanations in the execute documentation you linked aren't sufficient, please let us know where you feel they are lacking. Thank you.

spiffytech commented 9 months ago

I think my point of confusion is the docs don't spell out what "not executed immediately" is applied to. I interpreted it as "ANY SQL stuff inside the migration waits to execute until up/down/change return". It's not apparent to me that some SQL things wait, while others run immediately.

Perhaps I missed it in the docs, but I don't see anything suggesting execute and repo().query would be scheduled differently.

I take it this is expected? I'd be happy to PR a docs update. Is repo().query the only thing that's executed immediately?

greg-rychlewski commented 9 months ago

Oh I see your point of confusion. So it is not about SQL or not SQL, it is about whether you are using the migration functions or not. The ones here: https://hexdocs.pm/ecto_sql/Ecto.Migration.html#functions.

You can think of it this way. Those functions don't run the SQL immediately. They queue them up to be run whenever flush is called. And after up/down/change terminate, Ecto calls flush immediately.

So any function that is not a migration function is executed immediately. Whether it is dealing with Repos, SQL or anything else.

greg-rychlewski commented 9 months ago

If you think there is a way to clarify that in the docs, a PR is welcome. However we don't want to get too in the weeds. So keeping it sufficiently high level while improving understanding of how to use the functionality is important.

spiffytech commented 9 months ago

I've submitted PR #591 if you want to have a look.