elixir-ecto / ecto_sql

SQL-based adapters for Ecto and database migrations
https://hexdocs.pm/ecto_sql
Apache License 2.0
578 stars 312 forks source link

Migration insert in schema_migration failed: wrong_type #638

Closed jzaehrin closed 1 month ago

jzaehrin commented 1 month ago

Elixir version

Elixir 1.15.4 (compiled with Erlang/OTP 26)

Database and Version

SQLITE 3.46.1 from exqlite changelog

Ecto Versions

ecto 3.12.4

Database Adapter and Versions (postgrex, myxql, etc)

exqlite 0.24.2

Current behavior

Unable to migrate under Elixir 1.15.4, I got the following error

08:15:12.801 [info] == Migrated 20221005124637 in 0.0s
** (Protocol.UndefinedError) protocol String.Chars not implemented for {:wrong_type, 20221005124637} of type Tuple. This protocol is implemented for the following type(s): Atom, BitString, Date, DateTime, Decimal, EasyVault.NiX1Client.Query, Exqlite.Query, Float, Hex.Solver.Assignment, Hex.Solver.Constraints.Empty, Hex.Solver.Constraints.Range, Hex.Solver.Constraints.Union, Hex.Solver.Incompatibility, Hex.Solver.PackageRange, Hex.Solver.Term, Integer, List, NaiveDateTime, Phoenix.LiveComponent.CID, Que.Job, Time, URI, Version, Version.Requirement
    (elixir 1.15.4) lib/string/chars.ex:3: String.Chars.impl_for!/1
    (elixir 1.15.4) lib/string/chars.ex:22: String.Chars.to_string/1
    (exqlite 0.24.2) lib/exqlite/connection.ex:660: Exqlite.Connection.bind_params/3
    (exqlite 0.24.2) lib/exqlite/connection.ex:619: Exqlite.Connection.execute/4
    (db_connection 2.7.0) lib/db_connection/holder.ex:354: DBConnection.Holder.holder_apply/4
    (db_connection 2.7.0) lib/db_connection.ex:1558: DBConnection.run_execute/5
    (db_connection 2.7.0) lib/db_connection.ex:829: DBConnection.execute/4
    (ecto_sqlite3 0.17.2) lib/ecto/adapters/sqlite3/connection.ex:90: Ecto.Adapters.SQLite3.Connection.query/4

The error seems to occur when inserting the version into the schema_migrations table. the version field appears to be refused for wrong type with the value 20221005124637 in bind_params. If I change the migration identifier to 20221005, the insert works. When I try to manually insert the row in schema_migrations, ecto.migrations shows up 304093936 ** FILE NOT FOUND ** instead of 20221010123504 (the next migration).

sqlite> select * from schema_migrations;
20221005|2024-10-07T06:42:59
20221010123504|2024-10-07T06:42:59

The SQLITE version of the system is a bit old (v3.40.1) on ARM Cortex-A72, I don't know what effect it has in this case.

Expected behavior

The migration should be correctly insert inside schema_migrations.

josevalim commented 1 month ago

/cc @warmwaffles

Btw, @warmwaffles, is it ok if we invite you to the Ecto organization so it is easier to tag/ping you in exqlite issues? There are no expectations or responsibilities.

warmwaffles commented 1 month ago

@josevalim I'm okay with that.

There are no expectations or responsibilities.

My expectation is that the library I built works πŸ˜› and don't like it when it doesn't.

warmwaffles commented 1 month ago

Ooof.

@jzaehrin do you have strict tables turned on?

Can you reproduce this in a bare project? I am assuming this is coming from a private repo.

jzaehrin commented 1 month ago

Hello @warmwaffles ,

Yes, it is from a private repo. I will try create a bare project as soon as possible.

No the strict tables is not turned on:

sqlite> pragma table_list;
schema|name|type|ncol|wr|strict
main|schema_migrations|table|2|0|0

When I am trying the same code on my personal machine(x86) with Elixir 1.17.2 (compiled with Erlang/OTP 26), I cannot reproduce the issue.

warmwaffles commented 1 month ago

Leaving this here for later, but I think :wong_type is being emitted from here:

https://github.com/elixir-sqlite/exqlite/blob/173b494630fafb8ad3e1409b8fbc2d3d35b2612e/c_src/sqlite3_nif.c#L505-L514

This is called when the parameters are binded to the SQL statement.

jzaehrin commented 1 month ago

I simply create a Phoenix project from scratch with mix (mix phx.new --database sqlite3 sqlite_issue). I add the following migration:

defmodule SqliteIssue.Repo.Migrations.CreateUserTable do
  use Ecto.Migration

  def change do
    create table(:users) do
      add(:username, :string, null: false, collate: :nocase)
      add(:firstname, :string, null: true, collate: :nocase)
      add(:lastname, :string, null: true, collate: :nocase)
      add(:hashed_password, :string, null: false)
      timestamps()
    end

    create(unique_index(:users, [:username]))
  end
end

And I got the same issue:

12:42:21.007 [error] Exqlite.Connection (#PID<0.199.0>) failed to connect: ** (Exqlite.Error) database is locked

12:42:21.122 [info] == Running 20241007140221 SqliteIssue.Repo.Migrations.CreateUserTable.change/0 forward

12:42:21.123 [info] create table users

12:42:21.124 [info] create index users_username_index

12:42:21.127 [info] == Migrated 20241007140221 in 0.0s
** (Protocol.UndefinedError) protocol String.Chars not implemented for {:wrong_type, 20241007140221} of type Tuple. This protocol is implemented for the following type(s): Atom, BitString, Date, DateTime, Decimal, Exqlite.Query, Float, Integer, List, NaiveDateTime, Phoenix.LiveComponent.CID, Time, URI, Version, Version.Requirement
    (elixir 1.15.4) lib/string/chars.ex:3: String.Chars.impl_for!/1
    (elixir 1.15.4) lib/string/chars.ex:22: String.Chars.to_string/1
    (exqlite 0.24.2) lib/exqlite/connection.ex:660: Exqlite.Connection.bind_params/3
    (exqlite 0.24.2) lib/exqlite/connection.ex:619: Exqlite.Connection.execute/4
    (db_connection 2.7.0) lib/db_connection/holder.ex:354: DBConnection.Holder.holder_apply/4
    (db_connection 2.7.0) lib/db_connection.ex:1558: DBConnection.run_execute/5
    (db_connection 2.7.0) lib/db_connection.ex:829: DBConnection.execute/4
    (ecto_sqlite3 0.17.2) lib/ecto/adapters/sqlite3/connection.ex:90: Ecto.Adapters.SQLite3.Connection.query/4

The error seems to me to have a lot to do with the context under ARM.

warmwaffles commented 1 month ago

I believe this is something to do with the fact that 20241007140221 can not fit into a standard 32 bit integer enif_get_int. I have a solution that will make the enif_get_int64 get called before the enif_get_int, but almost all integers will be parsed into the 64 bit integer.

What makes this more difficult is that I do not have easy access to an ARM based machine. I'm working day to day on an x86-64 machine.

If you could change the dependency in your project to

{:exqlite, git: "https://github.com/elixir-sqlite/exqlite.git", branch: "attempt-to-fix-arm", override: true}

You will need to make sure you have a C compiler installed and it should compile.

The main issue is that the variable fails to bind to the query, and {:error, {:wrong_type, 20241007140221}} is returned which DBConnection does not know what to do with. @josevalim in a situation like this, would it be better to raise an exception that value provided could not be bound? I can make a change to call enif_raise_exception from the nif, but I haven't done that before. Seems pretty straight forward.

josevalim commented 1 month ago

Yes, exactly. I believe the other adapters also raise when given an invalid value for a type.

jzaehrin commented 1 month ago

The fix totally works as expected !

user@pi:~/git/sqlite_issue $ mix ecto.migrate
05:17:19.312 [info] == Running 20241007140221 SqliteIssue.Repo.Migrations.CreateUserTable.change/0 forward

05:17:19.317 [info] create table users

05:17:19.319 [info] create index users_username_index

05:17:19.323 [info] == Migrated 20241007140221 in 0.0s
user@pi:~/git/sqlite_issue $ sqlite3 sqlite_issue_dev.db

The result

SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> select * from schema_migrations;
20241007140221|2024-10-08T03:17:19
sqlite>

Thank you for your quick work!

warmwaffles commented 1 month ago

Okay that's great news, also disappointing. I was hoping to check the smaller integer and then if necessary use the larger integer.

I'll also change the interface a little bit to raise an ArgumentError when the type is wrong.

warmwaffles commented 1 month ago

Slight change, but I've got a PR up here https://github.com/elixir-sqlite/exqlite/pull/297

jzaehrin commented 1 month ago

Thank you! Should this issue be closed or wait for the next release of exqlite ?

warmwaffles commented 1 month ago

I'm cutting a release now πŸ˜‰

warmwaffles commented 1 month ago

@jzaehrin published under v0.25.0 if you encounter any issues with it, please open an issue in https://github.com/elixir-sqlite/exqlite

jzaehrin commented 1 month ago

I just tried with exqlite v0.25.0 and the behavior changed from previously, an exception is throw for the binding. Is it expected?

warmwaffles commented 1 month ago

Yes. What is the exception being raised.

jzaehrin commented 1 month ago

** (Exqlite.BindError) Failed to bind argument 20241007140221

So, somebody try to bind a int32 instead of an int64 like in x86?

warmwaffles commented 1 month ago

Worse, that is the last fall through case. It was not able to be parsed as an int nor an int64.

Can you do me a favor and try this against OTP27?

jzaehrin commented 1 month ago

Raspbian do not provide experimental packages with Erlang OTP 27...

I will try tomorrow to get packages from Debian and I come back.

ruslandoga commented 1 month ago

πŸ‘‹ @jzaehrin

Sorry to butt in, but I wonder if you are on 32-bit system by any chance? I remember Raspberry ships with 32-bit systems by default with ability to upgrade to 64-bit in. In which case it might be interesting to try using the 64-bit version.

When I try to manually insert the row in schema_migrations, ecto.migrations shows up 304093936 FILE NOT FOUND instead of 20221010123504 (the next migration).

iex> Bitwise.band(20221010123504, 0xFFFFFFFF)
304093936

It seems like the SQLite integers get truncated when they reach Exqlite. Note that this part is not a bind, but a "read" where Exqlite always uses 64 bit ints.

FWIW I wasn't able to reproduce the problem on a arm64 macbook.

jzaehrin commented 1 month ago

πŸ‘‹πŸ» @ruslandoga

I’m not totally sure how to detect it. But the kernel answer the architecture as aarch64 using uname -a

ruslandoga commented 1 month ago

Hm.

Would you be able to execute these steps in the app having these problems?

iex> {:ok, db} = Exqlite.Sqlite3.open(":memory:", [:readonly])
{:ok, #Reference<0.3459866686.391249946.208979>}
iex> {:ok, stmt} = Exqlite.Sqlite3.prepare(db, "select 20221010123504")
{:ok, #Reference<0.3459866686.391249946.208982>}
iex> Exqlite.Sqlite3.step(db, stmt)
{:row, [20221010123504]}

Also does Exqlite NIF come pre-compiled into your app? It would probably be visible from the app build logs.

jzaehrin commented 1 month ago

Exqlite comes pre-compiled. I got a warning that may be important.

user@pi:~/git/sqlite_issue $ mix compile
warning: key "exqlite-nif-2.16-aarch64-linux-gnu-0.25.0.tar.gz" will be overridden in map
  nofile:1

warning: key "exqlite-nif-2.16-riscv64-linux-gnu-0.25.0.tar.gz" will be overridden in map
  nofile:1

warning: key "exqlite-nif-2.16-x86_64-linux-gnu-0.25.0.tar.gz" will be overridden in map
  nofile:1

warning: key "exqlite-nif-2.17-aarch64-linux-gnu-0.25.0.tar.gz" will be overridden in map
  nofile:1

warning: key "exqlite-nif-2.17-riscv64-linux-gnu-0.25.0.tar.gz" will be overridden in map
  nofile:1

warning: key "exqlite-nif-2.17-x86_64-linux-gnu-0.25.0.tar.gz" will be overridden in map
  nofile:1

==> exqlite
Compiling 14 files (.ex)
Generated exqlite app

I think I got the issue here, the Erlang VM looks to be compiled in 32bit and not in 64bit...

user@pi:~/git/sqlite_issue $ iex -S mix run --no-start
Erlang/OTP 26 [erts-14.0.2] [source] [32-bit] [smp:4:4] [ds:4:4:10] [async-threads:1]

Interactive Elixir (1.15.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> import Exqlite
Exqlite
iex(2)> {:ok, db} = Exqlite.Sqlite3.open(":memory:", [:readonly])
{:ok, #Reference<0.113271238.485359632.158582>}
iex(3)> {:ok, stmt} = Exqlite.Sqlite3.prepare(db, "select 20221010123504")
{:ok, #Reference<0.113271238.485359627.158098>}
iex(4)> Exqlite.Sqlite3.step(db, stmt)
{:row, [304093936]}

You were totally right, the system is indeed 32bit even though the architecture is aarch64. I was able to determine that the kernel is 32bit...

user@pi:~/git/sqlite_issue $ getconf LONG_BIT
32

I apologise for the problem with my context.

ruslandoga commented 1 month ago

I wonder if the error persist if you force compilation on your system? If it doesn't, then the problem is probably somewhere in the pre-compilation process for your architecture.

I think I got the issue here, the Erlang VM looks to be compiled in 32bit and not in 64bit...

That could be it indeed. Erlang defines nif types based on SIZEOF_LONG and co., and those types mean different sizes on different architectures. So the issue here is probably that the precompiled binary assumes SIZEOF_LONG is 8 bytes and defines ErlNifSInt64 as a long but on the actual VM a long gets treated as Int32 and ErlNifSInt64 should've been a long long.

Maybe there is an issue in choosing the pre-compiled binary to download. I'm not seeing it in the logs, but I wonder if there is a way to find out which pre-compiled binary was downloaded, maybe by lowering the log level to info?

jzaehrin commented 1 month ago

Force the compilation of Exqlite totally fixed the issue:

user@pi:~/git/sqlite_issue $ iex -S mix run --no-start

Erlang/OTP 26 [erts-14.0.2] [source] [32-bit] [smp:4:4] [ds:4:4:10] [async-threads:1]

Interactive Elixir (1.15.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> import Exqlite
Exqlite
iex(2)> {:ok, db} = Exqlite.Sqlite3.open(":memory:", [:readonly])
{:ok, #Reference<0.93007989.2102263820.177903>}
iex(3)> {:ok, stmt} = Exqlite.Sqlite3.prepare(db, "select 20221010123504")
{:ok, #Reference<0.93007989.2102263820.177906>}
iex(4)> Exqlite.Sqlite3.step(db, stmt)
{:row, [20221010123504]}
iex(5)>

Is it possible that the recompiled versions didn't exist before? As I remember, I've always seen exqlite compiled in C in the past.

ruslandoga commented 1 month ago

AFAIK pre-compilation is a recent addition.

So to sum up, it seems like the problem is with picking the pre-compiled binary to download when the VM is 32-bit but uname -a and co report aarch64.

iex> :os.type()
{:unix, :darwin}
iex> :erlang.system_info(:system_architecture)
~c"aarch64-apple-darwin23.5.0"

@jzaehrin what do these ^^^ commands return on your system?

jzaehrin commented 1 month ago

Here is

iex(6)> :os.type()
{:unix, :linux}
iex(7)> :erlang.system_info(:system_architecture)
~c"armv7l-unknown-linux-gnueabihf"
ruslandoga commented 1 month ago

Ah, no. Seems like I was wrong, :erlang.system_info(:system_architecture) reports the correct, 32-bit arch. Hm ... Then maybe the problem is with the pre-compiled arm7l binary itself. I don't have enough experience to verify that though. So I'll leave it to cc_precompiler and elixir_make maintainers :)

warmwaffles commented 1 month ago

I'll need to investigate the precompilation. I'll probably need to pester @cocoa-xu for some help.

cocoa-xu commented 1 month ago

That seems weird, I'll run it on a 32-bit RPi and do some tests/debugging!

warmwaffles commented 1 month ago

It is entirely possible I botched the precompiled binaries.

cocoa-xu commented 1 month ago

Not quite sure why, but the root cause is that ErlNapiSInt64 becomes as signed 32-bit integer type when cross-compiling for an armv7l target on an x86_64 host. Will dig into this.