Concurrent sandbox support with begin concurrent

josevalim commented 1 year ago

Hi everyone,

SQLite3 does not currently support async tests in the sandbox. At first, it makes sense because it is single writer, but then I realized that we never commit transactions anyway so we should never be writing. That's when I found SQLite3 supports a feature called "BEGIN CONCURRENT", which allows concurrent transactions, and serializes only the commit operation, which we never do!

Therefore, would it be worth giving a try to support this? Perhaps there could be a config on this project that chooses the transaction type (concurrent or not) and we set it to concurrent with the sandbox.

WDYT?

warmwaffles commented 1 year ago

I wonder what the drawback is for using BEGIN CONCURRENT all the time is.

josevalim commented 1 year ago

A concurrent transaction can fail to commit as a whole, so I think outside of concurrent tests, it should certainly be opt-in behavior.

ruslandoga commented 1 year ago

I think BEGIN CONCURRENT requires a special build from begin-concurrent branch

This branch contains the "BEGIN CONCURRENT" patch. Trunk changes are periodically merged directly into this branch.

In the stock version it fails during parsing:

sqlite> BEGIN CONCURRENT;
Parse error: near "CONCURRENT": syntax error
  BEGIN CONCURRENT;
        ^--- error here

warmwaffles commented 1 year ago

It's a shame that BEGIN CONCURRENT isn't implied because the WAL mode is enabled signalling that it would be a potentially concurrent database.

ruslandoga commented 1 year ago

However, it seems like that custom build + sandbox work :)

josevalim commented 1 year ago

Given this project ships with sqlite3, maybe we can ship with the begin concurrent version?

ruslandoga commented 1 year ago

Personally, I'd prefer running the official releases. Also begin-concurrent isn't merged with all released versions, e.g. it doesn't have a commit corresponding to 3.40.1.

~~Maybe we'd get BEGIN CONCURRENT if or once HC-tree gets more popular...~~

Is there maybe another way to enable concurrent tests?

Like starting and migrating pool_size in-memory databases with after_connect

warmwaffles commented 1 year ago

Personally, I'd prefer running the official releases.

This is generally my feelings as well. I haven't followed this feature too closely and don't know if it is going to be mainlined.

Given this project ships with sqlite3, maybe we can ship with the begin concurrent version?

I believe we'd need to change a few things in order to take that amalgamation work. It would need to be automated here and the following would cause tests to break:

kevinlang commented 1 year ago

If the begin-concurrent stuff was in trunk, I think that would definitely be the way to go, as it aligns more with how other Ecto adapters interface with the Sandbox. Unfortunately, it is not, yet - and I don't see any immediate plans to merge it :(

I like the idea of testing against an in-memory database, and I think I explored that approach at one point. However, it has some friction with how the existing Sandbox adapter works and ends up feeling hacky.

To expand, the current adapter has normal connections, but proxies those connections and does the necessary transaction wrapping statements on checkout/checkin. This allows e.g., Ecto migrations to run as expected, as this proxying is only enabled once the test_helper.exs enables manual mode.

If we follow a similar mechanism, what we want need to do in our custom Sandbox adapter is to, on checkout when manual mode is enabled, "convert" those connections into in-memory ones (via the serialization/deserialization interface), and on checkin "convert" them back into normal connections (where we get rid of the in-memory connection and re-connect to the real database), which feels a bit hacky. Hacky in that this approach modifies the actual connection itself and isn't a proxy in the true sense.

ruslandoga commented 1 year ago

Why not use the in-memory databases from the start? It seems like I didn't encounter any problems with the sandbox using that approach.

warmwaffles commented 1 year ago

I would love to use memory for the database, but with how the connections are pooled, that is impossible to do. Once the connection goes idle, it closes the database and that in memory database is lost.

It's currently being discussed here https://github.com/elixir-sqlite/exqlite/issues/192

ruslandoga commented 1 year ago

Once the connection goes idle, it closes the database and that in memory database is lost.

Even if so, on re-open, after_connect would be executed again (running the migrations) putting the database in a clean state for the next test. Note that I'm only suggesting using in memory databases for tests: https://github.com/ruslandoga/sqlite3-memory-async/compare/master...async

By idle here, what do you mean? (I think the issue link is wrong)

warmwaffles commented 1 year ago

Even if so, on re-open, after_connect would be executed again (running the migrations)

There will be people who use a structure.sql as well and don't build from scratch completely. I know this because I am one of them 😞.

By idle here, what do you mean?

I don't remember the details, but I was running into issues where the connection pool had 1 connection during a test and the in memory database just disappeared. I chalked that up to the connection going "idle" and closing the database. I threw some debugging in and saw that Exqlite.Connection.disconnect was called.

josevalim commented 1 year ago

Is begin concurrent an extension? If so, could this be used to install it? https://observablehq.com/@asg017/making-sqlite-extensions-npm-installable-and-deno

ruslandoga commented 1 year ago

It doesn't seem to be an extension, but rather a fork.

elixir-sqlite / ecto_sqlite3

Concurrent sandbox support with begin concurrent #99