Replace DirtyNIF execution model with a different mechanism

warmwaffles commented 2 years ago

The dirtynif execution model is okay for most cases, but we have an issue where long running writes need to timeout / be interrupted mid execution and aborted. The only way to do that is to utilize https://sqlite.org/c3ref/progress_handler.html

josevalim commented 1 year ago

I wonder if you may be able to stick with the process model as long as you assume each instance of the database is backed by one process. This way you make Elixir processes your thread pool, instead of having to deal with it in C/C++/Rust.

warmwaffles commented 1 year ago

@josevalim I think that's a good idea where only one process in the supervision tree sits there waiting for another DbConnection call. This could make it easier to cancel existing running queries.

Let me noodle on this a little more.

josevalim commented 1 year ago

Right, I think in this sense DBConnection may be getting more in the way than helping. Because you need serial access around SQLite3 and having a pool around serial access is not giving you anything. And when you don't need serial access (i.e. read operations), you don't need a pool at all.

DBConnection is necessary for Ecto but, even then, it should not be a requirement. There are only three features in Ecto.Adapters.SQL that require DBConnection (sandbox, stream, and disconnect_all) but they are all addressable.

Then there is the part of DBConnection as a convenience. I can think of two features it gives:

a transaction manager - if the process doing a transaction crashes, we need to abort it. but perhaps sqlite3 api itself already has conveniences for handling dangling transactions
the ownership pool

Of course there is a lot of work here, so I am not saying we should yank DBConnection, but it may be worth considering moving the DBConnection bits to ecto_sqlite3. In this project, based on the analysis above (which may be incomplete), you would only be missing the transaction manager (and it may be provided by sqlite3).

LostKobrakai commented 1 year ago

There might be an additional benefit to managing transactions in a sqlite specific way. Currently there's no information returned to changesets about which constraint was violated, but there seems to be some pragmas to figure those out as sqlite doesn't close transactions on errors.

What I'm wondering about @josevalim's suggestion is how using processes to serialize or not serialize access would work for transactions which only become write transactions mid execution. Wouldn't the suggested approach require knowing if a transaction does writes or not in advance?

ruslandoga commented 1 year ago

Wouldn't having one queuing process per db instance negate WAL benefits? There still needs to be a pool of connections for concurrent reads. It's not possible to open a single connection and start multiple transactions on it. And if we try to work around that with snapshots, we'd start losing isolation.

josevalim commented 1 year ago

@ruslandoga the reads can go directly to the NIF. Only the writes and operations inside a transaction would go through the process.

ruslandoga commented 1 year ago

But how exactly would reads go directly to the NIF? Using what database connection? It can't be the one from the "writer" process since it might be in the middle of a transaction, and then we need synchronisation of reads and writes.

Ah, I think I understand now. Since the reads are "scoped" to a prepared statement, it might be possible to reuse a single database connection for multiple statements at once. I'll try it out.

UPDATE it works: https://github.com/ruslandoga/exqlite/pull/3; and reads seemingly became faster

iex(1)> {:ok, conn} = Exqlite.start_link(database: "demo.db", debug: [:trace])
{:ok, #PID<0.245.0>}
iex(2)> Exqlite.write(conn, "create table users(name text not null) strict")
# *DBG* <0.245.0> got call checkout from <0.252.0>
# *DBG* <0.245.0> sent ok to <0.252.0>, new state {[],<0.252.0>}
# *DBG* <0.245.0> got cast checkin
{:ok, []}
# *DBG* <0.245.0> new state {[],nil}

iex(3)> Exqlite.write(conn, "insert into users(name) values(?)", ["john"])
{:ok, []}
iex(4)> Exqlite.write(conn, "insert into users(name) values(?)", ["david"])
{:ok, []}
iex(5)> Exqlite.write(conn, "insert into users(name) values(?)", ["peter"])
{:ok, []}

iex(6)> {:ok, task_sup} = Task.Supervisor.start_link
{:ok, #PID<0.254.0>}

iex(7)> Task.Supervisor.async_stream_nolink(task_sup, 1..10000, fn _ -> Exqlite.query(conn, "select * from users") end, max_concurrency: 1000) |> Enum.into([])
[
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]}
# ...

I wonder how the statements cache would work with this approach. It'd need to be serialised somehow as well.

LostKobrakai commented 1 year ago

I think the tricky thing would be supporting concurrent reads from within transactions, which didn't yet do writes: https://www.sqlite.org/lang_transaction.html#deferred_immediate_and_exclusive_transactions

Not sure how exqlite could know which statement within a transaction requires a SHARED LOCK (reads) vs one of the ones required for writing. Maybe it could naively check for SELECT … in the sql.

ruslandoga commented 1 year ago

@LostKobrakai I think all transactions would need to go through the queue. Even if a read transaction never becomes a write one, it still can't be executed without syncing with the writers first. Otherwise multiple transactions can be attempted:

BEGIN; -- from writer in process 1
INSERT INTO ...
INSERT INTO ...

                 BEGIN; -- a reader starts a transaction in process 2
                 -- SQLite complains about re-entry: "cannot start a transaction within a transaction"

COMMIT; -- writer finishes the transaction from process 1

This is actually what I was worried about in the comments above. But since reads don't have to be wrapped in transactions, single connection approach works.

It might be possible to do snapshots though, for read-only transactions without BEGIN / COMMIT, at the NIF level.

LostKobrakai commented 1 year ago

I just tried this with two table plus instances on the same database and it allowed me to start a read transaction just fine while another write transaction was started. Trying to start a write transaction I got database is locked. Though I guess that's two "connections" to sql, is it?

ruslandoga commented 1 year ago

two table plus instances

That means two connections. I think the approach discussed in this issue is having only a single connection open to the database.

$ sqlite3 demo.db

SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> begin;
sqlite> begin;
Runtime error: cannot start a transaction within a transaction

LostKobrakai commented 1 year ago

Given DBConnection is an abstraction over connections I'm not so sure. I'd for sure appreciate if it actually could do concurrent reads (in a transaction) given how much code in a business application requires to be run in a transaction, which usually start reading a bunch of things before they write.

ruslandoga commented 1 year ago

I think there is danger of those transactions failing with SQLITE_BUSY so serialising all transactions seems like a safer way.

-- conn 1                                            -- conn 2
sqlite> begin;
                                                     sqlite> begin;
                                                     sqlite> select * from users;
sqlite> delete from users where name = 'patrick';
                                                     sqlite> insert into users(name) values('jacob');
                                                     Runtime error: database is locked (5)

If it's about performance, we can try coming up with some benchmarks to see at what write/read ratio a single connection approach wins / loses. My hot take is that (read)write transactions are rare and reads are common and having un-pooled reads would end up being faster on average. Better memory usage too, probably.

LostKobrakai commented 1 year ago

That's why I was wondering if exqlite could catch writes before they queue for "BUSY WAIT", but the queuing happens in elixir land. In the end this wouldn't be much different to e.g. a timeout due to a to long running query with e.g. postgres/ecto. Another option would be possible if exqlite can timeout a transaction from the outside and setting busy_timeout to a value larger than elixirs internal timeout.

ruslandoga commented 1 year ago

I don't understand. How would busy_timeout help in the above example? No matter how long we wait, if we started a deferred transaction before some other connection made a write to a table we read from, we won't be able to commit it even if the other connection's writer is finished.

-- conn 1                                            -- conn 2
sqlite> begin;
                                                     sqlite> begin;
                                                     sqlite> select * from users;
sqlite> delete from users where name = 'patrick';
sqlite> commit;
                                                     sqlite> insert into users(name) values('jacob');
                                                     Runtime error: database is locked (5)

We can however remove all queueing in Elixir and rely on what SQLite returns (error code or ok), but that seems more difficult to develop for. One would need to know and understand SQLite specifics. ~~And we probably wouldn't be able to rollback exiting writers~~ or provide :checkout_timeout per query since pragma busy_timeout is global (I think).

I think busy_timeout would make concurrent read+write begin immediate; transactions work by queuing them, but that'd disable concurrent reads as well.

LostKobrakai commented 1 year ago

That's unexpected, I would've expected this to work based on the documentation on the topic. If this is the case I'd wonder why have busy_timeout in the first place, if waiting makes no sense.

ruslandoga commented 1 year ago

What if we try updating a record we read but that since has been deleted? I don't think it can work at all.

-- conn 1                                            -- conn 2
sqlite> begin;
                                                     sqlite> begin;
                                                     sqlite> select rowid from users where name = 'john';
sqlite> delete from users where name = 'john';
sqlite> commit;
                                                     sqlite> update users set name = 'john jr.' where rowid = 1;
                                                     Runtime error: database is locked (5)

LostKobrakai commented 1 year ago

It there are conflicting changes I think most databases will roll back one transaction. Though deleting one row and adding another shouldn't really be conflicting, but that might just be sqlite.

ruslandoga commented 1 year ago

I'm back to thinking that a single connection with un-pooled reads wouldn't work. If we issue reads directly to the NIF while a writer is in a transaction, we lose isolation guarantees as we can read data from unfinished transaction.

iex> spawn(fn ->
  Exqlite.transaction(db, fn conn ->
    Exqlite.query(conn, "insert into users(name) values(?)", ["new"])
    :timer.sleep(:timer.seconds(1000))
  end)
end)
iex> Exqlite.query(db, "select * from users")
{:ok,
 [
   ["david"],
   ["peter"],
   ["new"]
 ]}

I'm thinking now about using two connections, one for writers and one for readers. The readers connection would accept un-pooled reads. But even when un-pooled, reads would still limited by the number of dirty schedulers.

but that might just be sqlite.

Yeah, I think in the stock version SQLite "marks" whole database as dirty (or rather it compares the transaction ids when it started and at the first write) if it has been modified (outside of the current transaction) since the transaction has started. There is a begin-concurrent branch that allows marking pages as dirty (but that still doesn't help when the databases are small and fit in a few pages), and now I think a high-concurrency branch is in the works with row-level locking like in the other dbs.

ruslandoga commented 1 year ago

I'm thinking now about using two connections, one for writers and one for readers. The readers connection would accept un-pooled reads.

That didn't end up working either. Reads start an implicit transaction and that's a problem since concurrent reads can force that transaction to be indefinitely open and that connection would never get new data.

conn1 (readonly)                                            conn2 (writeonly)
prepare(stmt1)
step(stmt1) <- tx opened                                                 
                                                            insert row
step step step(stmt1) <- no row, ok
prepare(stmt2) 
step step step (stmt2) <- no row
finish(stmt1) <- tx not closed since stmt2 still stepping
prepare(stmt3) <- no row, not ok since current tx was opened for stmt1

josevalim commented 1 year ago

Sorry for staying out of this conversation and thanks for exploring @ruslandoga.

It seems like we would need the process to work as read-write lock. While reads are happening, we allow them to run concurrently. Once a write arrives, we start queueing events and draining the reads. Once they are all drained, execute the write, and then move to the next in the queue. If next is a read, we start doing concurrent work again. All queries can still run on the client. You can find an implementation of serialized reads-writes from NIF running from the client here: https://github.com/elixir-explorer/adbc/blob/main/lib/adbc_connection.ex

Having a concurrent reads version should an expansion of the API above (but hey, you can even start with a full serialized version and add the read concurrency later). API wise, you could keep "query" as a write operation but you can introduce a "read_query" where the user guarantees they are exclusively reading.

In any case, the main point is that I think this library can be cleaner and easier to maintain if it doesn't try to do NIFs, process management, and DBConnection at the same time. I think NIFs + process management would make this usable across more scenarios and DBConnection is an Ecto concern (which perhaps we can move all the way up to Ecto itself).

ruslandoga commented 1 year ago

@josevalim @warmwaffles if there are no other takers, I'm interested in exploring this approach :)

josevalim commented 1 year ago

Definitely go for it from my side - but that's @warmwaffles decision to make. :)

warmwaffles commented 1 year ago

Yea sure why not. @ruslandoga if you want to give it a shot I'm all for it. I'm really swamped with day job work at the moment, but I have time to do reviews and testing implementations.

ruslandoga commented 1 year ago

I've started a PR https://github.com/elixir-sqlite/exqlite/pull/255 for https://github.com/elixir-sqlite/exqlite/issues/192#issuecomment-1634198546 and while working on it I had one concern. With this approach we miss out on WAL allowing multiple readers when there's a writer, but I guess it could be tackled in ecto_sqlite3 by starting two connections and, for example, dedicating one for writes and the other for reads. To avoid https://github.com/elixir-sqlite/exqlite/issues/192#issuecomment-1443701807 these connections would be "cooperative" so that after a write happens, the read connection would be drained from reads to close the implicit transaction and allow for the new data to become visible.

josevalim commented 1 year ago

Before I can answer, quick question: can we perform concurrent reads over the same single connection in SQLite3? Or the reads have to be serialized?

ruslandoga commented 1 year ago

Yes, we can perform concurrent reads, the "result rows" are "scoped" per statement. So as we are stepping through two different prepared statements in the same SQLite connection from different processes, we don't get interleaved results and all is fine :)

Here's a test case that is supposed to ensure the reads are concurrent: https://github.com/ruslandoga/exqlite/blob/c07b537424a62b3eb756988e0b2a2c00effbe970/test/exqlite/rw_connection_test.exs#L119-L139

The only problem is that these statements open implicit transactions which means that unless all statements are "done" (released or reached the end of rows), no new data is visible over that connection (e.g. if it has been inserted over another connection).

josevalim commented 1 year ago

I see. So we need to answer how low-level vs high-level we want to be. SQlite3 doesn't handle any of this, so we could not. But if we want to provide high-level abstractions, then it makes sense to provide a high-level abstraction that would also be used by EctoSQLite3. So my suggestion is to have the RWLock start both connections (we could make the Read transaction be lazy in the future if that's ever an issue). query goes through the write one, read_query goes through the read one.

warmwaffles commented 1 year ago

SQLite AFAIK can handle concurrent reads, but the moment a write comes along, things get tricky. It also depends on how the database's journal_mode is set. If it is set to wal, sqlite will do its best to resolve the writes that happen concurrently.

I'm not sure how it performs when the database is opened with :memory: when trying to do concurrent writes. I'm fairly certain it'll handle that just fine since it would all be in process and be able to have the appropriate locks in process.

ruslandoga commented 1 year ago

So my suggestion is to have the RWLock start both connections (we could make the Read transaction be lazy in the future if that's ever an issue). query goes through the write one, read_query goes through the read one.

I might have misunderstood it a bit but here's a PR https://github.com/elixir-sqlite/exqlite/pull/256

Right now it opens two SQLite3 DBs in the same Elixir process, one for locked query and the other for concurrent read_query.

My concern with doing that is it seems a bit sneaky. I'm trying to come up with scenarios where it can bite the user:

:memory: databases
:mode != :wal
https://github.com/elixir-sqlite/exqlite/pull/256#issuecomment-1645026108
I feel like there might be something else, like builds or setups where reading from another connection wouldn't work well

Maybe Exqlite could provide multiple connection types:

Exqlite.ReadConnection -- concurrent reads, which are drained on GenServer.cast(conn, :drain)
Exqlite.Connection -- all commands are serialised
Exqlite.RWConnection -- similar to https://github.com/elixir-sqlite/exqlite/pull/256 but more of a proxy/supervisor that starts both of the above and also exports read_query/3 (the two above only have query/3)

This would make Exqlite.Connection a safe default and Exqlite.RWConnection a performant choice for Ecto.Adapters.SQLite3.

warmwaffles commented 1 year ago

@ruslandoga another interesting thought I had is that ideally the database is read from more than written to. Bottlenecks will be at the two genservers spun up, mainly on the reader. We could try to pool the readers and round robin the calls to each one, and maintain a single write instance. Probably something to explore later.

ruslandoga commented 1 year ago

The only thing the "reader genserver" does in "normal" state is preparing the statements. Since there is no cache, we can move that to the caller as well.

Then the bottleneck would only be able to appear during read connection draining after a locked query finishes (assuming it was a write). Just to recap, we do this to make any implicit transactions opened in that read connection to be closed by finishing/releasing all ongoing statements.

During this draining all incoming reads are stored in the queue inside genserver, but it doesn't appear to be all that different from how db_connection works, with the difference that it probably uses ets to queue the callers: the only reason for the read queue to grow are long reads started before the locked query finished, maybe during draining we can switch into codel mode... I'll try to cause this scenario in the benchmarks to see how big of an issue this is.

I'll also try creating a new read connection each time we start draining the old one instead of queuing reads.

warmwaffles commented 1 year ago

One of the main reasons I wanted to look into replacing the DirtyNIF execution model is that I would like to be able to cancel queries that overrun the timeout. Ecto would timeout requests if they took too long to run while modifying records.

One use case was a developer I was talking to was updating / cleaning up records and the query was touching thousands of records overrunning the default timeout. Although it looked like it timed out, instead it was still running in the background and causing compounding latency issues.

What my hope is that with the new implementation that you are working on @ruslandoga we are not going to timeout the connection at all, but instead keep it open until the request finishes.

ruslandoga commented 12 months ago

Right now I'm exploring the following approach:

exqlite becomes just the NIF wrapper, like https://github.com/ruslandoga/sqlite.ex, but exposes some more utility functions
ecto_sqlite3 starts two processes, a reader and a writer (for a readwrite db), with reader multiplexing select statements and waiting for transaction close on changes to db, and with writer just queueing writes and Repo.query calls

I'll try to finish it up by the end of the week and open new PRs.

ruslandoga commented 12 months ago

One of the main reasons I wanted to look into replacing the DirtyNIF execution model is that I would like to be able to cancel queries that overrun the timeout. Ecto would timeout requests if they took too long to run while modifying records.

If we go with drop-DBConnection then there wouldn't be any need for timeouts, I think. But if we still want to provide them, it can be done while keeping dirty schedulers but with a slightly different INSERT handling: instead of building a single multi-row INSERT INTO ... (?, ?, ?), (?, ?, ?), (?, ?, ?), etc.. statement we can prepare an insert for a single row and re-bind it for every row. We can do it similar to multi_step with a limited number of steps per each invocation (if the caller times out and exits, we stop invoking and rollback). I called it multi_bind_step and it seems to be ~x2 faster than the current approach (benched in an in-memory DB, to measure the "nif overhead" instead of actual performance):

mix run bench/insert_all.exs from a GitHub Actions run

Benchmark: https://github.com/ruslandoga/sqlite.ex/blob/master/bench/insert_all.exs Results are copied from https://github.com/ruslandoga/sqlite.ex/actions/runs/6361581237/job/17276103771 ``` Operating System: Linux CPU Information: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz Number of Available Cores: 2 Available memory: 6.76 GB Elixir 1.15.6 Erlang 26.1.1 Benchmark suite executing with the following configuration: warmup: 2 s time: 5 s memory time: 0 ns reduction time: 0 ns parallel: 1 inputs: 10, 100, 1000, 10000 Estimated total run time: 56 s Benchmarking Exqlite.Sqlite3 single insert with input 10 ... Benchmarking Exqlite.Sqlite3 single insert with input 100 ... Benchmarking Exqlite.Sqlite3 single insert with input 1000 ... Benchmarking Exqlite.Sqlite3 single insert with input 10000 ... Benchmarking SQLite.multi_bind_step with input 10 ... Benchmarking SQLite.multi_bind_step with input 100 ... Benchmarking SQLite.multi_bind_step with input 1000 ... Benchmarking SQLite.multi_bind_step with input 10000 ... ##### With input 10 ##### Name ips average deviation median 99th % SQLite.multi_bind_step 26.96 K 37.09 μs ±189.66% 35.20 μs 60.82 μs Exqlite.Sqlite3 single insert 11.72 K 85.34 μs ±67.27% 85.20 μs 113.40 μs Comparison: SQLite.multi_bind_step 26.96 K Exqlite.Sqlite3 single insert 11.72 K - 2.30x slower +48.25 μs ##### With input 100 ##### Name ips average deviation median 99th % SQLite.multi_bind_step 6.84 K 146.22 μs ±38.67% 141.50 μs 301.60 μs Exqlite.Sqlite3 single insert 3.05 K 327.79 μs ±48.37% 301.80 μs 967.90 μs Comparison: SQLite.multi_bind_step 6.84 K Exqlite.Sqlite3 single insert 3.05 K - 2.24x slower +181.57 μs ##### With input 1000 ##### Name ips average deviation median 99th % SQLite.multi_bind_step 807.23 1.24 ms ±9.70% 1.22 ms 1.55 ms Exqlite.Sqlite3 single insert 411.44 2.43 ms ±9.29% 2.35 ms 3.00 ms Comparison: SQLite.multi_bind_step 807.23 Exqlite.Sqlite3 single insert 411.44 - 1.96x slower +1.19 ms ##### With input 10000 ##### Name ips average deviation median 99th % SQLite.multi_bind_step 84.01 11.90 ms ±15.12% 12.21 ms 14.41 ms Exqlite.Sqlite3 single insert 41.14 24.31 ms ±3.27% 24.34 ms 26.87 ms Comparison: SQLite.multi_bind_step 84.01 Exqlite.Sqlite3 single insert 41.14 - 2.04x slower +12.40 ms ```

And I forgot about UPDATE/DELETE (bad ClickHouse influence)

For UPDATE/DELETE, indeed, we'd need to do yielding nifs that would check if the caller pid is still alive. I'll try it in my fork.

What my hope is that with the new implementation that you are working on @ruslandoga we are not going to timeout the connection at all, but instead keep it open until the request finishes.

That's my current approach. And additionally, new writes don't start until the previous one returns.

Although it looked like it timed out, instead it was still running in the background and causing compounding latency issues.

Actually, this leads me to wonder why there were compounding issues in the timed out query case since SQLite wouldn't allow parallel write transactions either and with busy_timeout pragma set (in exqlite master it's set to two seconds by default) it would "queue" them in the same way. Meaning the latencies would've had stayed bounded to two seconds for the writes until the long write query finished. I guess there were some retries involved which made the long query get re-queued.

warmwaffles commented 12 months ago

ecto_sqlite3 starts two processes, a reader and a writer (for a readwrite db), with reader multiplexing select statements and waiting for transaction close on changes to db, and with writer just queueing writes and Repo.query calls

I'll try to finish it up by the end of the week and open new PRs.

This is intriguing to me. I thought it was required to implement db_connection, but looking at it now, it's not really necessary.

ruslandoga commented 11 months ago

ecto_sqlite3 starts two processes, a reader and a writer (for a readwrite db), with reader multiplexing select statements and waiting for transaction close on changes to db, and with writer just queueing writes and Repo.query calls

This doesn't seem to improve much as I'm checking pragma data_version before each SELECT which slows down small queries ~x2. And without checking pragma data_version before each SELECT we run into a possibility of failing to read own writes. So it seems like multiplexing SELECT statements on the same connection won't work.

binch/fetch_all.exs

```elixir rows = fn count -> Enum.map(1..count, fn i -> [i, to_string(i), i / 1.0] end) end Benchee.run( %{ "data_version + SQLite.fetch_all (prepared)" => {fn {db, data_version_stmt, stmt} -> {:ok, _} = SQLite.fetch_all(db, data_version_stmt, 10) {:ok, _rows} = SQLite.fetch_all(db, stmt, 100) end, before_scenario: fn input -> {:ok, db} = SQLite.open(~c":memory:", 0x2) :ok = SQLite.execute(db, "create table demo(a integer, b text, c real) strict") :ok = SQLite.insert_all(db, "insert into demo(a, b, c) values (?, ?, ?)", input) {:ok, data_version_stmt} = SQLite.prepare(db, "pragma data_version") {:ok, stmt} = SQLite.prepare(db, "select * from demo") {db, data_version_stmt, stmt} end, after_scenario: fn {db, _data_version_stmt, _stmt} -> :ok = SQLite.close(db) end}, "SQLite.fetch_all (prepared)" => {fn {db, stmt} -> {:ok, _rows} = SQLite.fetch_all(db, stmt, 100) end, before_scenario: fn input -> {:ok, db} = SQLite.open(~c":memory:", 0x2) :ok = SQLite.execute(db, "create table demo(a integer, b text, c real) strict") :ok = SQLite.insert_all(db, "insert into demo(a, b, c) values (?, ?, ?)", input) {:ok, stmt} = SQLite.prepare(db, "select * from demo") {db, stmt} end, after_scenario: fn {db, _stmt} -> :ok = SQLite.close(db) end}, "Exqlite.Sqlite3.fetch_all" => {fn {db, stmt} -> {:ok, _rows} = Exqlite.Sqlite3.fetch_all(db, stmt, 100) end, before_scenario: fn input -> {:ok, db} = Exqlite.Sqlite3.open(":memory:") :ok = Exqlite.Sqlite3.execute(db, "create table demo(a integer, b text, c real) strict") sql = [ "insert into demo(a, b, c) values ", "(?, ?, ?)" |> List.duplicate(length(input)) |> Enum.intersperse(?,) ] {:ok, stmt} = Exqlite.Sqlite3.prepare(db, sql) :ok = Exqlite.Sqlite3.bind(db, stmt, List.flatten(input)) :done = Exqlite.Sqlite3.step(db, stmt) {:ok, stmt} = Exqlite.Sqlite3.prepare(db, "select * from demo") {db, stmt} end, after_scenario: fn {db, _stmt} -> :ok = Exqlite.Sqlite3.close(db) end} }, inputs: %{ "1" => rows.(1), "5" => rows.(5), "10" => rows.(10), "50" => rows.(50) # "1000" => rows.(1000), # "10000" => rows.(10000) } # profile_after: true ) ```

benchmark results

``` Operating System: macOS CPU Information: Apple M1 Number of Available Cores: 8 Available memory: 8 GB Elixir 1.15.6 Erlang 26.1.1 Benchmark suite executing with the following configuration: warmup: 2 s time: 5 s memory time: 0 ns reduction time: 0 ns parallel: 1 inputs: 1, 10, 5, 50 Estimated total run time: 1.40 min Benchmarking Exqlite.Sqlite3.fetch_all with input 1 ... Benchmarking Exqlite.Sqlite3.fetch_all with input 10 ... Benchmarking Exqlite.Sqlite3.fetch_all with input 5 ... Benchmarking Exqlite.Sqlite3.fetch_all with input 50 ... Benchmarking SQLite.fetch_all (prepared) with input 1 ... Benchmarking SQLite.fetch_all (prepared) with input 10 ... Benchmarking SQLite.fetch_all (prepared) with input 5 ... Benchmarking SQLite.fetch_all (prepared) with input 50 ... Benchmarking data_version + SQLite.fetch_all (prepared) with input 1 ... Benchmarking data_version + SQLite.fetch_all (prepared) with input 10 ... Benchmarking data_version + SQLite.fetch_all (prepared) with input 5 ... Benchmarking data_version + SQLite.fetch_all (prepared) with input 50 ... ##### With input 1 ##### Name ips average deviation median 99th % SQLite.fetch_all (prepared) 380.14 K 2.63 μs ±550.88% 2.50 μs 2.88 μs Exqlite.Sqlite3.fetch_all 330.86 K 3.02 μs ±492.09% 2.46 μs 10.33 μs data_version + SQLite.fetch_all (prepared) 208.34 K 4.80 μs ±303.00% 4.58 μs 5.33 μs Comparison: SQLite.fetch_all (prepared) 380.14 K Exqlite.Sqlite3.fetch_all 330.86 K - 1.15x slower +0.39 μs data_version + SQLite.fetch_all (prepared) 208.34 K - 1.82x slower +2.17 μs ##### With input 10 ##### Name ips average deviation median 99th % SQLite.fetch_all (prepared) 138.75 K 7.21 μs ±203.62% 6.33 μs 18.04 μs Exqlite.Sqlite3.fetch_all 136.85 K 7.31 μs ±183.38% 6.50 μs 18.46 μs data_version + SQLite.fetch_all (prepared) 105.23 K 9.50 μs ±131.89% 8.46 μs 34.25 μs Comparison: SQLite.fetch_all (prepared) 138.75 K Exqlite.Sqlite3.fetch_all 136.85 K - 1.01x slower +0.1000 μs data_version + SQLite.fetch_all (prepared) 105.23 K - 1.32x slower +2.30 μs ##### With input 5 ##### Name ips average deviation median 99th % SQLite.fetch_all (prepared) 217.70 K 4.59 μs ±335.99% 4.17 μs 6.50 μs Exqlite.Sqlite3.fetch_all 217.10 K 4.61 μs ±335.68% 4.21 μs 6.79 μs data_version + SQLite.fetch_all (prepared) 147.42 K 6.78 μs ±186.21% 6.29 μs 12.25 μs Comparison: SQLite.fetch_all (prepared) 217.70 K Exqlite.Sqlite3.fetch_all 217.10 K - 1.00x slower +0.0127 μs data_version + SQLite.fetch_all (prepared) 147.42 K - 1.48x slower +2.19 μs ##### With input 50 ##### Name ips average deviation median 99th % SQLite.fetch_all (prepared) 34.16 K 29.28 μs ±84.26% 24 μs 136.13 μs Exqlite.Sqlite3.fetch_all 33.99 K 29.42 μs ±84.59% 24.21 μs 136.75 μs data_version + SQLite.fetch_all (prepared) 31.10 K 32.15 μs ±81.88% 26.29 μs 115.41 μs Comparison: SQLite.fetch_all (prepared) 34.16 K Exqlite.Sqlite3.fetch_all 33.99 K - 1.00x slower +0.141 μs data_version + SQLite.fetch_all (prepared) 31.10 K - 1.10x slower +2.88 μs ```

josevalim commented 11 months ago

And without checking pragma data_version before each SELECT we run into a possibility of failing to read own writes.

What do you mean, can you expand? I am not well versed on sqlite pragmas. :)

ruslandoga commented 11 months ago

Since ongoing selects piggy back on the same read transaction we need to know when it's safe to issue new reads and when it's not. By safety here I mean reading own writes.If we write and then read from the read-only connection that has started the read transaction before our write, the data we wrote would be invisible to us. So to avoid that I'm checking data_version pragma which, if changed since the last invocation, means other connections wrote something into the database. If it has changed, I'm blocking/queuing checkouts until the current read transaction closes (i.e. until all statements finish). Since I'm checking that pragma before each select, it negatively affects read performance.

josevalim commented 11 months ago

For Ecto, I would say transactions always go to the write-based connection. Would that solve this problem?

ruslandoga commented 11 months ago

Unfortunately this wouldn't help since selects open a read transaction implicitly if there isn't one already. That read transaction is closed when the select statement finishes stepping through rows. If another select is started before the previous one finishes (on the same connection), the transaction is extended.

josevalim commented 11 months ago

I see, thank you, very helpful. So I guess we have no other option than having a single process that we send everything through?

warmwaffles commented 11 months ago

Or perhaps, we just live with the DirtyNIF for now. 😞

josevalim commented 11 months ago

I believe we most likely still need DirtyNIFs. It is more of a question of removing the DBConnection dependency. :)

josevalim commented 9 months ago

I am reading this once more and I am wondering, how we can improve and simplify things. We have three options:

This library only exposes Dirty NIFs. It is up to users to wrap them in processes and guarantee whatever isolation model they care about. The reason why this makes sense is that, even if we provide safe guards here, someone can run two operating system processes and reproduce some of the isolation issues. As long as the NIFs are safe (no segfault, memory leak, etc), we should be good.
This library provides a DBConnection wrapper, with serial reads and serial writes, and a single connection. Which is what happens today.
This library provides a DBConnection wrapper, with a serial connection that allows concurrent reads and serial writes. We allow multiple reads to run at the same time, as soon as there is a write, we wait for all reads to finish, perform the write, and then start executing the reads again. This should be an improvement over the status quo today (and justifies keeping DBConnection).

Pretty much all other designs we tried were not good enough.

Have I summed up things properly? :)

LostKobrakai commented 9 months ago

I don't think the assessment of what is the current state with point 2 is correct: What we currently have with ecto_sqlite3 is DBConnection with concurrent reads and writes. Concurrent writes block at the sqlite level though and are either successfuly serialized with sqlites busy mechanic or some writes run into busy timeouts and crash.

josevalim commented 9 months ago

Thank you @LostKobrakai. If that's the case, then we might as well have gone with Dirty NIFs all the way? What would DBConnection gives us if we allow both concurrent reads and writes?

LostKobrakai commented 9 months ago

To my knowledge that was done to integrate with ecto (hence it being in ecto_sqlite3 and not this repo). Not sure if there would've been other / simpler options to make ecto work with exqlite.

warmwaffles commented 9 months ago

The DbConnection implementation is done in this project. I intended this library to be as a mechanism by which to interact with sqlite directly. My intention was to keep the ecto implementation away from this repo so that in theory a different sqlite NIF library could replace this should it be necessary.

We could strip this implementation down even further and lift the DbConnection out and into the ecto_sqlite3 library.

The DirtyNIF mechanism works decently, until it doesn't. I usually run into issues where a query operation times out via ecto, but it is still running in the background consuming memory with no way to stop it without hard killing the erlang process.

josevalim commented 9 months ago

@warmwaffles I don't think you can replace dirty nifs though. That's the way to do C integration. Your issue with query operations may be addressed by the following:

return a NIF resource (a reference) on every query from the NIF
pass this NIF resource to read the query result
set a destructor to the NIF resource that will cancel the query if it was not cancelled yet

WDYT? Could that work? We are solving similar issues in Apache ADBC.

I would probably still say then that the DBConnection abstraction could be moved to the ecto_sqlite3 repo, it is not really necessary for SQLite3 itself, and then it should be debated if ecto_sqlite3 itself needs DBConnection. :)

And in case I haven't seen it before, thank you for your work and everyone's feedback and inputs on the thread. :heart:

elixir-sqlite / exqlite

Replace DirtyNIF execution model with a different mechanism #192