Electron100 / butane

An ORM for Rust with a focus on simplicity and on writing Rust, not SQL
Apache License 2.0
91 stars 13 forks source link

Implement async support #22

Open aDogCalledSpot opened 1 year ago

aDogCalledSpot commented 1 year ago

Async support wouldn't only be good because async is useful for anything using the network but also because using the blocking postgres backend is currently not possible when a butane transaction is made from inside an async fn.

In the postgres library, there are a few instances such as this one where an own executor creates a blocking call to the network interface. If the function butane is called from is already async, then this will lead to a second executor being started and the program panics.

Plans for this PR:

@sssemil and I would like to do the following:

Limitations

Let us know if there is anything you think we should be aware of or that you would like to discuss.

Related: https://github.com/Electron100/butane/issues/13

Electron100 commented 1 year ago

Hi @aDogCalledSpot, thanks for considering taking this on. I'm broadly aligned here and would welcome a PR. I'd actually just been thinking about the future of Butane this month, and with Rust's async story maturing in the time since I created Butane and becoming by far the norm in web servers (although with sqlite support and object-persistence focus, Butane isn't aimed solely at web servers) and in newer ORMs (like SeaORM) plus the second-executor panic, I've come fully around to the view that Butane needs to be primarily async. Let me try to get a post up in the next couple days with my overall thoughts on the future/direction.

I'm currently adjusting to life with a newborn so my own code-writing is going to be slow at the moment, but I'll review PRs as quickly as possible.

I agree with the async-trait dependency.

The one thing I'm a little torn about is whether we want to preserve a sync option for use with the sqlite backend. It would make things a little more complex (or at least larger) but I think there could be value in it for simpler apps that don't need/want async. I won't block this PR on adding that option, but I might do it myself before releasing a new crate version.

aDogCalledSpot commented 1 year ago

Sounds good! We'll get started on it tomorrow :+1:

As to the sync option: I think it will be easier to assess the workload for keeping both options once this PR moves ahead.

I actually hadn't heard of SeaORM. Using SQLx as a backend looks interesting since then we would have async support for sqlite as well. We'll play around a bit with it.

aDogCalledSpot commented 1 year ago

Switching to tokio-postgres is pretty straightforward. Sqlite however is a bit annoying since the Connection isn't Sync/Send and therefore can't be used for any futures. Async traits expect to be used with futures so I don't really see an elegant solution using rusqlite. We can wrap rusqlite::Connection in a mutex because rusqlite::Connection is Send but not Sync but it adds an overhead with no real benefit besides "it compiles".

@sssemil has been looking into SQLx a bit more and they offer an AnyConnection type which already abstracts across the backend. I think it's worth exploring implementing the butane interfaces for AnyConnection and then having a unified backend which is designed for async and even offers support for MySQL.

Electron100 commented 1 year ago

Thanks, this is looking good so far! I really appreciate you picking this up.

aDogCalledSpot commented 1 year ago

Postgres is moving forward nicely, however I'm blocked by batch_execute not being in the async GenericClient trait. MR for fix.

Rusqlite is unfortunately causing issues. I decided to use a mutex for guarding the connection so there wouldn't be any issues when doing something like:

res_a = conn.query(...).await;
res_b = conn.query(...).await;

However, transactions hold a reference to the connection as well and since these are implemented in rusqlite, there is no way to add a mutex there. So two transactions acting upon the same connection concurrently would cause races.

I wanted to have a look at how it's done at diesel_async but found out that SQLite isn't supported there. Only MySQL and Postgres are and I guess this is why.

How would you like to handle this?

Electron100 commented 1 year ago

It's a good question. I poked at it a little yesterday but didn't get super far. It occurs to me that one gross would be to run the actual sqlite operations on a dedicated thread pool and interface with the async methods over channels, but aside from high complexity I imagine that would have undesirable perf.

It might be worth taking a look at what https://crates.io/crates/sqlx is doing. Or if we preserve the option to run sqlite in sync mode, maybe async-mode sqlite should just go to sqlx rather than rusqlite.

sssemil commented 1 year ago

It might be worth taking a look at what https://crates.io/crates/sqlx is doing. Or if we preserve the option to run sqlite in sync mode, maybe async-mode sqlite should just go to sqlx rather than rusqlite.

Why not then use sqlx for everything else?

Electron100 commented 1 year ago

Why not then use sqlx for everything else?

Butane has a couple different use cases

  1. Backend webservers. Async is typically desirable here. Postgres will be a more common database choice (or other traditional DBs that Butane doesn't support yet but could). Some simpler apps may use SQLite in a web backend (I've done it myself in hobby projects) but it's probably less common.
  2. Self-contained apps that just need a simple object persistence system. Async is probably not desirable here, but sqlite will likely be the most sensible app.

Because of (2), as I mentioned above, I'd like to preserve the option to have non-async capabilities at least with the SQLite backend before releasing a new breaking-version (although it doesn't have to be in this PR).

Getting back to sqlx. . .I'm a bit reluctant to take an overall non-optional dependency on sqlx because I've hear anecdotes about it having very poor performance, and it's also quite heavyweight in terms of dependencies (not that Butane is lean, but bringing in a whole new pile of transitive dependencies isn't preferable). But async-with-sqlite has possible niche use but I don't think it will be common. So if sqlx is the most expedient way to make that happen I'm more open to it just for that case.

Related: https://github.com/rusqlite/rusqlite/issues/697 Also found https://docs.rs/tokio-rusqlite/latest/tokio_rusqlite/ although I don't think it allows transactions to escape a single call. It uses the channel approach I speculated about above. Not sure what the perf is like. We might be able to rework transactions in Butane a bit to make all the rusqlite calls at once...

aDogCalledSpot commented 1 year ago

The problem I see is that maintaining sync and async code side-by-side will result in a lot of duplicate code that can't be extracted because the function signatures differ in their async prefix and a lot of the lines in the function will differ in whether they need to be awaited or not. The tokio-postgres interface, for example, is a separate folder that is nearly identical to the non-postgres version.

Having a separate subcrate like that could be done in this MR and we would just implement Postgres for now and define features accordingly. We could also have a look at tokio-rusqlite.

aDogCalledSpot commented 1 year ago

My MR mentioned above got merged but the version number for tokio-postgres hasn't been bumped up yet. We can use the git link as dependency for now but will need to check for when it's stabilized. Maybe ask the maintainer to bump up the number if it hasn't been done in the next few weeks.

jayvdb commented 1 year ago

Could this PR be updated, and using the git dependency until the upstream fix has been released. I would like to be using this PR's branch in the interim.

Electron100 commented 1 year ago

@aDogCalledSpot I think perhaps what makes the most sense is for me to create a branch called async and you can merge so we can work on this in pieces. You can merge your Postgres-only async solution in there and then I can work on sqlite solutions. Does that seem reasonable?

aDogCalledSpot commented 1 year ago

Sounds good! I hope to find some more time this week as I've unfortunately been very busy the last few weeks. I don't think there's much point in you starting on the sqlite part until I've come further with the postgres part since that should sort out most problems with the interfaces.

aDogCalledSpot commented 1 year ago

I was unfortunately sick for a long time and didn't get around to working on it much. I finally made it compile today. So I think the interfaces should largely be correct now. I'll keep you updated on how it goes with testing and revisiting everything to see if some stuff can maybe be simplified.