launchbadge / sqlx

🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, and SQLite.
Apache License 2.0

Proposal: New Architecture for `sqlx::query!()` Macros #1598

Open abonander opened 2 years ago

abonander commented 2 years ago

Status Quo

Currently, the default mode of operation for the sqlx::query!() family of macros is to connect to a running database server (or, for SQLite, open the database file in-process) as specified by the DATABASE_URL environment variable, then use database-specific commands to parse and analyze the passed query. The resulting information about bind parameters and output columns is used to generate code.

As an option, the macros can also operate in an "offline" mode where instead of connecting to a database server, they read a sqlx-data.json file created by cargo sqlx prepare which contains the same data and use that to generate code.

The implementation is relatively straightforward and functions well enough, but has some drawbacks:

Compiling a project with a lot of query!() macro invocations is painfully slow.

Since each macro invocation involves opening and working with a TCP connection to a database server from within an asynchronous runtime, the query macros execute a lot more code than your typical procedural macro. This is compounded by the fact that procedural macros are exclusively compiled in debug mode which means no optimizations (which async code relies pretty heavily on to be fast).

I have tried playing around with setting rustflags in .cargo/config.toml to force sqlx-macros to compile in release mode, but it didn't really help and, if anything, seemed to actually worsen compile times; the most noticeable effect was that it clobbered all the cached compiler artifacts for the whole dependency tree, requiring a full recompile.

Macro invocations are currently executed serially which further compounds the problem.

Currently as a mitigation we only start a single runtime and then cache it in a lazy_static!(); since proc macros are executed in a separate process that is kept running for the duration of the compiler, this saves some overhead per invocation. I've been meaning to see if caching a Pool of connections saves any overhead as well. I imagine it would, at least somewhat. We could cache analysis results as well, but that would only help for queries that are copy-pasted a lot.
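The single-runtime caching described above can be sketched with std's OnceLock (the names here, including the stand-in Runtime type, are illustrative, not sqlx-macros' actual internals):

```rust
use std::sync::OnceLock;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times the expensive setup actually runs.
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the async runtime cached across macro invocations.
struct Runtime;

static RUNTIME: OnceLock<Runtime> = OnceLock::new();

fn cached_runtime() -> &'static Runtime {
    // The closure runs at most once for the lifetime of the proc-macro
    // process, no matter how many macro invocations call this.
    RUNTIME.get_or_init(|| {
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
        Runtime // in sqlx-macros this would be the actual async runtime
    })
}
```

Since the proc-macro process stays alive for the duration of the compile, every invocation after the first pays only the cost of a lookup, not of starting a runtime.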

Many people find the idea of procedural macros opening arbitrary TCP connections to be highly unpalatable.

Oftentimes when someone comes across SQLx and starts reading about the query macros, their first reaction is something along the lines of, "wow, the compiler actually lets you do that? That's disgusting!"

Fair enough, the current procedural macro environment is extremely laissez-faire and there have been calls to lock down what procedural macros (and build scripts) can actually do, thanks to sensationalized proof-of-concept projects showing how a proc macro executed by a transitive dependency could steal sensitive information like SSH keys and the like.

Personally, I think the threat is kind of overblown, or at least rather moot, considering a malicious crate in your dependency tree could accomplish all kinds of nastiness already, just at runtime instead of compile time. I fear that a big push to sandbox procedural macros and build scripts could lure people into a false sense of security and make them feel like they don't need to audit their dependencies. But, I digress.

The query!() macros don't play that well with IDEs.

This situation has improved drastically compared to when we first introduced SQLx, to the point where IntelliJ-Rust and Rust Analyzer can both provide code completion for sqlx::query!(). However, it can still take some time to get results, in part thanks to point 1 here, and the macros aren't "pure" in the sense that their output could change from run to run based on schema changes, without the compiler or IDEs being able to pick up on it. They should be working from cached output where possible, and be able to invalidate that cache in a way that IDE plugins can learn about.

Offline mode is finicky to use and it's easy to forget to run cargo sqlx prepare before committing.

Offline mode isn't easily discoverable (in part thanks to just how dense the documentation on sqlx::query!() has gotten... we really need to move that stuff to an MDBook or something). It's also easy to forget to run cargo sqlx prepare before committing, which often leads to really confusing CI errors.

In the past, what I've done is add cargo sqlx prepare && git add sqlx-data.json to .git/hooks/pre-commit to make sure I don't forget, but this makes committing take forever thanks to point 1.

The macros don't have a way to ensure that the database is up-to-date with migrations.

This is somewhat on purpose as we don't want the macros automatically executing migrations which may end up breaking the user's local development database and/or clobbering manually generated test data. Still, it does get annoying.

Proposal

Instead of having the macros connect directly to the database server at compile time, they could communicate with a daemon process that the user starts and leaves running while working on their project.

I'm imagining this daemon process as another subcommand of sqlx-cli, perhaps cargo sqlx watch or something like that. It would watch target/sqlx, which the macros would write their query strings to as files. I considered using a crate like interprocess to let the macros and daemon talk directly, but I think communicating via the filesystem is better because that's a medium the compiler natively understands, as I'll elaborate on in a second.
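A minimal sketch of this filesystem-based request/response protocol, assuming a hypothetical layout where the macro writes `<id>.query` and the daemon answers with `<id>.result` (file names and function names are illustrative only, not part of any actual sqlx design):

```rust
use std::fs;
use std::path::{Path, PathBuf};

fn request_path(dir: &Path, id: &str) -> PathBuf {
    dir.join(format!("{id}.query"))
}

fn result_path(dir: &Path, id: &str) -> PathBuf {
    dir.join(format!("{id}.result"))
}

// Macro side: submit the query text for analysis.
fn submit_query(dir: &Path, id: &str, sql: &str) -> std::io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::write(request_path(dir, id), sql)
}

// Daemon side: pick up a pending request and write back a (stubbed) analysis.
fn daemon_tick(dir: &Path, id: &str) -> std::io::Result<()> {
    let sql = fs::read_to_string(request_path(dir, id))?;
    fs::write(result_path(dir, id), format!("analyzed: {sql}"))
}

// Macro side, on a later compile: read the cached analysis if present.
fn poll_result(dir: &Path, id: &str) -> Option<String> {
    fs::read_to_string(result_path(dir, id)).ok()
}
```

Because both sides only touch ordinary files, the compiler can treat the result files as inputs it already knows how to track for rebuilds.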

Since binaries installed with cargo install are built in release mode by default, the latency of the actual request to parse and analyze the query should improve quite a bit.

The macros could submit their request to the daemon and then return some dummy code (or even a temporary error, like Waiting for query analysis. Is `cargo sqlx watch` running?), telling the compiler to watch target/sqlx for the daemon to write the result of processing the query. Then the compiler could move on to the next macro invocation instead of blocking on the completion of the first one. This would let the macros essentially run in parallel, which should significantly reduce overall compile times.

Assuming #570 is implemented as part of this rearchitecting, the daemon could also automatically update entries in .sqlx/ for successfully parsed queries so that cargo sqlx prepare doesn't have to be manually run before committing.

This would also be more amenable to sandboxing, as we would only require write access to target/ and read access to the project; that's nothing more than what build scripts are already expected to do, and any sandboxing scheme that is implemented would be required to support it. Support for Windows named pipes or Unix domain sockets would probably be right out in that case unless we specifically asked for it, and that would be an uphill battle.

The daemon could also periodically re-run the analysis of queries to pick up changes to the database schema.

@mehcode suggested that the daemon could automatically migrate the database if necessary, so no other manual setup would be needed. It could even spin up a temporary database so migrations could be automatically tested and synchronized with the query macros without breaking a local development database.

abonander commented 2 years ago

I'm wondering if there's a way that we can support incremental adoption of cargo sqlx watch into people's workflows. It is yet another process that needs to be running to compile your code, and some might find that annoying regardless of the potential benefits.

Perhaps cargo sqlx watch could create target/sqlx/watch.lock or something as a temporary file and the macros could just check if that file exists or not to know if cargo sqlx watch is running. If the file doesn't exist, they can fall back to their current behavior.
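The fallback check above is simple enough to sketch directly (the lock-file path follows the hypothetical naming from this comment):

```rust
use std::path::Path;

// Hypothetical check: is the `cargo sqlx watch` daemon running?
// If not, the macros would fall back to connecting to the database directly,
// exactly as they do today.
fn watcher_is_running(target_dir: &Path) -> bool {
    target_dir.join("sqlx/watch.lock").exists()
}
```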

jplatte commented 2 years ago

If #1183 could be finished and the workspace situation improved (see discussion in #1223), I think that would solve my own pain points. I'm not convinced a separate daemon would make working with SQLx easier at all.

For context: The database part of the service I work on is updated infrequently enough that an offline-first workflow works well (.env contains SQLX_OFFLINE=true and the macros only connect to the DB as part of cargo sqlx prepare).

abonander commented 2 years ago

@jplatte hmm, then as a compromise, maybe the macros could use the cached data first if it exists and the migrations haven't changed. Then only the modified macro invocation will actually have to talk to the database.

It's certainly worth trying as a first step at least.
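One way to implement "use the cache unless the migrations changed" is to fingerprint the migrations directory and store that fingerprint alongside the cached query metadata. A sketch, with names and hashing scheme purely illustrative (not sqlx's actual layout):

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::path::Path;

// Hash the file names and contents of the migrations directory.
fn migrations_fingerprint(dir: &Path) -> std::io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    let mut entries: Vec<_> = fs::read_dir(dir)?
        .filter_map(Result::ok)
        .map(|e| e.path())
        .collect();
    entries.sort(); // deterministic order regardless of readdir order
    for path in entries {
        path.hash(&mut hasher);
        fs::read(&path)?.hash(&mut hasher);
    }
    Ok(hasher.finish())
}

// Compare against the fingerprint stored with the cached query metadata.
fn cache_is_fresh(dir: &Path, stored: u64) -> std::io::Result<bool> {
    Ok(migrations_fingerprint(dir)? == stored)
}
```

Only when the fingerprint changes (or a query's cache entry is missing) would a macro invocation actually have to talk to the database.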

bennofs commented 2 years ago

I have thought a lot about how to handle these compile-time query-checking features (not in Rust, but in other languages as well). Needing build-time access to a database of the same kind as the production database is almost always awkward and leads to complex build systems. This is why I want to suggest an alternative approach:

So in the end, you generate a check_compat(connection) method that the compiled program can use to check whether the DB satisfies all the things it expects directly during startup and fail fast if it does not. You could also provide a CLI command that does the same. To me, this has the following advantages:

Is there any use case where getting the error directly at application startup (or by running the checker function via the CLI after the build, or even as an automatic post-build step) is significantly worse than getting it during the build? In my experience, the main motivation for compile-time query checking is avoiding situations where you first have to trigger a certain code path to see the error, which leads to long cycle times and is inconvenient. But if you can check all queries directly at application startup or with a CLI command, that solves the same issue without the disadvantages of compile-time network access.
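The assertion-recording idea could be sketched like this. Every type and name below is illustrative (there is no such API in sqlx); the build would record facts the database must satisfy, and check_compat would verify them at startup against schema metadata fetched from the live database:

```rust
use std::collections::HashMap;

#[derive(PartialEq, Debug)]
enum ColumnType {
    Int8,
    Text,
}

// One fact recorded at build time that must hold for a query to be valid.
struct Assertion {
    table: &'static str,
    column: &'static str,
    expected: ColumnType,
}

// Stand-in for schema metadata fetched from the live database at startup.
type LiveSchema = HashMap<(&'static str, &'static str), ColumnType>;

// Fail fast at startup if any recorded assertion does not hold.
fn check_compat(schema: &LiveSchema, assertions: &[Assertion]) -> Result<(), String> {
    for a in assertions {
        match schema.get(&(a.table, a.column)) {
            Some(actual) if *actual == a.expected => {}
            Some(actual) => {
                return Err(format!(
                    "{}.{}: expected {:?}, found {:?}",
                    a.table, a.column, a.expected, actual
                ))
            }
            None => return Err(format!("{}.{} does not exist", a.table, a.column)),
        }
    }
    Ok(())
}
```

As the replies below this comment point out, the hard part is producing these assertions without parsing SQL; the sketch only illustrates the startup-check half of the idea.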

jplatte commented 2 years ago

during compile time, only record assertions that need to hold on the DB for queries to be valid

This is quite simply not feasible. SQLx checks the queries directly against the database (or against cached metadata previously obtained from the database) because the alternative is parsing / interpreting the query string, which could be using many, many different syntax constructs and features depending on the DB and even how it is set up.

As a little thought experiment, consider this simple query: select my_function(). It's super basic, but at the same time it's impossible to know what type it will return for software other than the actual database you would use it with.

pythoneer commented 2 years ago

Not to derail the conversation too badly, but I see a use case in the mechanics @bennofs proposed with check_compat(connection), though I also agree with @jplatte that it's not really the way to go. I just imagined (and it's probably best to create a new issue for this) a way of having both. I'm not aware whether the current CLI tool or any migrate-style macro handles this, but it could be helpful if you could take a sqlx-data.json created in the CI job or on your dev machine to the deployment target, have the CLI tool "compare" it against the live database, and/or have a migrate-like macro check that sqlx-data.json against the live database before starting. This would give you more confidence if you can't build against the live database (which isn't a good idea anyway, I think).

Edit:
maybe just generating a second sqlx-data.json on the deployment machine and diffing it is enough – I don't know how "stable" the sqlx-data.json output is.

jplatte commented 2 years ago

@pythoneer It sounds like you are describing cargo sqlx prepare --check, which is already a thing and useful also as a CI job to make sure any new migrations are accounted for in the code (and the query cache).

bennofs commented 2 years ago

As a little thought experiment, consider this simple query: select my_function(). It's super basic, but at the same time it's impossible to know what type it will return for software other than the actual database you would use it with.

Hmm, couldn't you specify the return type explicitly in the code? Are there many other things apart from return types and argument types that you'd need to know about the query? Ideally you'd be able to record the (parameterized) query along with its return and argument types, and then check against the database that those types are convertible to/from the actual query parameters/return value.

jplatte commented 2 years ago

Hmm, couldn't you specify the return type explicitly in the code?

In theory yes, but that would be a lot of unnecessary extra boilerplate.

Are there many other things apart from return types and argument types that you'd need to know about the query?

Yes, a bunch of things. And you would have to be able to fully parse the query to get that information, which may be reasonable for a simple DB, but SQLx supports multiple very complex databases. Even "just" a self-made parser for PostgreSQL's SQL flavor would be a huge project.

bennofs commented 2 years ago

I definitely agree that parsing queries should be avoided. As far as I can see, with query_as!, the rust types of output/input are known (parameters must be known because they are rust values, and output type is determined by the named struct). But yes, specifying the type for the query! macro could become annoying, although personally I'd take the tradeoff of having to specify types explicitly in my code vs needing to keep files depending on db schema up to date.

Anyway, this is not core to SQLx, so I may just try to implement it myself as a separate crate and see how well it works :)

abonander commented 2 years ago

As far as I can see, with query_as!, the rust types of output/input are known (parameters must be known because they are rust values, and output type is determined by the named struct)

The processing is still driven by the metadata returned from the database. Proc macros can't introspect any code that isn't in their input, so all query_as!() does is take the name of the struct and emit a struct literal for it, the same way query!() does, but without generating an ad-hoc struct first.

It will happily emit fields that aren't in the struct, or omit ones that are in the struct but not in the query, and it's up to the compiler to sanity-check that.
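To illustrate the division of labor described above: the macro only knows the column names/types from database metadata and the struct's name from its input, so morally it emits something like the function body below, and the compiler does the field checking. This is a hand-written approximation, not the macro's real expansion:

```rust
struct User {
    id: i64,
    name: String,
}

// Roughly what `query_as!(User, "select id, name from users")` boils down to
// for one row: a plain struct literal built from the decoded columns.
fn fake_expansion(id: i64, name: String) -> User {
    // If `User` lacked a `name` field, or had an extra field the query
    // doesn't produce, this literal would fail to compile.
    User { id, name }
}
```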

loafofpiecrust commented 2 years ago

At least considering Postgres, couldn't this tool use an exported schema using pg_dump -s database or a custom JSON format that isn't dependent on specific queries? This way, you could run cargo sqlx prepare only once after all migrations are run on the DB, then the file it generates can be used to validate any query. Then I don't have to generate whenever I change queries. Would that be too slow or complex? I like some of these ideas around automating testing for migrations w/o breaking current database, but I'm honestly not sure if I love running another daemon for project development.

abonander commented 2 years ago

@loafofpiecrust have a look at this FAQ answer: https://github.com/launchbadge/sqlx/blob/master/FAQ.md#why-cant-sqlx-just-look-at-my-database-schemamigrations-and-parse-the-sql-itself

It's a bit snarky but the suggestion comes up a lot and so it gets a bit annoying to respond to.

loafofpiecrust commented 2 years ago

Yup @abonander sorry for the noise.

andoriyu commented 2 years ago

Also related, I find using query! cumbersome sometimes:

I stopped using it due to that, but then I added a crate dependency that uses it, and it's even worse - it is misusing features and provides offline support only for PostgreSQL.

I think you're on the right track that a macro opening any kind of network connection is a pretty wild design. Is there any way to do something like:

I'd rather not have a daemon:

tyrelr commented 1 year ago

This is an interesting idea, and is a much bigger idea than my thought of having a sqlx describe command to inspect the results of a query.

I do see significant potential advantages to separating the query analysis from the rust compilation:

The potential disadvantages I see are:

Relying on something resembling 'offline' mode, with an external tool to do the sql query analysis, seems like a better default than the current default.

For how I use sqlx, I would still prefer if cargo run/check were capable of handling everything, even if it was opt-in. But even if that functionality were lost, sqlx would STILL be better than only finding out about a typo after the application tries to run the query.