launchbadge / sqlx

🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, and SQLite.
Apache License 2.0
13.54k stars 1.29k forks source link

Question: best way to use sqlx with connections to two different databases? #121

Open sbditto85 opened 4 years ago

sbditto85 commented 4 years ago

I have a program with two databases. One MySQL and one Postgres where I am syncing data from one to the other. Is that possible with sqlx? What is the best way to do that since the query macro has a hardcoded DATABASE_URL env var to determine the database information?

mehcode commented 4 years ago

We don't currently have a solution to this for the query macros beyond recommending to use multiple crates so the compilation units would be different.


I know @abonander has some ideas how to make this nicer so I'm marking this as an enhancement.

PoiScript commented 4 years ago

A workaround is to use cargo workspace: create one crate for MySql, one for Postgres, and third that imports them. So you can have different DATABASE_URLs for each one.

mehcode commented 4 years ago

Copying over my thoughts from #224 as I realized we didn't put these up here.


We were thinking a decent approach would be to generate a set of macros with a higher level macro.. e.g.,

mod one { sqlx::database!(env!("ONE_DATABASE_URL")); }
mod two { sqlx::database!(env!("TWO_DATABASE_URL")); }

one::query!("...") // mysql://...
two::query!("...") // postgres://...
PoiScript commented 4 years ago

@mehcode To be honest, I don't like the idea of using macro here. Because it seems to increase both compile time and build size.

Besides, I think a flexible and extensible approach is using SQL comment. e.g.

sqlx::query!(
    r#"
        -- DATABASE_URL=postgres://...
        select * from foo
    "#
)

will specifies the DATABASE_URL, or

sqlx::query!(
    r#"
        -- DATABASE=bar
        select * from foo
    "#
)

will tells sqlx to use DATABASE_URL_bar instead.

jsdw commented 4 years ago

I ran into this issue as well as I wanted separate postgres databases for a microserverish approach (I've worked around it for now using schemas).

Using a comment, apart from being pretty verbose, feels too magical to me. Being able to customise or generate the query! and friends macros feels more readable to me (plus it might be nice to also be able to pull the URL from sources other than an env var?).

sbditto85 commented 4 years ago

My 0.02. I'd rather just have something like sqlx::query!([sql string], [optional db url or env var], [params]). If you leave it off then it uses the current behavior of DATABASE_URL env var, but if you supply the optional env var it will use that instead or you could just put a const string there and no env vars would be used.

so perhaps:

sqlx::query!("select * from table where id=?", env=OTHER_DB, 1); where 1 is the parameter but the env= dictates its not part of the params.

sqlx::query!("select * from table where id=?", url="mysql://blah:blah@blah/blah", 1); again with 1 as the parameter and url= is some const string known at compile time. Hopefully this would allow us to use a const expression to generate the url if needed.

Not sure if that would cause the macro to be too complicated, but something like that would be ideal for me. I can easily see that the query is being directed from a place different then the default fairly easily and hopefully doesn't add too much computational size or overhead.

abonander commented 4 years ago

The idea with the sqlx::database!() macro is that it wouldn't be another proc macro, it's just a macro_rules! macro that simply shadows the facade macros in src/macros.rs with ones that tell the proc macro internally which environment variable to look at, so it's actually somewhat of a hybrid approach.

Have a look back at @mehcode's example usage; the idea is that by invoking the macro in a submodule you can namespace the query macros so you know at a glance which DB it's talking to:

mod one { sqlx::database!(env!("ONE_DATABASE_URL")); }
mod two { sqlx::database!(env!("TWO_DATABASE_URL")); }

one::query!("...") // mysql://...
two::query!("...") // postgres://...

It doesn't make sense to me to hardcode the database URL at the use site; you'd have to remember to include the URL or the env var every time you write another query!() invocation, which seems extremely cumbersome.

In fact, I'd like to refactor the proc macro itself so that we don't have 4 separate proc macro definitions which all do about the same thing. Instead, I'm thinking the facade will pass in key-value pairs that dictate its behavior.

mehcode commented 4 years ago

Another thought is the database macro could configure all sorts of knobs. One in particular could be adding or overriding user-defined types.

abonander commented 4 years ago

Like setting tinyint in MySQL to always be bool, maybe.

It should probably be called config!().

PoiScript commented 4 years ago

I suggest using sql comment simply because it's extendable. It's is definitely better If we can have a config!() macro with optional named parameters like:

config!(
  DATABASE_URL = "...",
  // ... more
)
abonander commented 4 years ago

The comment would be significantly more complex to implement because we would have to parse the SQL in a comment-aware manner, which probably means rolling our own parser. We currently treat the string as opaque and I'd like to keep it that way unless we have a really good reason.

abonander commented 4 years ago

If we did have a compelling reason to implement per-invocation config, I think I'd prefer it as attributes inside the invocation, which we can parse with syn:

sqlx:;query!(
    #[db(env = "OTHER_ENV_VAR")]
    "select ..."
);

However, I think the config!() macro is strictly more useful because you don't have to repeat your customization options every time you call query!(). I don't see a reason why not to support both except the increased complexity involved (which is already a problem for the query!() macros).

sbditto85 commented 4 years ago

I can see the benefits of a config macro in a module and then using that module to indicate which config you were trying to use. I just want it to be obvious and hopefully hard to screw up which database you are using for the query.

I am waaayyy to good and making those mistakes :p

abonander commented 4 years ago

I'm feeling like config!() is perhaps a bit vague; if it's not going to be used very much maybe we should give it a very clear name like shadow_macros!() or something.

vkill commented 4 years ago

I think to use multiple crates is better.

mehcode commented 4 years ago

After thinking on this more, I'm not saying I don't want to do something with a meta-macro but multiple crates definitely feels like the cleanest solution to me. With #267, separate .env files can be setup to compile each crate in the workspace under a different DATABASE_URL.

qiuzhanghua commented 4 years ago

Suppose we have an application to transfer data from PostgreSQL to MySQL, using multiple crates isn't helpful, especial when I use query! to compile or run. I prefer

mod one { sqlx::database!(env!("ONE_DATABASE_URL")); }
mod two { sqlx::database!(env!("TWO_DATABASE_URL")); }

one::query!("...") // mysql://...
two::query!("...") // postgres://...
mehcode commented 4 years ago

The idea with multiple crates is you'd have the query methods inside each crate. So if you want to query both at the same time that'd be a minimum of three crates. Yeah, I can see how that's not ideal but it is clean.

The macros are going through an overhaul with the offline stuff but afterwords I'm sure we can revisit this. I think the only issue here is what to call sqlx::database! and how much power it should have.

abonander commented 4 years ago

I'm taking this feature into account with the offline refactor, this will probably be the next thing I tackle.

abonander commented 4 years ago

Ideas for names (I don't mind a longer, more descriptive name as this should be used at most a handful of times in a given project):

Victov commented 4 years ago

I'm running into the similar issues here. My application is essentially a bridge between not 2 but many more databases. Has any progress been made on this since may? I understand the multiple crates solution is available, but that is a little hard to scale and maintain for a use case with many different database connections.

I understand that multiple connections can be supported if you don't use compile-time checked queries, but unfortunately the sqlx-cli tool (which I really love) also depends on the hardcoded DATABASE_URL.

john-consumable commented 3 years ago

I made a patch to allow specifying which env var is used in query_as. Please feel free to use it while progress is made on something more official.

use like so: query_as!("MY_OTHER_DB_URL", <my query>, [vars ])

diff --git a/sqlx-macros/src/query/input.rs b/sqlx-macros/src/query/input.rs
index ddcd6d4a..138f3e75 100644
--- a/sqlx-macros/src/query/input.rs
+++ b/sqlx-macros/src/query/input.rs
@@ -8,6 +8,8 @@ use syn::{ExprArray, Type};

 /// Macro input shared by `query!()` and `query_file!()`
 pub struct QueryMacroInput {
+    pub(super) env_var: Option<String>,
+
     pub(super) src: String,

     #[cfg_attr(not(feature = "offline"), allow(dead_code))]
@@ -36,6 +38,7 @@ impl Parse for QueryMacroInput {
         let mut args: Option<Vec<Expr>> = None;
         let mut record_type = RecordType::Generated;
         let mut checked = true;
+        let mut env_var = None;

         let mut expect_comma = false;

@@ -48,7 +51,9 @@ impl Parse for QueryMacroInput {

             let _ = input.parse::<syn::token::Eq>()?;

-            if key == "source" {
+            if key == "env_var" {
+                env_var = Some(input.parse::<LitStr>()?.value());
+            } else if key == "source" {
                 let span = input.span();
                 let query_str = Punctuated::<LitStr, Token![+]>::parse_separated_nonempty(input)?
                     .iter()
@@ -80,6 +85,7 @@ impl Parse for QueryMacroInput {
         let arg_exprs = args.unwrap_or_default();

         Ok(QueryMacroInput {
+            env_var,
             src: src.resolve(src_span)?,
             src_span,
             record_type,
diff --git a/sqlx-macros/src/query/mod.rs b/sqlx-macros/src/query/mod.rs
index 3ccab313..1ddfd24a 100644
--- a/sqlx-macros/src/query/mod.rs
+++ b/sqlx-macros/src/query/mod.rs
@@ -34,27 +34,33 @@ pub fn expand_input(input: QueryMacroInput) -> crate::Result<TokenStream> {
             .map_err(|e| format!("failed to load environment from {:?}, {}", env_path, e))?
     }

+    let db_var = input
+        .env_var
+        .as_ref()
+        .map_or_else(|| "DATABASE_URL".to_string(), |s| s.to_string());
+
     // if `dotenv` wasn't initialized by the above we make sure to do it here
     match (
         dotenv::var("SQLX_OFFLINE")
             .map(|s| s.to_lowercase() == "true")
             .unwrap_or(false),
-        dotenv::var("DATABASE_URL"),
+        dotenv::var(&db_var),
     ) {
         (false, Ok(db_url)) => expand_from_db(input, &db_url),

         #[cfg(feature = "offline")]
-        _ => {
+        (_, db_url) => {
             let data_file_path = std::path::Path::new(&manifest_dir).join("sqlx-data.json");

             if data_file_path.exists() {
                 expand_from_file(input, data_file_path)
             } else {
-                Err(
-                    "`DATABASE_URL` must be set, or `cargo sqlx prepare` must have been run \
-                     and sqlx-data.json must exist, to use query macros"
-                        .into(),
+                Err(format!(
+                    "`{}` must be set, or `cargo sqlx prepare` must have been run \
+                         and sqlx-data.json must exist, to use query macros",
+                    &db_var
                 )
+                .into())
             }
         }

@@ -64,7 +70,7 @@ pub fn expand_input(input: QueryMacroInput) -> crate::Result<TokenStream> {
         }

         #[cfg(not(feature = "offline"))]
-        (false, Err(_)) => Err("`DATABASE_URL` must be set to use query macros".into()),
+        (false, Err(_)) => Err(format!("`{}` must be set to use query macros", &db_var,).into()),
     }
 }

diff --git a/src/macros.rs b/src/macros.rs
index 876e8fcb..3262c641 100644
--- a/src/macros.rs
+++ b/src/macros.rs
@@ -512,6 +512,12 @@ macro_rules! query_file_unchecked (
 #[macro_export]
 #[cfg_attr(docsrs, doc(cfg(feature = "macros")))]
 macro_rules! query_as (
+    ($var:expr, $out_struct:path, $query:expr) => ( {
+        $crate::sqlx_macros::expand_query!(env_var = $var, record = $out_struct, source = $query)
+    });
+    ($var:expr, $out_struct:path, $query:expr, $($args:tt)*) => ( {
+        $crate::sqlx_macros::expand_query!(env_var = $var, record = $out_struct, source = $query, args = [$($args)*])
+    });
     ($out_struct:path, $query:expr) => ( {
         $crate::sqlx_macros::expand_query!(record = $out_struct, source = $query)
     });
lweberk commented 3 years ago

In the case of the workaround by splitting into independent workspace crates, each with its .env; How do you handle the case for offline building, for example in CI/CD?.

I've so far tried to generate the sqlx-data.json in each workspace independently, to no avail. It picks up the DATABASE_URL env var from one of the crates and checks some different one with it, failing while checking (mix of pgsql and mysql in my case). The "--merge" option does not change any of this, the problem being the environment variables handling.

Suggestions?

Opened #1223

failable commented 1 year ago

What's the best solution to this in 2023?

For example, I'd like to query History.db, CloudTabs.db, ... in the same Safari struct.

maackle commented 1 year ago

Just chiming in that I'd also like to see better multi-db support. Our project uses many SQLite databases, all created at runtime, so one-crate-per-DB is not even possible. We only have 7 or so distinct schemas, so we could conceivably have separate crates for each of those, with an empty db for each just to have the right schema for compiletime verification, but even so that feels heavy handed, and we also can't make use of migrations against the actual databases (I haven't explored sqlx migrations enough to know if it could even support migrations for runtime-created dbs though).

samuela commented 10 months ago

Is it possible set DATABASE_URL in build.rs? I'm in a situation where I'd like to auto-generate a sqlite database from a schema.sql file at development time, and then use the real database at run time.

EDIT: yes, this build.rs works:

fn main() {
  // Create a new temporary file. We need to `.keep()` it so that it isn't
  // deleted when it goes out of scope at the end of this function.
  let (_, tempfile) = tempfile::NamedTempFile::new().unwrap().keep().unwrap();
  let database_path = tempfile.as_path().to_str().unwrap();

  let schema_file = "../schema.sql";

  // Ensure that the build script is rerun if schema.sql changes
  println!("cargo:rerun-if-changed={schema_file}");

  // Execute SQL commands from schema.sql on the temporary database
  rusqlite::Connection::open(database_path)
    .unwrap()
    .execute_batch(&std::fs::read_to_string(schema_file).unwrap())
    .unwrap();

  // Prepare the DATABASE_URL in the format that sqlx expects for SQLite
  println!("cargo:rustc-env=DATABASE_URL=sqlite:{database_path}");
}
dave42w commented 9 months ago

I'm taking this feature into account with the offline refactor, this will probably be the next thing I tackle.

Obviously, life changes so this didn't happen yet.

However, this would be really helpful for me (as long as #[sqlx::test] works with it).

Victor-N-Suadicani commented 9 months ago

Is it not possible to just make it so you could run cargo sqlx prepare --database-url ... multiple times, one time for each database? First time you run it, half the queries fail and the other half is saved with their query data. Second time you run it, the other half of the queries fail but the other half is saved with the query data. And then you have both sets of query data saved and using the offline check should work?

Or perhaps you allow specifying multiple database-urls and then it tries each database in succession for each query until one succeeds, then uses that to save the query data.

That solution wouldn't need any modification of the current macros, I believe.

jplatte commented 9 months ago

@Victor-N-Suadicani I think the main problem with that is: when / how are stale files in .sqlx cleaned up?

Victor-N-Suadicani commented 9 months ago

Wouldn't specifying multiple database-urls in a single cargo sqlx prepare call as I suggsted above fix that? Then you can still remove any existing files before replacing the query JSONs. But I am just guessing at how it works here.

loganbnielsen commented 6 months ago

commenting on this because it's an issue for me as well

Snapstromegon commented 5 months ago

To add a new idea that I haven't seen in this thread before:

Maybe we could use DATABASE_URL by default, but also support DATABASE_URL_POSTGRES, DATABASE_URL_SQLITE and so on to define a more specific URL per database type.

That way it would be easy to support multiple types of DBs at once (like this issue originally targeted) without introducing any change to the lib consumers and it would be very consistent to use.

Only downside is, that this wouldn't support multiple DBs of the same type (e.g. two postgres DBs).

The two DB types is a very common issue for me, since I'd like to provide some of my software with support for multiple DBs and also use compile time checked queries and right now that's not easy to do.

Snapstromegon commented 5 months ago

As you might have seen, I started an implementation in #3278.

Sadly I'm kind of stuck there, because the sqlx-macros-core::query::expand_input() function has no information about the resulting DB type and therefore it's (as far as I can tell) not possible to select a driver for the query.

If the query family of macros had an additional parameter to determine the type of query (could be optional for cases where multiple DB types are used), it would be easy to resolve this, but that would mean changing the API (although it wouldn't really be breaking, if it's optional).

I implemented this by passing in the Database::NAME into the macro as a String. If someone who has more experience working with macros could take a look at this how one could use the type of the DB (so either the Database type or the pool) instead, I would appreciate that.

esmevane commented 5 months ago

@Snapstromegon Oh neat! That's awesome. I've got a question or two, but I'll leave it on your PR :)

As for macros, when I was starting my macro journey, this article was tremendous for me. I'm not sure where you are on your journey but the testing parts specifically helped me.

Snapstromegon commented 5 months ago

Thanks @esmevane, I'll give it a read. Right now I'm working through Jon Gjengset's Crust of Rust streams around macros and that's ~10h of videos to watch.

jgraef commented 4 months ago

For my usecase I want to have multiple crates using different (sqlite) databases in a single workspace.

Using multiple .env files didn't work well for me. The database URL needed to be relative to the workspace root, regardless of where the .env file was, which works, but is a bit awkward, especially if you also use the sqlx-cli. Also rust-analyzer (in vscodium) didn't like it too much.

I had the idea to set the DATABASE_URL environment variable on a per-crate basis with a build-script:

// my-workspace/some-crate/build.rs
fn main() {
    println!(
        "cargo::rustc-env=DATABASE_URL=sqlite:{}/schema.db",
        std::env::var("CARGO_MANIFEST_DIR").unwrap(),
    );
}

This way I can be sure every crate always uses the right database file.

Edit: Improved build script that runs sqlx migrate run ... to create a schema.db in /target when migrations folder changes:

build.rs ```rust use std::{ path::PathBuf, process::Command, }; fn main() { let crate_dir = PathBuf::from(std::env::var("CARGO_MANIFEST_DIR").unwrap()); let out_dir = PathBuf::from(std::env::var("OUT_DIR").unwrap()); let migrations_dir = crate_dir.join("migrations"); let db_file = out_dir.join("schema.db"); let db_url = format!("sqlite:{}", db_file.display()); println!("cargo::rerun-if-changed={}", migrations_dir.display()); if db_file.exists() { std::fs::remove_file(&db_file).unwrap(); } std::fs::write(&db_file, b"").unwrap(); let exit_status = Command::new("sqlx") .arg("migrate") .arg("run") .arg("--source") .arg(&migrations_dir) .arg("--database-url") .arg(&db_url) .spawn() .unwrap() .wait() .unwrap(); if !exit_status.success() { panic!("sqlx failed: {exit_status}"); } println!("cargo::rustc-env=DATABASE_URL={}", db_url,); } ```
AustinCase commented 3 months ago

So excited for that PR @aderinom !! We were just about to invest in writing that ourselves. Much appreciated!

abonander commented 2 months ago

Our intended solution for this (and many different use-cases besides) is to allow multiple crates in the same workspace to rename their DATABASE_URL var using sqlx.toml: #3383

I'm working on it as fast as I can, and hope to have it ready for a short alpha cycle for 0.9.0 for people to try it out.

lednhatkhanh commented 3 weeks ago

It's so good to see it's happening!

liamwh commented 2 weeks ago

Would love to see this asap also! Is there any chance of an ETA @abonander?