allow_tables_to_appear_in_same_query! does not scale with larger number of tables

jeff-hiner commented 3 weeks ago

Adding more tables increases the number of generated TableNotEqual trait impls by N squared, since the allow_tables_to_appear_in_same_query macro generates pairwise trait impls. Past about 10-15 tables this N squared codegen has a significant impact on compile time.

Making this more difficult, there's no way to prevent the autogenerated allow_tables_to_appear_in_same_query block except via patch file. We already have an existing patch file to deal with another schema issue, and it doesn't look like there is a way to specify multiple files in diesel.toml.

I'm not sure how to fix this, aside from completely reworking how table checks work. I recognize it's important as part of the type system to not generate queries spanning two disjoint databases, but this is a problem many if not most users won't encounter. Perhaps there's a more scalable way to do the same check with an O(1) trait generation like BelongsToDatabase<T> and some clever bounds checking.

Setup

Versions

Rust: 1.82.0
Diesel: 2.2.3
Database: Postgres
Operating System: Ubuntu 22 LTS

Problem Description

We started noticing this more as we've added more tables, and it got significantly worse after changes to the coherence checker in Rust 1.82.

Compounds with https://github.com/rust-lang/rust/issues/132064

Steps to reproduce

Create a database with a large number of tables, via diesel cli and the normal migrations. In our case we have about 70 tables in our database. The

Generate a library crate with associated autogenerated schema.rs, import it and time the build.

Nightly isn't required to exhibit the problem, but RUSTFLAGS=-Ztime-passes cargo +nightly check shows the majority of time being spent in the coherence checker. This is consistent with a very large number of trait impls (by my estimate, 70 tables with the Postgres backend will generate approximately 50000 trait impls!)

Checklist

[X] I have already looked over the issue tracker and the discussion forum for similar possible closed issues.
[X] This issue can be reproduced on Rust's stable channel. (Your issue will be closed if this is not the case)
[X] This issue can be reproduced without requiring a third party crate

weiznich commented 3 weeks ago

Thanks for opening this issue.

I fear there is nothing much we can do here as that's a fundamental check required for how diesel checks if your queries are valid. At least I'm not aware of a better way to model this in the rust type system. Technically speaking we could possibly offer an option to not generate this macro call, which would have the downside that you would need to write these call for all the relevant tables on your own, which quickly can become complicated and will result in confusing error messages if you miss that.

As for the rustc compile time regression: That's something you should report to the rustc developers as that's their regression. They are usually quite sensitive to such reports, so if you show up with a reproducible report that something got significantly worse I'm quite sure that there will be a response there.

jeff-hiner commented 3 weeks ago

It looks like they're already working on a fix for the regression here: https://github.com/rust-lang/rust/pull/132075

To make sure I understand, the check is there to make sure the queries are against tables/aliases within the same database, right?

weiznich commented 2 weeks ago

To make sure I understand, the check is there to make sure the queries are against tables/aliases within the same database, right?

That's not correct. This trait is used to count how often a table appears in the from clause of your query. That in turn is used to check that all the relevant tables are there and that each table appears only once in a not-aliased variant.

jeff-hiner commented 2 weeks ago

I see. I've been looking into alternatives, because even when the upstream issue is addressed we're throwing a LOT of individually generated traits at the compiler.

If we had some sort of internal trait that exposed a const fn identifier_fn() -> usize it should be possible to insert something like:

const _: () = assert_ne!(T1::identifier_fn(), T2::identifier_fn(), "cannot use {} more than once in this context", std::any::type_name::<T1>());

and it should allow compile-time assertions that T1 != T2. Then it's just a matter of setting up diesel::table! to generate that identifier fn somehow.

I also looked at using something like typewit and TypeNe instead. That seems to allow you to specify a const fn via TypeFn for distinguishing whether two table types T1 and T2 are distinct instead of having to implement each database's tables pairwise, but it's unclear whether it requires the same sort of pairwise trait explosion.

weiznich commented 2 weeks ago

If we had some sort of internal trait that exposed a const fn identifier_fn() -> usize it should be possible to insert something like:

I fear that won't work as you currently cannot declare const functions in traits in stable rust. Even if that's somehow possible you then cannot use the proposed check in the relevant functions as rust disallows using generic parameters from outer items there: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=57f94e490577068c5e7f49452718ec11

I also looked at using something like typewit and TypeNe instead. That seems to allow you to specify a const fn via TypeFn for distinguishing whether two table types T1 and T2 are distinct instead of having to implement each database's tables pairwise, but it's unclear whether it requires the same sort of pairwise trait explosion.

I'm open to accept specific implementations, as long as they don't break the existing public API and can demonstrate that they improve the compile time for something. I likely won't have the capacity to spend much time on that myself.

jeff-hiner commented 2 weeks ago

You're right that you can't declare a const fn, but you can do something like this in a trait as long as you have some statically accessible const context accessible for the check.

const _CONSTRAINT: () = assert!(
    !are_equal(T1::SQL_NAME, T2::SQL_NAME),
    "cannot use identifier more than once in this context",
);

While playing around I'm using the table name here (#sql_name in the table! context) and just added a const SQL_NAME: &'static str; to Table. I had to write a const fn are_equal because it's not yet possible to compare two static str directly. This is pretty dumb, but appears to work:

/// We need this because equality is only implemented in Rust for primitive types.
const fn are_equal(a: &'static str, b: &'static str) -> bool {
    let a = a.as_bytes();
    let b = b.as_bytes();

    matches!(a, b)
}

If you want to do this sort of test within the trait impl, you simply need to make sure there's a const available to override within the impl.

pub trait Foo {
    // ...
    const _CONSTRAINT: () = ();
}

So in this case you would override the const in the impl wherever you want to perform a check. If you try to instantiate the generic impl within a context where the check would fail, the assertion fails and you get a compiler error with the message. Implementing the trait without specifying a check is also fine, it'll just do no checks.

The main issue I'm running into right now is conflicting impl definitions after removing the : TableNotEqual trait bounds. It looks like this is because the compiler does not allow implementing multiple GAT variants if they aren't distinguishable.

The end solution may require removing the Peano numbers entirely, but if we can do these negative checks at each I think you'll find they actually aren't needed. In short, if a table doesn't appear we simply don't care-- no need to generate any trait. If it does appear as the first FROM element (by itself), we can generate AppearsInFromClause<T1> for that expression. It can't conflict with anything, so no constraint check is required. When we specify a generic impl for something that can append to the FROM clause we specify its _CONSTRAINT to assert that the table's SQL_NAME does not already appear in what we're joining to.

For composite cases this obviously requires us to be more clever with the SQL_NAME field above. Each trait QuerySource impl would need a collection of all identifiers appearing in the FROM clause (const SQL_NAMES: &'static[&'static str]; for the sake of argument), and appending would check the new table's identifier against the entire collection. Then the newly generated trait impl is generated with its own SQL_NAMES, a concatenation of the existing with the new one.

The best thing about this approach is it would avoid the cost of generating a bunch of traits for non-behavior. We wouldn't need a trait for "this table might appear in this query but hasn't yet".

weiznich commented 1 week ago

I must admit that I still do not see how you can go from this:

const _CONSTRAINT: () = assert!(
   !are_equal(T1::SQL_NAME, T2::SQL_NAME),
   "cannot use identifier more than once in this context",
);
While playing around I'm using the table name here (#sql_name in the table! context) and just added a const SQL_NAME: &'static str; to Table. I had to write a const fn are_equal because it's not yet possible to compare two static str directly. This is pretty dumb, but appears to work:
/// We need this because equality is only implemented in Rust for primitive types.
const fn are_equal(a: &'static str, b: &'static str) -> bool {
   let a = a.as_bytes();
   let b = b.as_bytes();

   matches!(a, b)
}

to something that actually uses these constants here:

https://github.com/diesel-rs/diesel/blob/2090344bf4b7259a262ea80311ba2c33b7e66c12/diesel/src/query_source/mod.rs#L154-L160

or in any other place this check needs to happen. The main problem there is that T1 and T2 are generics on the outer item (function/traits/…) and they cannot be used on the inner item (const in this case). See this playground again for the conceptual problem.

I think the other issues around conflicting impls might actually be solvable as we already have certain patterns to workaround that (see the second generic paramater of QueryFragment for details).

jeff-hiner commented 1 week ago

Alright, so assume for the moment that we're writing some code that has to perform an inner join across two tables (via a known foreign key) and then SELECT columns from both tables. Let's call those concrete tables user and address for lack of a better idea. Right now something forces those tables to impl TableNotEqual<T1> for T2 {} because removing the individual concrete impls and trying to build the above query will fail to compile. Specifically, the compiler goes hunting for impl TableNotEqual<User> for Address. The impl is empty, so there's no compilation.

I'm still honestly trying to wrap my head around why we need Peano numbers here, and how the traits all interlock. I was trying to remove the table equals constraint and rebuild it from scratch, but I kept running into trait issues or running out of stack.

Can you help me understand why we need to explicitly keep track of how many times a table appears? It still seems like we should be able to perform the uniqueness check at each point where a table is added to an existing query string and get rid of Peano counting altogether. It appears now we have a where bound on a trait that means exists zero times in this query object. Instead it would be a lot simpler to just say that AppearsInFromClause means exactly what it says, and it being implemented would conceptually mean what Count = Once means now. We can still check for duplicates, but the failure would happen when trying to use/generate a AppearsInFromClause impl on invalid chained types. It might even be the case that we don't need AppearsInFromClause at all, and should add the constraint somewhere like impl SelectableExpression.

It's a bit of a mental stretch, and please forgive me being loose with the syntax. But imagine if you have an existing user::table.inner_join(address::table) clause. Its output type is REALLY long. I get something like this:

query_builder::select_statement::SelectStatement<query_builder::from_clause::FromClause<query_source::joins::JoinOn<query_source::joins::Join<pg::metadata_lookup::user::table, pg::metadata_lookup::address::table, query_source::joins::Inner>, expression::grouped::Grouped<expression::operators::Eq<expression::nullable::Nullable<pg::metadata_lookup::user::columns::id>, expression::nullable::Nullable<pg::metadata_lookup::address::columns::id>>>>>>

so let's just refer to that type as InnerJoinUserAddress for now. It has const InnerJoinUserAddress::SQL_NAME: [&'str;2] = ["user", "address"]; constructed via the generic type.

InnerJoinUserAddress should implement a trait that says each table appears in the FROM clause. So impl AppearsInFromClause is valid for both tables, so that you can SELECT or add conditionals. If it's a generic impl, the trait body for each AppearsInFromClause isn't compiled until something demands it. Let's assume we're clever enough to do that impl in a generic, and we don't need to emit individual impls via a macro.

Now try to .inner_join(address::table) again and force the AppearsInFromClause impl generation by calling .first(&conn) on it. Unaliased, it must fail to compile the required trait or one of its dependencies. Maybe this failure would happen when trying to instantiate a const inside the generic that provides impl AppearsInFromClause<Address> for InnerJoin<InnerJoinUserAddress, Address>. Or maybe it's in the body of struct Innerjoin itself. The check simply has to iterate over the SQL_NAME present in InnerJoinUserAddress, and compare it with the SQL_NAME = "address" from the joined table. There's a match, so const panic and compile error.

weiznich commented 1 week ago

I'm still honestly trying to wrap my head around why we need Peano numbers here, and how the traits all interlock. I was trying to remove the table equals constraint and rebuild it from scratch, but I kept running into trait issues or running out of stack.

The Peano numbers are used to count how often a table appears in a from clause. That boils down to the following three variants in the end:

None
Once
MoreThanOnce

Only the Once variant is valid as a table can only appear exactly once in the from clause. If if never appears, but is referenced from somewhere the query is invalid. The same is true if it appears more than once in an unaliased form as it's then ambiguous to which instance of the table a certain column corresponds. We must currently count as we otherwise do not have a good way to distiguish between these variants.

InnerJoinUserAddress should implement a trait that says each table appears in the FROM clause. So impl AppearsInFromClause is valid for both tables, so that you can SELECT or add conditionals. If it's a generic impl, the trait body for each AppearsInFromClause isn't compiled until something demands it. Let's assume we're clever enough to do that impl in a generic, and we don't need to emit individual impls via a macro.

Now try to .inner_join(address::table) again and force the AppearsInFromClause impl generation by calling .first(&conn) on it. Unaliased, it must fail to compile the required trait or one of its dependencies. Maybe this failure would happen when trying to instantiate a const inside the generic that provides impl AppearsInFromClause
for InnerJoin<InnerJoinUserAddress, Address>. Or maybe it's in the body of struct Innerjoin itself. The check simply has to iterate over the SQL_NAME present in InnerJoinUserAddress, and compare it with the SQL_NAME = "address" from the joined table. There's a match, so const panic and compile error.

This is all correct and fine, but I still fail to see how you would be able to write that const match + panic in a generic trait implementation by referring to the outer generic parameters. That's the issue I would like to see solved first before talking about any other details, because if we cannot solve that this approach just won't work in the end.

jeff-hiner commented 1 week ago

I'm saying avoid the inner trait entirely. Don't track the "Never" or "MoreThanOnce" cases at all. If it's "Never" just don't generate a trait. If it would be "MoreThanOnce" then const panic instead.

weiznich commented 5 days ago

That's all fine and good but again I still cannot see how you would implement that const panic thing. At some point you must use generics from some outer trait for this and that's something that's not supported by rust. Before discussing all these details I would really like to see an solution to that problem first, because otherwise all those details do not matter because it's impossible to implement.

diesel-rs / diesel