cornucopia-rs / cornucopia

Generate type-checked Rust from your PostgreSQL.
Other
835 stars 38 forks source link

Work with both sync and async #176

Closed Virgiel closed 1 year ago

Virgiel commented 1 year ago

Currently, we generate either async or sync code. This is problematic if someone wants to use both in their project because types are declared twice and therefore are incompatible.

I propose a new codegen module architecture to support both in a single file:

pub mod types {
    // Common database declared types (Enum & composite)
    pub mod public { /* ... */ }
}
pub mod queries {
    // For each query file
    pub mod copy {
        // Common queries types (Row, Params & Borrowed)
        pub mod types { /* ... */}
        // Generated type for synchronous usage (Stmt & Query)
        pub mod sync { /* ... */ }
        // Generated type for asynchronous usage (Stmt & Query)
        pub mod tokio { /* ... */ }
    }
}
dbofmmbt commented 1 year ago

I think you can work around it by generating code twice for two different destinations, e.g.

Actually, I think this may be a good solution for this use case. What do you think?

Virgiel commented 1 year ago

The problem with this solution is that the database and row types are generated in both src/cornucopia_sync.rs and src/cornucopia_async.rs. Even though they have the same name and fields, from the point of view of the Rust type system they come from different modules and are therefore incompatible, which I would like to fix

LouisGariepy commented 1 year ago

@Virgiel I'm not sure I understand what kind of project would need to have both sync and async for all queries, with compatible types between them.

I'm not opposed, but I'd like to understand the motivation behind it.

jacobsvante commented 1 year ago

Could be a CLI in an app that doesn't require async (and would be more cleanly written without). Then the main part of the app requires async (e.g. web server).

LouisGariepy commented 1 year ago

@jacobsvante I still don't see why you would need both sync and async versions of all your queries, with common reused types. In your example, you could just generate some sync queries for the CLI, and then some async ones for the rest of the app (which is already possible).

What this issue is suggesting is to reuse the common database, row and parameter types between the sync and async versions. This is what puzzles me: why would you need to use both versions (sync and async) of the same query in the same app in a way that it matters that the type are compatible.

I'm just not convinced that the added complexity is worth it since its unclear to me what the benefits are.


I'll add some more thoughts to start some talking points

Performance

Regarding performance, our benchmarks don't take into account network latency. In "real world" usecases, its very possible that async outperforms sync on all queries, even small trivial ones because of the added concurrency (you CPU can do other stuff while it waits for the DB response to cross the network). For small scale, low-throughput applications, the performance benefits of sync are unlikely to matter in the first place. In other words, I doubt we should base this decision on performance.

Usability

Regarding UX and general usability, I agree that some applications are simpler without async, and sometimes async just doesn't make sense anyway. That's why we added sync support in the first place.

In the case where the same application has some async parts, and some sync parts. You can already solve this by, for example, separating your queries in a queries_sync and queries_async folders and generating to codegen files, one for your sync queries and one for your async queries.

So my question is: what usecase is not currently possible that this issue would enable.

jacobsvante commented 1 year ago

Thanks for your thorough reply, @LouisGariepy.

I personally don't have a use case for sharing queries/types etc currently, but I can see how the synchronous CLI might want to use the same query as the async web server app (e.g. fetch all products matching the given criteria).

Virgiel commented 1 year ago

I have a web backend where I have asynchronous endpoints and synchronous workers. I prefer to use synchronous code where I can as it is way easier to use correctly when thing get complicated.

I use some queries only in sync, some only in async and other in both. We could declare synchronicity per query or per module, but generating async and sync variant for every query is the simplest solution that work for all cases, without changing our current syntax.

Currently, because we generate the same type in a sync file and an async file, I cannot reuse certain functions between endpoints and workers. Even if the generated types are exactly the same logically, they are different from the point of view of the rust type system because they come from different modules.

I am currently revising my proposal to make it simpler and compatible with current code.

When we generate only sync or only async :

pub mod types {
    // Common database declared types (Enum & composite)
}
pub mod queries {
    pub mod module {
        // Common queries types (Row, Params & Borrowed)
        // Generated type for synchronous or asynchronous usage (Stmt & Query)
    }
}

When we generate both:

pub mod types {
    // Common database declared types (Enum & composite)
}
pub mod queries {
    pub mod module {
        // Common queries types (Row, Params & Borrowed)
        // Generated type for synchronous usage (Stmt & Query)
        pub mod sync { /* ... */ }
        // Generated type for asynchronous usage (Stmt & Query)
        pub mod tokio { /* ... */ }
    }
}
LouisGariepy commented 1 year ago

If there are only a few of such queries, then it should be fairly straightforward to implement From conversions, which would likely get optimized away.

But, if you use this pattern very extensively, or if your queries involve a lot of different types, I can see how this becomes unwieldy.

Now that I understand the reasoning (worker threads + web server), I can support the proposal. We should be very careful to make the implementation neat and tidy though, to not bloat the existing code.

I think your amendment made this a non-breaking change? So I'll remove that tag. And I'm confident we can work the PR into a merge-able state.

See you in the PR :smile:.