Efficient generation of SQL queries

bostontrader commented 4 years ago

Hello folks,

I would like to start a discussion about a general overview about how to connect GraphQL to SQL. I'm not looking for specific code for specific products (such as Rocket, Diesel, Postgress, MySQL) but am instead interested in the general overview of how we can approach the problem of generating efficient SQL queries from Juniper's GraphQL.

After quite a bit of search I have very surprisingly found almost no information about this general topic. The best I have found is this four part series: GraphQL/MySQL/Node. Here the author talks about working with the GraphQL AST that is generated by some other product as part of generating SQL. He also talks about dealing with joins. Unfortunately I don't see any way to hook into this with Juniper.

Another clue that I have is nom-sql. Although this is a parser for SQL it necessarily has all the objects needed for an SQL AST. So it should be rather straight-forward to programmatically build an SQL query from a GraphQL query. Again, alas, I don't see anyway to hook into this with Juniper w/o doing some serious hacking.

So I'm interested in hearing from those of you who understand this code much better than I, how to approach this issue.

LegNeato commented 4 years ago

Hmmm. If you are talking about generating the correct SQL query in resolvers, there is dataloader-rs and juniper-eager-loading as two strategies to prevent inefficiencies.

If you want to translate an SQL query to a GraphQL query, you don't need juniper at all as it would just take in the translated GraphQL query.

If you want to translate the GraphQL schema to a SQL tables (similar to hasura), you can get the schema as a graphql-parser AST and translate from that.

What do you want to do?

bostontrader commented 4 years ago

@LegNeato, thanks for the insight. I have what I guess is a very common issue. I have a bunch of MySQL tables that I want to access and I want to be able to provide a GraphQL interface. I'm using the Rocket framework.

Given the GraphQL query, I want to build a single SQL query that will do what I want. I think that given access to the GraphQL AST I can do this and thus graphql-parser looks like a promising lead. I haven't studied dataloader-rs in sufficient depth to conclusively rule it out but my initial impression is that it's too complicated to figure out for my application. (Not that I cringe at figuring out complicated things, but, you know...) At this time, producing the One SQL query strikes me as the easiest path forward.

Given all this new found learning, I'm going to guess that the best way to do this would be inside the http route handler for GraphQL. An example:

#[rocket::post("/graphql", data = "<request>")]
fn post_graphql_handler(
    context: State<M::JunDatabase>,
    request: juniper_rocket::GraphQLRequest,
    schema: State<Schema>,
) -> juniper_rocket::GraphQLResponse {

    request.execute_sync(&schema, &context)
}

At this point I have my db connection available in context. I will guess that the GraphQL AST should be available from GraphQLRequest and I could therefore make my SQL here, and feed it to the GraphQL resolver via the context. This strikes me as kinda hacky and contrary to the GraphQL Way but I think it would work.

Failing that, the next place to hook in might be in the resolver itself. An example snippet:

#[juniper::graphql_object(
    Context = JunDatabase
    Scalar = juniper::DefaultScalarValue,
)]

// The root query object of the schema
impl Query {

    #[graphql(arguments(category_ids(description = "A list of relevant category ids.")))]
    fn distributions(&self, database: &JunDatabase, category_ids: Vec<i32>) -> String {
        // blah, blah, blah...
    }
}

The only entre into GraphQL here is via &self. Is there any access to the AST from here?

Thanks again for the insight.

tuco86 commented 4 years ago

I'm new to rust so this might be horrendous code, but i got a query only selecting the required fields working.

I hope this helps.

let client = ctx.db.get().await.unwrap();
let mut fields: Vec<&str> = Vec::new();
let mut field_map: HashMap<&str, usize> = HashMap::new();

let mut insert_field = |field| {
    if !field_map.contains_key(field) {
        field_map.insert(field, fields.len());
        fields.push(field);
    }
};

let look_ahead = executor.look_ahead();
for child in look_ahead.children() {
    let child_name = child.inner_name();
    if child_name != "__typename" {
        insert_field(child_name);
    }
}

let result = client
    .query(
        &format!(
            "SELECT {} FROM entry ORDER BY time DESC OFFSET $1 FETCH NEXT $2 ROWS ONLY",
            fields.join(", ")
        )[..],
        &[&((page * per_page) as i64), &(per_page as i64)],
    )
    .instrument(tracing::info_span!("query_entries"))
    .await;

I had to make a change to juniper, tho. https://github.com/graphql-rust/juniper/compare/master...tuco86:master

trevyn commented 4 years ago

@bostontrader In your last comment, you have the full query in your post_graphql_handler, so you should have all the info needed to generate your SQL query, and then couldn't you just pass it to the resolver through (a somewhat more complicated) Context?

I'm a little confused about this thread -- are you also looking for tips on how to programmatically generate a SQL query for an arbitrary nested GraphQL query, or are you more interested in the nuts and bolts of how to hook into Juniper?

For reference, I've been doing some work around having my schema source of truth in Rust structs, and then automatically generating as much as possible (SQL schema, boilerplate parts of SQL and GraphQL queries, Apollo Client frontend code) from that for simple flat queries that return these structs. I don't yet have a need to do this for more complex nested graph queries, and the SQL I need there is esoteric enough that I write it by hand anyway.

LegNeato commented 4 years ago

Generally this is called the "N+1" problem in GraphQL.

Juniper provides lookaheads and it is up to you to decide how best to implement your mitigation logic.

There are roughly two strategies folks have standardized on for GraphQL:

Dataloaders, where each user gets their own mini cache that only lasts for the request.
Eager loading, where all data for the whole query is loaded up front / at the top-level object and then passed down to the resolvers. The resolvers merely return data that has been already queried "above" them.

We have docs on dataloaders at https://graphql-rust.github.io/juniper/master/advanced/dataloaders.html and there is https://docs.rs/juniper-eager-loading/0.5.1/juniper_eager_loading/. Raw lookaheads don't have many docs right now so if you go that route please contribute some back!

jmsegrev commented 3 years ago

@bostontrader not sure if you are still interested, but this might help https://github.com/prisma/quaint

bostontrader commented 3 years ago

@jmsegrev Thanks for the tip! This looks very interesting. I'll take a closer look.

triptec commented 3 years ago

@LegNeato have you seen anything that would take a graphql query like in the example in juniper_eager_loading

query SomeQuery {
    allUsers {
        country {
            id
        }
    }
}

and turn it into

select c.id from users u
join countries c on u.country_id=c.id

and I'm mainly talking about doing the join instead of using a caching solution like a dataloader or query created by eager loading ie.

select * from users
select * from countries where id in (?, ?, ?, ?)

This is something that I try to find a solution to every time I look into graphql but I can't seem to find it.

triptec commented 3 years ago

Seems I'm looking for something like https://github.com/join-monster/join-monster for rust.

ronasuti commented 3 years ago

So I've been digging through Prisma's code, and I found Prisma's core Rust engines, which includes:

query-engine which parses a couple different formats, including GraphQL, into SQL code for all supported platforms,
migration-engine-cli which takes Prisma's model files (which look suspiciously like a cross between SQL and GraphQL schema definitions) and translates them to SQL DDL
introspection-engine which can query SQL databases to pull out their schemas

The only real reasons as to why this isn't already being used in Juniper and async-graphql servers as CRUD automation for databases is that it uses Prisma schema language (no actual GraphQL schemas) and that the query/mutation syntax is... Odd to say the least.

Still, this is really intriguing.

adzialocha commented 3 years ago

To throw in another interesting project with a similar solution into the mix: https://github.com/graphprotocol/graph-node/tree/master/graphql%2Fsrc

"The Graph" indexes and persists Ethereum data, which is a slightly different use-case, but essentially offers a dynamic GraphQL API on top, translating requests to SQL.

frederikhors commented 1 year ago

Hi guys!

I'm coming from Go where I used entgo to generate automagically structs for SQL queries using GraphQL too.

I'm wondering if there is something like that in Rust (maybe for Diesel).

Basically, given a series of structs with "one to many" or "many to many" relationships, I would like to automatically generate code that allows me to query those relationships in SQL using AND, OR, NOT filters for example.

Here is better explained: https://entgo.io/docs/tutorial-todo-gql-filter-input.

Can you help me understand more?

LegNeato commented 1 year ago

I'm familiar with Ent as I was at FB when the Ent framework was created. As far as I know nothing similar exists in Rust. I actually had a Rust prototype when I worked at Robinhood but it has been lost to the sands of time.

graphql-rust / juniper

Efficient generation of SQL queries #760