graphile / crystal

🔮 Graphile's Crystal Monorepo; home to Grafast, PostGraphile, pg-introspection, pg-sql2 and much more!
https://graphile.org/

Batching and caching SQL queries #210

Closed fortm closed 8 years ago

fortm commented 8 years ago

For GraphQL queries shaped like the one below, which request many projects along with each project's client and numerous tasks, there is scope for batching the client / task resolvers.

Original request:

{
    projects {
        id
        name
        client {
            id
            name
        }
        tasks {
            id
            name
        }
    }
}

Batchable pieces:

        client {
            id
            name
        }
        tasks {
            id
            name
        }

Ref

Using dataloader, I think it should be possible to batch multiple point queries (say, 10 of them) into a single SQL IN clause, going from:

select * from client where id = 1; select * from client where id = 2;

to

select * from client where id in (1, 2);
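To make the idea concrete, here is a minimal sketch of that kind of batching, written without the dataloader package so it stands alone. The `Row` type, the `RunQuery` callback, and the function names are all hypothetical; this is not PostGraphile's implementation, only an illustration of collecting point lookups and issuing one IN query.

```typescript
// Hypothetical row shape and query runner; neither comes from PostGraphile.
type Row = { id: number; name: string };
type RunQuery = (text: string, values: number[]) => Promise<Row[]>;

// Build one parameterised IN query for a set of ids (duplicates removed).
function buildInQuery(ids: number[]): { text: string; values: number[] } {
  const unique = Array.from(new Set(ids));
  const placeholders = unique.map((_, i) => `$${i + 1}`).join(", ");
  return {
    text: `select * from client where id in (${placeholders})`,
    values: unique,
  };
}

type Pending = {
  id: number;
  resolve: (row: Row | undefined) => void;
  reject: (error: unknown) => void;
};

// Collect every load() call made in the same synchronous run,
// then answer all of them from a single query.
function createBatchLoader(run: RunQuery): (id: number) => Promise<Row | undefined> {
  let pending: Pending[] = [];

  async function flush(): Promise<void> {
    const batch = pending;
    pending = [];
    try {
      const { text, values } = buildInQuery(batch.map((p) => p.id));
      const rows = await run(text, values);
      const byId = new Map<number, Row>();
      for (const row of rows) byId.set(row.id, row);
      // Each caller gets its own row, or undefined if no row matched.
      for (const p of batch) p.resolve(byId.get(p.id));
    } catch (error) {
      for (const p of batch) p.reject(error);
    }
  }

  return function load(id: number): Promise<Row | undefined> {
    return new Promise<Row | undefined>((resolve, reject) => {
      pending.push({ id, resolve, reject });
      // Schedule a single flush for the first request in this batch.
      if (pending.length === 1) Promise.resolve().then(() => flush());
    });
  };
}
```

With this sketch, calling `load(1)` and `load(2)` from two resolvers in the same synchronous run results in one call to `run` instead of two. The real dataloader library schedules its batches differently, so treat the timing here as an assumption of the sketch.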

So, I wanted to understand how much batching we do for resolvers, and at what level resolvers map to SQL queries. Is it mostly at the leaf level, or at nodes higher up in the hierarchy, for example at the project level rather than at client / task in the case above?

Can we also cache in nginx / redis when the same request comes in more than once and the cache has not yet been invalidated?

Lastly, I had a question: if we request only "id" and "name" of client as above, does the generated select query include other fields ({ select * }), or only the requested ones ({ select id, name })?
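As an illustration of the second option, the sketch below builds a select list from only the requested fields. The `CLIENT_COLUMNS` allow-list (including its `email` column) and the `buildSelect` helper are hypothetical names made up for this example; it does not describe how the project actually generates SQL.

```typescript
// Hypothetical allow-list of columns on the client table.
const CLIENT_COLUMNS = ["id", "name", "email"] as const;

// Quote an identifier, doubling any embedded double quotes.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Build a select statement that lists only the requested fields
// that are real columns, instead of falling back to select *.
function buildSelect(
  table: string,
  requested: string[],
  allowed: readonly string[]
): string {
  const columns = requested.filter((field) => allowed.indexOf(field) !== -1);
  if (columns.length === 0) {
    throw new Error("no selectable columns requested");
  }
  return `select ${columns.map(quoteIdent).join(", ")} from ${quoteIdent(table)}`;
}

// Fields that are not columns (such as __typename) are dropped.
const sql = buildSelect("client", ["id", "name", "__typename"], CLIENT_COLUMNS);
```

Filtering against an allow-list keeps GraphQL-only fields out of the SQL and avoids interpolating arbitrary field names into the statement.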

Thanks

calebmer commented 8 years ago

We already batch a couple of common operations like single selects (so the select * from client where id in (1, 2); case, see PgCollectionKey) and inserts. However, we can always do better. For example, procedure calls are not batched (which they really should/could be), and we don’t batch the selection of connections because of some obvious difficulties. Ultimately I treat this issue on a highest-value basis. If we run into serious performance problems by not batching a certain kind of query, then that should be fixed. Otherwise it’s just premature optimization. There are also arguments that too much batching is bad for performance, which is another concern we keep in mind. Whenever we find valuable areas to improve performance characteristics, the entire ecosystem can benefit :+1:

As another note, on my todo list for a while has been selecting specific fields like id, name over *. Currently that is a tough thing to do because of how procedures are handled, but definitely possible.

I’m going to close this for now. This really needs to be an ongoing community discussion, and whenever we find areas where we can make significant gains we should focus on them at an individual level :+1:

pencilcheck commented 7 years ago

It seems like postgraphql doesn't support query batching.

Here is a use case which I think would be awesome to have: http://dev.apollodata.com/core/network.html#query-batching

I'm using apollo and they have an option to turn on query batching. If postgraphql can support this, that would be awesome!

benjie commented 7 years ago

Hi @pencilcheck; the batching you're talking about is different from that talked about in this issue (which relates to minimising the number of SQL queries; an issue solved in v4: #506).

I definitely plan to support GraphQL query batching in the future - please feel free to open an issue requesting that feature.

zopf commented 6 years ago

Note: I've submitted a feature request for GraphQL query batching here: https://github.com/postgraphql/postgraphql/issues/634

benjie commented 6 years ago

Note we have query batching now. And separate from that we also have massively improved performance 🙌