Open klinskyc opened 3 years ago
Hi Cavan, and thanks for using GraphQL2REST.
You are correct that the current implementation of GraphQL2REST relies on a gql-generator module which generates "fully exploded" GraphQL client queries that expand all fields in all nesting levels and all possible variables. This may cause "overfetching by default", but only at the server level and not at the network level - meaning that while the server possibly requests unnecessary fields, the filters are eventually applied on the response and only the requested fields go on the wire.
In most use cases this works well enough, but I can see how for queries with many potential fields where the REST request only needs a few this can create a lot of unneeded overhead on the server. This definitely needs improvement and has been on our minds (and in the tasks backlog) from the very beginning, but it was not high priority enough to implement.
In order to adjust the resulting GraphQL queries to request only the fields explicitly requested, one needs to modify the code of the gql-generator module in /src/gqlgenerator
, which is responsible for the pre-processing step. This code is based on https://github.com/timqian/gql-generator, which reads your GraphQL schema and generates .gql files containing all "fully exploded" client operations, which are later read into memory and invoked at runtime (populated with user parameters).
As a hack or workaround, you can try to edit the resulting .gql files to have just the fields you need, so as to apply a "fixed" skeleton which always limits the fields requested from GraphQL server. Note that those files will be generated again and overwritten the next time you run the generateGqlQueryFiles()
function.
While you can try to refactor the gql-generator section in our code, I recently came across this fork of gql-generator (https://github.com/Skitionek/gql-generator-node) which creates the GraphQL queries on the fly (programmatically at run-time, rather than save them to files), and can limit the response to only user-specified fields (so the resulting client query won't be "fully exploded" anymore). Theoretically, if we embed the code from this fork of gql-generator (or import it), we can limit the GraphQL response fields in advance in the manifest file (by allowing to provide a "skeleton" object in the manifest), as well as at runtime (upon an actual API request) by creating the "adjusted" GraphQL request on the fly based on the requested fields.
Some issues though:
Exploring this solution is on my todo list, but realistically I probably won't have time to do it any time soon. Referring to your second point, if you are interested and have the time to explore that route, test the package and incorporate its code into GraphQL2REST (replacing the old gql-generator) - please let me know and I'd be happy to add your contribution into our code. This can be a major improvement, but it has to be done carefully since it modifies the core of this package.
If you're interested I will open a dedicated issue for that.
Thanks! Roy
@roy-mor This is awesome! The hack option (editing the GQL files) will work for us worst-case, but will check out gql-generator-node, and will see if I can turn that into a PR.
I really appreciate you taking the time to write up such a detailed response!
@klinskyc: I could not add your contribution because it did not have tests, usage explanation (the documentation will have to be updated as well here...), and a description of the change (why & how).
Are you using this modified branch in production? Or did you end up using the workaround? (editing the .gql files). I'm curious about your usage of graphql2rest in general.
First, thank you for this project. It works better for large GraphQL projects than SOFA.
When I initially read the docs, I assumed that passing in values to fields would adjust the GraphQL query itself, and not request the data from the server in the first place.
However, after trying it out and reviewing the code, it looks like GraphQL2Rest filters fields after the data is returned. We have some objects with a very large amount of field options (100+), but only a few are needed at any given time, so this is causing our server to load a lot of unneeded/unwanted data.
1) Is there a way to adjust the GraphQL query itself to only request the fields that are in the fields param?
2) If not, would you be able to point me to the right place in the code base to make this modification?