Open ericclemmons opened 7 years ago
Similarly, if you are able to generate the full response higher up the graph, it is not currently possible to prevent resolve from being called on all the child objects. Let's say we have a schema such as this:
type Star {
id: Int
name: String
}
type Galaxy {
name: String
stars: [Star]
}
type Query {
milkyWay: Galaxy
}
schema {
query: Query
}
And the following query:
query {
milkyWay {
stars {
name
}
}
}
Let's assume the stars field can make a single request such as select stars.name from stars inner join galaxy on stars.id = galaxy.star_id where galaxy.name = 'Milky Way' and the result is already an array of {name} objects. In this case, we do not want to iterate over every star: it is already populated in the correct format.
Having the ability for a resolver to do a 'full resolution' could be a huge performance gain in certain situations such as this, since the time to recursively iterate over a large data-set can be significant.
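To make the cost concrete, here is a simplified model of a field-by-field walk over an already-shaped result. It is not graphql-js source code; the project function, the selection object and the fieldVisits counter are hypothetical names used only to show that the number of field visits grows with the list size even when the data needs no reshaping.

```javascript
// Simplified model of an executor that walks a result field by field.
// NOT graphql-js code: `project`, `selection` and `fieldVisits` are
// hypothetical and exist only for illustration.
let fieldVisits = 0;

function project(value, selection) {
  if (Array.isArray(value)) {
    // Lists are completed item by item.
    return value.map((item) => project(item, selection));
  }
  if (value === null || typeof value !== 'object') {
    return value; // leaf value, nothing to descend into
  }
  const out = {};
  for (const [field, subSelection] of Object.entries(selection)) {
    fieldVisits += 1; // one "resolve" per selected field per object
    out[field] = project(value[field], subSelection);
  }
  return out;
}

// 1000 stars that are already in the requested { name } shape.
const stars = Array.from({ length: 1000 }, (_, i) => ({ name: `Star ${i}` }));
const data = { milkyWay: { stars } };
const selection = { milkyWay: { stars: { name: true } } };

const result = project(data, selection);
// fieldVisits is now 1002: milkyWay (1) + stars (1) + name for each star (1000).
```

A "full resolution" escape hatch would let the stars field skip the 1000 per-item visits and hand back the array as-is.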
Any sense of what part of the course of execution is spending the most time for your use case? For example, are you validating the query first and spending time there? Is it all in query execution? Are there particular fields that take longer than others? It would be interesting to see if you could put timing statements in to gain a better understanding of where the slowdown is coming from.
@leebyron Sorry I didn't see this until now.
The delay in response time (2-4s, as seen in the 1st GIF) points entirely to the pruning/reformatting of the results to match the requested property structure.
When I bypass that part, I can establish a reliable baseline for the query/validation/etc., which is < 100ms.
What I was wondering is whether, for large arrays, there is duplicated validation or formatting performed on each item that could be done once.
(Say there was a performance opportunity where GraphQL checks, for each item in the array, that the requested property dateCreated exists on the parent type Post. This could be done once for the first node and cached for subsequent nodes in the array, for O(1) vs. O(n) or whatever.)
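A minimal sketch of the caching idea, assuming a per-type field check really is repeated for every list item (the thread does not establish whether graphql-js does this). The typeFields table, hasField and the lookups counter are all hypothetical.

```javascript
// Hypothetical sketch: cache "does type T define field F?" so a list of
// n items of the same type pays for the schema lookup once, not n times.
const typeFields = { Post: ['id', 'title', 'dateCreated'] }; // illustrative schema
const cache = new Map();
let lookups = 0; // counts how often the uncached check actually runs

function hasField(typeName, fieldName) {
  const key = `${typeName}.${fieldName}`;
  if (!cache.has(key)) {
    lookups += 1; // only reached on a cache miss
    const fields = typeFields[typeName] || [];
    cache.set(key, fields.includes(fieldName));
  }
  return cache.get(key);
}

// 5000 posts of the same type: the check runs once, then hits the cache.
const posts = Array.from({ length: 5000 }, (_, i) => ({ id: i, dateCreated: '2017-01-01' }));
const allValid = posts.every(() => hasField('Post', 'dateCreated'));
```

After the loop, lookups is 1 rather than 5000; whether that saving matters depends on how expensive the real check is.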
@leebyron I am experiencing similar issues when I have a list of items with their own resolvers.
My resolvers make heavy use of DataLoaders. I initially thought something was slow between load -> batch -> return results in the DataLoaders, but then I realized that the individual calls to each field's resolver function take too long for some reason.
Example:
parentObject
  .children (~15 of them)
    resolver1
    resolver2
    resolver3
    resolver4
The resolvers return promises that resolve to either scalars or objects.
I have logged the process.hrtime results for each resolver call. What I observe is that each resolver call easily takes 1-2 ms, if not more. With 15 child nodes, that easily adds up to 70-80ms of just function calls. When I ran node --inspect and profiled, most of the time seemed to be spent in the validate/visitUsingRules functions.
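The kind of per-resolver measurement described above could be sketched as follows. withTiming, timings and summarize are hypothetical helpers, not part of graphql-js or DataLoader; only process.hrtime is a real Node.js API.

```javascript
// Wrap a resolver so that every call records its wall-clock duration.
// `withTiming`, `timings` and `summarize` are hypothetical helper names.
const timings = new Map(); // label -> array of durations in milliseconds

function withTiming(label, resolver) {
  return async function timedResolver(...args) {
    const start = process.hrtime();
    try {
      return await resolver(...args);
    } finally {
      // process.hrtime(start) returns the elapsed [seconds, nanoseconds].
      const [sec, nano] = process.hrtime(start);
      const ms = sec * 1e3 + nano / 1e6;
      if (!timings.has(label)) timings.set(label, []);
      timings.get(label).push(ms);
    }
  };
}

// Aggregate the recorded samples into one row per resolver label.
function summarize() {
  const rows = [];
  for (const [label, samples] of timings) {
    const totalMs = samples.reduce((a, b) => a + b, 0);
    rows.push({ label, calls: samples.length, totalMs });
  }
  return rows;
}

// Usage: wrap each field resolver once, then inspect summarize() after a request.
const resolver1 = withTiming('resolver1', async (parent) => parent.name);
```

Because the wrapper awaits the resolver's promise, each sample includes time spent waiting on I/O and batching, not only the synchronous function call.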
Here is a screenshot of the Chrome DevTools profiler:
I hope this helps.
As someone who is experiencing similar issues, I'd like to provide my profiler output. I have a simple query like so:
query ExampleQuery {
  fetchPeople { ... }
  fetchProjects { ... }
  fetchOrders { ... }
  ... ~10 more fetches
}
Most of the fetches return 5-20 results. However, one of them returns 1200 results. I ran a profiler, and it seems a significant portion of the time relates to graphql.execute.js. I tested with ab and saw somewhere around 1.5-2.5 seconds per request, even though the longest individual fetch takes around 500ms. How can I make this more performant?
Statistical profiling result from isolate-0x102004600-v8.log, (12704 ticks, 1351 unaccounted, 0 excluded).
Originally posted here:
I'm trying to resolve some performance issues with large documents, and the problem (AFAICT) is due to the pruning of the document based on requested fields.
Here's how I discovered it:
2s - 4s response time.
Even if I use formatResponse(response) { return [] } to pretend nothing came back, it's still a problem somewhere before formatResponse.
106ms response time.
In the resolver, if I do:
And then use formatResponse to do:
I can see the response starting & streaming much faster.
A co-worker tried master to see if #710 resolves it, but it does not appear so.
For reference on why we have a document this large, it's because, internally, we leverage GraphQL to fetch a full document that we then reduce into a separate document that describes the state of key entities for internal tooling.
(For example, "why does this product not appear for users in Texas?")
Because of the complex (programmatic) rules that run against these documents, we're showing internal users the unfiltered document and filtering using the same logic that happens in user-land.
In the short-term, it appears our best option is to find a means of returning an unfiltered document (for performance) for internal uses?