Closed bchu1 closed 6 months ago
Thanks for the report. Interesting that Chrome and Firefox behave differently here. Some things that would help:
for _ of
?My goal is to be able to create a minimal repro with this exact query so that we can look more closely at it.
Thanks again for the report
fedSequencingReads
fedWorkflowRunsAggregate
)Map Æ’ Map() { [native code] }
npm list react-relay --depth=0 idseq-web@ /Users/bchu/czid-web-private └── react-relay@16.0.0
We are using toPromise()
after fetchQuery()
as well, if that matters.
Update: The actual culprit query is using useQueryLoader()
.
And here's the actual file: https://github.com/chanzuckerberg/czid-web/blob/main/app/assets/src/components/views/discovery/DiscoveryViewFC.tsx (I forgot it was open source 😆)
Hmm, I'm not able to see the repository (I get a 404) perhaps it's not actually public?
Looking at the Chrome profile it seems that Relay is getting many many responses from your network implementation. Perhaps there's some kind of stream/defer in play, or perhaps something strange in the way your network layer is implemented? Can you share what you're passing as the network
config option to your Relay environment?
Is suspect it's somehow returning many many payloads for your query.
Oops, I linked the private repo, updated it to be the public one.
We shouldn't be using any @stream
s or @defer
s anywhere.
Here's where we're setting network
: https://github.com/chanzuckerberg/czid-web/blob/main/app/assets/src/relay/environment.ts#L28
Can you double check that it's not simply that your code is making a large number of requests by accident? Maybe put a console.log
in that fetch function? Looking at where that query is fetched, it's a little hard to tell how it's getting called.
Update! Looks like the culprit is not the original query... I removed another query (fedWorkflowRunsAggregate
) and now everything is working fine. Here the details on that query...
Query: https://pastebin.com/tB4M15zA Sample variables: https://pastebin.com/SJt0Mhb1 Sample response: https://pastebin.com/zd04mzTH
The response is actually not any larger than the other queries we're making... but the variables are definitely huge. I wonder if this has anything to do with it? And maybe that could explain why Firefox performs better than Chrome, if there's some difference in hashing?
Anyways, I'll update the title and previous comments to reflect the new info.
Yeah, I double checked the Network tab and there's only 3 queries being made, which is expected.
From the Chrome flame chart you shared, it looks like this _next
function is getting called many hundreds of times:
That should the result of payload response directly from the observable returned by the network layer. Maybe you could log/set a breakpoint there and see what happens on Chrome?
Normalizing that response on its own, one time, does not seem to be slow: https://github.com/captbaritone/relay-normalization-repro
I may have to get back to you after the weekend, but will try logging response
there and see what happens!
Some updates:
useQueryLoader()
and there are multiple instances of the component that consumes the query reference via usePreloadedQuery()
on the page (it's rendered in each row of a (virtualized) table).fetchQuery()
and if that fixes the issue, we'll likely move on with that to unblock the team's deadlines.More updates:
fetchQuery()
did not solve the freezing, unfortunately.fetchQuery()
now). Here's the full profile: https://drive.google.com/file/d/1E8SyB17ylQilYOCLUYbewF4xfmRjWs0z/view?usp=sharing It stayed at 100% for about 2 minutes. I see a lot of _traversSelections()
, _normalizeField()
, and _normalizeLink()
in this one.We just tried artificially limiting the response on the server to only 20 aggregate
items in the array and the freezing is gone. So I guess it does have something to do with the response (size?).
We resorted to paginating this query and the performance issues are gone now. I'll go ahead and close this since it doesn't seem like much progress is being made on finding a root cause (feel free to reopen or do whatever you need with this 😸).
FWIW, I profiled again in Firefox before the pagination fix (using fetchQuery()
) and still didn't see much CPU usage at all, compared to in Chrome where I saw a giant microtask drain operation that took 2 minutes, mostly doing those normalize
calls.
I'm using
fetchQuery()
useQueryLoader()
and have run into a case where on Chrome (surprisingly Firefox seems fine?) will completely hang and freeze the browser for like 10 seconds after the following request comes back.Query: https://pastebin.com/tB4M15zA Sample variables: https://pastebin.com/SJt0Mhb1 Sample response: https://pastebin.com/zd04mzTH
This is the flamegraph I got from Firefox's debugger:
Here's the file for the Firefox profile: Firefox 2024-03-14 13.18 profile.json