Open leonardost opened 1 week ago
Hello 👋 Do we know a user with this issue? That would help the investigation, since it is challenging to set up an account with hundreds or thousands of domains.
While I don't have an account with that many domains, I didn't have this problem on my end, but I noticed their whole account loaded slowly. @renancarvalho I'm going to send the details via Slack.
LE: I learned someone else will look into it. Please reach out via Slack once you see this. Thanks!
Details at p1732188303637839-slack-C07GZ2UA3TN
The page makes one API call per wpcom site, so for users with hundreds of domains that is hundreds of API calls. They did eventually all get a response. I think they were all loaded into the page, and I was able to scroll down to domains beginning with Z.
Eventually I got an "Oh snap. error code 5" Chrome crash. This happens faster if dev tools are open and slower if they're not, so long as you interact with the page by e.g. scrolling slowly enough through the list for it to start populating stuff.
Following along in the Chrome task manager, I can see CPU use nearly constantly above 200%. Memory use ranges between 2 and 6 GB and is still increasing at the point of the crash. I'm guessing the crash is due to some kind of memory pressure.
To set expectations: finding and fixing the problem is unlikely to be easy.
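One way to avoid firing hundreds of per-site requests at once is to cap how many are in flight. The following is a minimal sketch of that idea, not the code Calypso uses: `mapWithConcurrency` is a hypothetical helper, and the worker passed to it stands in for whatever function fetches one site's domain details.

```typescript
// Hypothetical helper (not part of Calypso): run `worker` over `items`
// with at most `limit` calls in flight at any time.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each lane claims one item at a time until none are left.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results; // same order as `items`
}
```

With a limit of, say, 6, a user with 800 sites would still trigger 800 requests in total, but never more than 6 at once. This limits request pressure only; it does not by itself address memory growth from rendering every row.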
It looks like this has been happening for over a year: p1695308169001609-slack-C04H4NY6STW
Looked into this some more with @zaguiini. Some things we noticed:
net::ERR_INSUFFICIENT_RESOURCES
which presumably means there wasn't enough memory free to decode it - not because of the size of the response (each is approx 2kb), but because of the number of them.
Ideally it should only request information for domains in the viewport - not all of the information for all of the domains the user has.
This appears to happen under @zaguiini's account, but it doesn't happen for the problematic user.
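The "only request what is in the viewport" idea can be sketched as follows. This is an illustration under assumptions, not the existing table code: it assumes fixed-height rows, and `visibleRange` and `domainsToRequest` are hypothetical names.

```typescript
// Inclusive index range of rows to load; end < start means "nothing visible".
interface VisibleRange {
  start: number;
  end: number;
}

// Assumes every row has the same height (in px). `overscan` adds a few
// extra rows above and below so that scrolling does not show empty rows.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  total: number,
  overscan: number = 2
): VisibleRange {
  if (total <= 0 || rowHeight <= 0) {
    return { start: 0, end: -1 };
  }
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(total - 1, last + overscan),
  };
}

// Domains inside the range whose details have not been loaded yet.
function domainsToRequest(
  domains: readonly string[],
  range: VisibleRange,
  loaded: ReadonlySet<string>
): string[] {
  return domains
    .slice(range.start, range.end + 1)
    .filter((domain) => !loaded.has(domain));
}
```

For a 600px viewport with 60px rows, this requests details for roughly a dozen domains at a time instead of all of them, regardless of how many domains the user owns.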
Over the next few weeks the table is being replaced by a dataview by @Automattic/nexus, so it might get resolved as a side effect. If we could figure out the problem first that would be better.
/me/purchases returns an HTTP 504 (gateway timeout) for this user. This will probably cause them problems on other Calypso pages too.
/me/purchases doesn't currently have pagination: fbhepr%2Skers%2Sjcpbz%2Schoyvp.ncv%2Serfg%2Sjcpbz%2Qwfba%2Qraqcbvagf%2Spynff.jcpbz%2Qfgber%2Qncv%2Qraqcbvagf.cuc%3Se%3Q4174p112%23656-og. It does have some performance instrumentation with statsd.
It sounds like there have been repeated problems with this user's account causing fatals in payments code: p1694033164744309-slack-C096PD42U
/me/sites returns an HTTP 504 (gateway timeout) for this user. This will probably cause them problems on other Calypso pages too.
/me/sites doesn't currently have pagination: fbhepr%2Skers%2Sjcpbz%2Schoyvp.ncv%2Serfg%2Sjcpbz%2Qwfba%2Qraqcbvagf%2Spynff.jcpbz%2Qwfba%2Qncv%2Qzr%2Qfvgrf%2Qraqcbvag.cuc%3Se%3Q5o812359%236-og. It does have some performance instrumentation with statsd.
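For reference, this is a rough sketch of what a paginated response could look like, so that a single request never has to return everything for a very large account. The `page` / `perPage` parameters and the `Page` shape are assumptions for illustration, not the current contract of /me/sites or /me/purchases.

```typescript
// Hypothetical response shape for a paginated listing.
interface Page<T> {
  items: T[];
  page: number; // 1-based page number
  perPage: number;
  total: number; // total number of items across all pages
  hasMore: boolean; // true if a later page exists
}

// Return one page of `all`. Invalid inputs are clamped to page 1 and
// at least one item per page.
function paginate<T>(all: readonly T[], page: number, perPage: number): Page<T> {
  const safePerPage = Math.max(1, Math.floor(perPage));
  const safePage = Math.max(1, Math.floor(page));
  const start = (safePage - 1) * safePerPage;
  const items = all.slice(start, start + safePerPage);
  return {
    items,
    page: safePage,
    perPage: safePerPage,
    total: all.length,
    hasMore: start + items.length < all.length,
  };
}
```

A client would then keep requesting the next page while `hasMore` is true, or only when the user scrolls far enough to need it.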
It seems to me that we have at least three areas to improve to get the dashboard to load.
Quick summary
Users who have many domains (more than a couple of hundred, I believe) can't manage their domains in Calypso because the domain list page (/domains/manage) doesn't load. Some users have thousands of domains due to the Google Domains Takeover initiative (pcYYhz-1ts-p2).
Steps to reproduce
Open the domain list page (/domains/manage) for a user that has many domains
What you expected to happen
The domain list should be loaded correctly and I should be able to manage my domains.
What actually happened
The domain list never finishes loading, and sometimes the page eventually breaks completely.
Example screenshot from a user support session:
Impact
Some (< 50%)
Available workarounds?
No and the platform is unusable
If the above answer is "Yes...", outline the workaround.
No response
Platform (Simple and/or Atomic)
Simple
Logs or notes
No response