Closed Stupidism closed 1 year ago
Hey there – that is certainly not acceptable. Do you have indexes created on the ‘updatedAt’ and ‘parent’ fields on your versions collection?
We will look into this tomorrow 👍
In my local env. Yes
Let me find someone to help me figure out with stage and prod env. @gedaaaa
Hey @Stupidism — I just ran a test with 20K documents and I am able to load the query with ?draft=true
in 130ms.
I do indeed have indexes created as you do locally. I wonder if this is indeed an indexing issue and that you may have no indexes present in your production _versions
collection.
As another followup, how many documents are in. your _articles_versions
collection in production?
@jmikrut Here're the indexes in our stage environment.
"{""_id"": 1}",_id_,copy-from-prod-2023-02-07._articles_versions,4
"{""parent"": 1}",parent_1,copy-from-prod-2023-02-07._articles_versions,4
"{""version.slug"": 1}",version.slug_1,copy-from-prod-2023-02-07._articles_versions,4
"{""createdAt"": 1}",createdAt_1,copy-from-prod-2023-02-07._articles_versions,4
"{""updatedAt"": 1}",updatedAt_1,copy-from-prod-2023-02-07._articles_versions,4
"{""version._status"": 1}",version._status_1,copy-from-prod-2023-02-07._articles_versions,4
"{""autosave"": 1}",autosave_1,copy-from-prod-2023-02-07._articles_versions,4
The totalDocs of article versions is 6k+. Is it possible that this is due to lack of versions for each article?
What's interesting is it only affects the REST api. The graphql API is working as expected.
We ran some profile and got this result. It's a COLLSCAN
Here's the excuted query:
{
"aggregate": "_articles_versions",
"pipeline": [
{ "$sort": { "updatedAt": -1 } },
{
"$group": {
"_id": "$parent",
"version": { "$first": "$version" },
"updatedAt": { "$first": "$updatedAt" },
"createdAt": { "$first": "$createdAt" }
}
},
{ "$match": {} },
{ "$sort": { "createdAt": -1 } },
{ "$skip": 60 },
{ "$limit": 10 }
],
"cursor": {},
"lsid": { "id": { "$binary": "jWHgUWPbRiyh4yaZVJNYDw==", "$type": "4" } },
"$db": "copy-from-prod-2023-02-07"
}
We tried to create a compound index for parent
, updatedAt
and createdAt
. Still got no luck,
Hmm, this is very strange.
I have absolutely no idea why your GraphQL is functioning as expected but REST is not. Are you able to use a tool like MongoDB Compass to connect to your database? If so, you should try and recreate the aggregate that you are running within MongoDB Compass directly.
This also might be to do with AWS' DocumentDB. I believe that Mongo will fall back on writing to disk if it does not have an index available, or if the resulting data set is too large for it to be executed in memory. That could be why you are seeing COLLSCAN. This is definitely related to COLLSCAN—the question is - why?
I just seeded 40K documents and it still loads blazingly fast for me. Here is a screenshot of my working database's indexes within MongoDB Compass:
Can you drop that version collection's indexes and recreate them?
Yes, we're using AWS' DocumentDB. Here're the indices shown with mongodb compass
BTW, the versions page flash to blank screen once I enter it.
Oops - did not mean to close. I didn't even know I did!
Question - is there a way that you could clone your exact database, run on Atlas (or a local MongoDB Community), and then re-assess the speed here?
@Stupidism The versions screen crashing was a reported issue and has since been fixed, just letting you know!
Oops - did not mean to close. I didn't even know I did!
Question - is there a way that you could clone your exact database, run on Atlas (or a local MongoDB Community), and then re-assess the speed here?
OK, I will try today.
What's interesting is it only affects the REST api. The graphql API is working as expected.
Wrong info before. This was a screenshot from local env.
@JarrodMFlesch @jmikrut We found the root cause of the slowness here.
What's making the query super slow is the second sort aggregation. It looks like documentdb cannot utilize index for the second sort.
It might be related to this question: https://stackoverflow.com/questions/13466079/aggregate-framework-cant-use-indexes/56427875#56427875
https://github.com/payloadcms/payload/commit/34582da561945cacc53d0c0664b54e63ebc8da25
Is it possible to use some other filters to get the latest version of an article instead of using group?
For example, if we set a flag in versions table latest: true
?
Oops - did not mean to close. I didn't even know I did!
Question - is there a way that you could clone your exact database, run on Atlas (or a local MongoDB Community), and then re-assess the speed here?
We tried in local mongodb. We have about 30k articles and 60k versions. And the api is still slow in local, but much faster than stage env.
Another thing we want to ask is how to log the db queies for each api call (graphql query, api params, response data, response time and etc). We found it really hard to locate the query when we meet such a slow-query situation
@jmikrut Any thoughts about optimizing the query itself?
Hey @Stupidism — I think this is definitely the move.
For example, if we set a flag in versions table latest: true?
The team and I have been discussing a way to do exactly this, and I think this will circumvent your issue.
Give us a bit here and we will get a plan in place to make this or something similar to this happen.
Great! Waiting for your good news.
Hey @Stupidism — what version of local Mongo are you using? Do you have it installed locally on your computer, or are you using Docker or similar?
It's very strange that we can't reproduce this at all. We're looking into fixes still, but I really need to be able to reproduce your exact scenario before being confident in a fix. Still digging, but the more info you can provide about your local MongoDB, the better.
We've also recreated your exact aggregate pipeline directly in Mongo Compass, and it's still quite fast:
Notice that within Compass, we can click on the Explain
button which shows that our updatedAt
index is being used. Is yours being used?
Can you provide us with an export of your local _articles_versions
collection - the one that takes 5 seconds to run - so that we can test locally?
Here's what we've done today:
match
and second sort on createdAt: -1
All of this still runs within ~800ms. Removing limit, skip, and match will reduce the query down to ~200ms but we're still keeping the second $sort
and seeing a fast query.
We will get to the bottom of this but at this point I'd greatly appreciate a copy of your collection that is running slowly on your local computer. If you are willing to share an export of your collection, you can DM me directly on Discord - otherwise you can reach me via submitting this form on our site.
Thanks for hanging in there!
@jmikrut I think the biggest issue we're using AWS DocumentDB for production. We're trying to switch to normal mongodb on our end.
Yes the index is used for local:
But the explain command throws error in stage.
Hmm, ok - so just so I am clear, on local, you are actually seeing pretty good execution time - 785ms, right? Did you change anything? Before I believe you mentioned 5s+ even trying this locally. But this looks about similar to what I am seeing on my end when I test locally. Good that it's using the updatedAt
index.
Let me know and we can go from there. Thanks!!
@jmikrut Bad news is that's still about 5s. API response time is much larger than the db execution time.
I cannot analyze even on my local env easily unless dig into payload source code inside node_modules. This leads back to my question above.
Another thing we want to ask is how to log the db queies for each api call (graphql query, api params, response data, response time and etc). We found it really hard to locate the query when we meet such a slow-query situation
Here's what I can do to analyze it. The result is still about 5s. I cannot see the final query plan.
queryDrafts: 4.643s
aggregate: {
"_pipeline": [
{
"$sort": {
"updatedAt": -1
}
},
{
"$group": {
"_id": "$parent",
"version": {
"$first": "$version"
},
"updatedAt": {
"$first": "$updatedAt"
},
"createdAt": {
"$first": "$createdAt"
}
}
},
{
"$match": {}
}
],
"options": {
"allowDiskUse": true
}
}
aggregatePaginateOptions: {
page: 1,
sort: { createdAt: -1 },
limit: 10,
lean: true,
leanWithId: true,
useEstimatedCount: false,
pagination: true,
useCustomCountFn: undefined,
options: { limit: 10 },
useFacet: true
}
@jmikrut Is it possible to turn off the draft option in list page? I think we can tolerate the inconvenience of cannot finding correct articles with some search words instead of such a slow API response.
I successfully work this around with a patch like this:
diff --git a/node_modules/payload/dist/admin/components/views/collections/List/index.js b/node_modules/payload/dist/admin/components/views/collections/List/index.js
index 6f99e3e..66e2319 100644
--- a/node_modules/payload/dist/admin/components/views/collections/List/index.js
+++ b/node_modules/payload/dist/admin/components/views/collections/List/index.js
@@ -69,7 +69,7 @@ const ListView = (props) => {
(0, react_1.useEffect)(() => {
const params = {
depth: 0,
- draft: 'true',
+ draft: slug === 'articles' ? undefined : 'true',
page: undefined,
sort: undefined,
where: undefined,
Hey @Stupidism — few thoughts for you.
@DanRibbens just brought up to me the useFacet
property. Are you passing mongoOptions.useFacet: false
when you initialize Payload? If so, this may be how your replication is slow and ours is fast. Can you set useFacet: true
?
But another thought - if you want to disable the draft=true
argument, without overwriting the Payload core, you could write a beforeOperation
hook that does the following:
args.operation
to make sure it's the read
operationargs.req.origin
to see if the request is coming from the admin UIargs.id
to see if there is an ID (AKA this is the find
and not findByID
operation). You should only disable draft=true
if it's on the find
operation itselfIf the above is true, you could set args.draft = false
.
In the interim though, I can't figure out how to recreate this in order to fix it and unfortunately I think that means we should close this issue until we can accurately and reliably replicate the performance issue.
Let me know what you think about the above and we will continue to support accordingly!
I had a guess about useFacet as well, but isn't this not working for document db?
Hi @jmikrut, This is the url you can compare the speed:
https://seo-payload.stage.getjerry.com/api/articles?locale=en&depth=0&draft=true&page=1
https://seo-payload.stage.getjerry.com/api/articles?locale=en&depth=0&page=1
curl -o /dev/null -s -w 'Establish Connection: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n' https://seo-payload.stage.getjerry.com/api/articles\?locale\=en\&depth\=0\&page\=1
Establish Connection: 0.000000s
TTFB: 0.000000s
Total: 5.009390s
curl -o /dev/null -s -w 'Establish Connection: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n' https://seo-payload.stage.getjerry.com/api/articles\?locale\=en\&depth\=0\&draft\=true\&page\=1
Establish Connection: 0.007475s
TTFB: 21.427764s
Total: 21.427931s
Hi @jmikrut, This is the url you can compare the speed:
https://seo-payload.stage.getjerry.com/api/articles?locale=en&depth=0&draft=true&page=1
https://seo-payload.stage.getjerry.com/api/articles?locale=en&depth=0&page=1
curl -o /dev/null -s -w 'Establish Connection: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n' https://seo-payload.stage.getjerry.com/api/articles\?locale\=en\&depth\=0\&page\=1 Establish Connection: 0.000000s TTFB: 0.000000s Total: 5.009390s curl -o /dev/null -s -w 'Establish Connection: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n' https://seo-payload.stage.getjerry.com/api/articles\?locale\=en\&depth\=0\&draft\=true\&page\=1 Establish Connection: 0.007475s TTFB: 21.427764s Total: 21.427931s
Hi @jmikrut ,
My team is also having similar issues with one of our collections. With drafts disabled, we get very quick collection list load time but with drafts enabled we are having long load times.
look at args.req.origin to see if the request is coming from the admin UI
look at args.id to see if there is an ID (AKA this is the find and not findByID operation). You should only disable draft=true if it's on the find operation itself
If the above is true, you could set args.draft = false.
This works, but the solution looks like it can break with a payload update in the future and our experience can be broken at some point. If you use a similar data structure to above are you able to mock or reproduce this issue?
Has anyone tested whether https://github.com/payloadcms/payload/releases/tag/v1.15.3 improves query speeds for draft-enabled collections on DocumentDB?
Has anyone tested whether https://github.com/payloadcms/payload/releases/tag/v1.15.3 improves query speeds for draft-enabled collections on DocumentDB?
It should have the same affect on DocumentDB, since it removes the need for the aggregation step when querying drafts!
Sounds promising! Will check it tomorrow, I'll let you know.
Yes, this issue has been a thorn in our side for quite some time now but I am pretty confident that this new approach will 100% solve this issue for all Mongo-like databases from here on out.
It will be defaulted in 2.0, and we wanted to get it out now as a feature flag so you guys don't have to wait 👍 thanks to Jarrod for the quick moves here!
This issue has been automatically locked. Please open a new issue if this issue persists with any additional detail.
I inspected the slow API. If I remove the draft option. I will be very fast.
vs
And it cannot reproduce in my local machine.
And I ran the migration script as the document said. https://github.com/payloadcms/payload/blob/master/CHANGELOG.md#-versions-may-need-to-be-migrated
Originally posted by @Stupidism in https://github.com/payloadcms/payload/discussions/2217