Open haplo opened 5 years ago
So the problem with PagePermissionHelper.get_valid_parent_pages
lies in generating a queryset with hundreds or thousands of OR
operators. There are currently three uses for this method:
.exists()
over the queryset).count()
over the queryset)I propose getting rid of the queryset and instead generating and returning an iterator. That way code currently using exists
can just check if there is at least one element, code using count()
can just do len()
and the page chooser can cast the iterator into a list if necessary.
If someone can give me the OK for this change I will work on the implementation right away.
Hi @haplo,
Thank you for reporting, and for looking so deeply into the performance here. Anything you can do to help improve things would be greatly appreciated, thanks :)
I thought it might be helpful to first provide a little more information about the two main causes for get_valid_parent_pages()
inefficiency (in case it's of any help to you or anyone else that stumbles across this issue):
The number of potential parent pages is large and varied in type. This can usually be drastically improved by setting the parent_page_types
attribute on the model being listed. Typically, ModelAdmin
classes are used for things like events, articles, or other pages that typically live under a small number of listing pages, so setting parent_page_types
usually makes a lot of sense (and will also result in a much more useful set of choices in the ChooseParentView
when adding a new item)
The user has many separate 'add' permissions in different parts of the page tree. Sometimes this is unavoidable or difficult to change in an established project, but this can also be helped by having separate groups for users with really wide-ranging permissions, rather than adding them to lots of groups with very specific permissions in different parts of the tree. (If efficiency in get_valid_parent_pages()
can be improved anywhere, I'd suspect it would be here!)
Thank you for your response @ababic.
The parent_page_types
are set and limited, so I don't think that's the cause for the slowness. This is also supported by the profiling, as most of the time is gone in __or__
queryset chaining, which is done later in get_valid_parent_pages()
.
It's true that some of my users have a lot of add permissions on pages and the situation could probably be simplified. I'm going to profile again after removing the permissions, see how big an impact that makes.
On an unrelated note, would you get someone to review wagtail/wagtail#5311 and wagtail/wagtail#5314? Those optimizations are ready to go and would be great to have them in the next release.
@haplo. Just wanted to let you know: I raised the matter of getting your PRs reviewed in the last core team meeting, and someone will get to them soon. If you do manage to find a way to optimise this part of modeladmin, I'll glady review that for you personally.
Issue Summary
I have been profiling the Wagtail admin views because my users experience some very slow load times (10-30 seconds). I implemented some big optimizations already (see wagtail/wagtail#5311 and wagtail/wagtail#5314), and when I profiled the index view I found out that a lot of the time goes into
PagePermissionHelper.get_valid_parent_pages
because it builds a massive query with hundreds or thousands ofOR
operators.See the profiling for
WMAView.dispatch
:I propose a solution in the next message.
Steps to Reproduce
I have confirmed that this issue can be reproduced as described on a fresh Wagtail project: no, I profiled over a local Wagtail 2.4 with a snapshot of production data.
Technical details