Closed adrhumphreys closed 5 years ago
Yeah, this is an inherent flaw with using the ->filterByCallback()
method on DataList, since it internally creates an ArrayList and pushes each record in the DataList if the callback returns truthy. This removes any ability to lazily execute a query (incl. pagination and any other filters applied after the callback).
You can increase your memory limit, or try and find another way to write your query - e.g. in this example using a Versioned method to generate your DataList instead
I'll look at working around it in my current use case but I think it should be something fixed in the core as opposed to people fixing the CMS filter code.
cc @silverstripe/core-team - we'd need to move filterByCallback from processing and returning an ArrayList to a DataList somehow so it stays as lazily executed when used on DataList
The above sounds like an awesome improvement my thoughts were more to just make filters such as CMSSiteTreeFilter_PublishedPages more performant by removing the filterByCallback and putting something else in place
I double checked with our beloved key project. They say they have a workaround. We probably should address this with some urgency however.
Lowering priority since this has presumably existed since the dawn of time in SilverStripe (quite likely since 2.2 hah)
we'd need to move filterByCallback from processing and returning an ArrayList to a DataList somehow so it stays as lazily executed when used on DataList
I don't really see how this could be achieved. I would say that it's probably the responsibility of the developer to be realistic with their usage of filterByCallback
. Filtering should be done by attributes as much as possible and then filtered by callback as a last resort (with the knowledge that this will fetch all the records and loop them).
Isn't this more of a docs change?
In the mentioned case (filtering by published) there's a pretty efficient way of doing this with SQL (WHERE EXISTS(SELECT * FROM table_Live l WHERE l.ID = table.ID)
off the top of my head), but there's no way we could determine what SQL to add from the given filterByCallback
call.
The problem here is that the developers (us) are using the filter by callback on the CMSSiteTreeFilter_PublishedPages
class and its not a realistic use case ;)
We should fix this instance to increase the speed of our default search and then look at the way filter by callback works elsewhere (if we do this at all). This is also especially prevalent in sites which are using Elemental as it does the same thing except over a much larger set of data.
@chillu I'd say this is a pretty high impact as it effects basic CMS quite badly. The filter by callback change would definitely be a lower priority though.
My understanding is that filtering by published pages is an advanced opt-in under "page status", so it only affects a small subset of searches performed by CMS authors. And this has been the same behaviour for nearly a decade. I agree it should be fixed, but this opt-in limits the impact.
Isn't this more of a docs change?
@ScopeyNZ that depends whether it's something that happens in core via the CMS UI or not 😛
It's code in the core class itself not developer code. It could very simply be adjusted to a raw sql condition. I suggest taking @ScopeyNZ 's suggestion and adding an inner join to the datalist against the _Live
table. :)
CMSSiteTreeFilter_PublishedPages is actually turning the datalist into an arraylist unnecessarily too. The result could remain a datalist for further filtering.
@chillu I've moved this to peer review since it's important to our beloved key project.
When you have a site with 17,000 pages (number of pages on the site I am testing, not a bottom limit) then you try to filter all the pages by just the
CMSSiteTreeFilter
e.g.CMSSiteTreeFilter_PublishedPages
it will exhaust your allocated memory limit (most likely)This is because we get every page on the site (17,000) and then apply the below code:
This means we end up with an array that slowly grows as we check if each page individually has been published.
Parent issue