Closed ssmarco closed 7 months ago
Thanks, @ssmarco!
Adding a note for when we implement a fix for this:
SiteTree
has a config enforce_strict_hierarchy
, which is what controls the unpublishing action on the child pages when a parent is unpublished. We should check that this is still set to true
before we purge any child pages during this parent page unpublishing action.
Can confirm that this is an issue - though probably one that should be fixed in framework or versioned instead.
I think this module is doing the right thing by performing the "removeFromIndexes" behaviour in its onAfterUnpublish()
extension hook.
The problem is really that child pages which get unpublished don't get that extension hook called on them, which is most likely down to the convoluted way cascading publish/unpublish events end up getting performed.
Thanks @GuySartorelli
Agree with your point. I did some investigation as well during the time of this PR, and I thought that the approach of cascade unpublishing of children pages was for performance reasons hence onAfterUnPublish()
was not called on each child page.
The content of the index should basically contain the same pages as a public user can see/navigate to, so we should keep them in sync. I guess during query time there's a canView check done for each result so we won't present unwanted content, but it'd be best if it even wasn't in the index in the first place.
I guess during query time there's a canView check done
It depends which search implementation you're using. If you're using (say) the JS libraries supplied by Elastic, then your client side is communicating directly with your Elastic engine, so there's no canView
(or any other Silverstripe framework) interaction going on.
We had a similar issue to this with the static publisher module (child pages not having their cache purged if the parent is unpublished). The solution there was to fetch and purge child pages as part of the un-publish that was actioned on the parent. Noting again that this only happens if enforce_strict_hierarchy
is set to true
.
Even if you do the search request from PHP and then perform a canView()
check before sending results to the frontend, you'll end up with the same problems with pagination that you get elsewhere with Silverstripe CMS (i.e. your page 1 shows results "1-10 of 20", but only 3 pages can be seen because the others fail permission checks)
I did some investigation as well during the time of this PR, and I thought that the approach of cascade unpublishing of children pages was for performance reasons hence
onAfterUnPublish()
was not called on each child page.
That's probably accurate, yeah. It'd probably be worth checking what sort of performance hit you'd be taking by calling the hook on each child, but I can't imagine it'd be too bad in most cases.
In the CMS, unpublishing the parent page also unpublishes the children automatically. However, the same behaviour does not seem to sync when it comes to removing them from Elastic. In this situation, the search results from Elastic will include the supposedly unpublished content. This looks like it affects version 2 and 3 but not tried on v1
Here are steps below to replicate this issue.