Open robbieaverill opened 5 years ago
I'm going to try out a POC
Another option is that we could add some config to disable back link tracking and dependent pages entirely. Large data sets make the CMS completely unusable by processing all of this data on every page load (in the CMS).
This seems to be working OK: https://gist.github.com/robbieaverill/9a185078f1883daf30ff039e00c504b4 - we lose the identifier string for the type of relationship, but that could be added back in in the GridField that renders this list if we want to (by checking the class type)
ping @silverstripe/core-team for feedback on whether you think it'd be acceptable to change this in 4.3 or not
Isn't this an issue because back links aren't necessarily pages? I think that's why it's using the DependentLinkType
attribute right?
I think that that this would be okay but really, the lesson here is that our public API should have been declared to be SS_List not ArrayList. We need to be much more careful about not being overly restrictive in how we declare our public API.
This also applies to, for example, the choice of private vs protected vs public members.
So the typing hinting should be changed to be SS_List, not DataList.
Isn't this an issue because back links aren't necessarily pages? I think that's why it's using the DependentLinkType attribute right?
If that's the case then I guess we need to follow up on the todo in BackLinkTracking to implement a polymorphic many many list
I've explored re-factoring DependentPage while looking at https://github.com/silverstripe/silverstripe-cms/issues/2276#issuecomment-426144450
That was my thought at the time. We decided to put that on the back burner because of time pressure, but I think all of my point are still relevant ... except maybe the "put your head in the sand" bit.
Problems with Dependent Pages logic
I've been digging on how
SiteTreeLink
andSiteTree::DependentPage
have been implemented.
SiteTreeLink
has a PolymorphicParent
has-one relation and aLinked
has-one relation to SiteTree.
SiteTree::DependentPageloops through all the
SiteTreeLink.Linked` for the current page, then all the VirtualPage, then all the RedirectorPage ... then it sticks everything in an ArrayList.There's some somewhat weird logic there that leads to weird behavior.
- The DependentPage GridField edit links is broken because it assume every entry will be a SiteTree.
- The Same entity can appear in there multiple times when using content blocks.
- DependentPage doesn't appear to be confident its SiteTreeLinks exist because it manually loop through the list to make sure.
- Sorting the GridField is painfully slow because it's using an ArrayList ... I came across an example where this would take 2.8 minutes in our key project. There's no way to effectively sort by URL because that value is computed in the application.
Possible solutions
Disable AbsoluteLink
Whatever we do, there's no way of sorting a list of SiteTreeLink by AbsoluteLink without loading all of the Parent objects and computing their AbsoluteLink in the application layer.
Refactor SiteTreeLink
I would ditch the polymorphic relation on SiteTreeLink. Any DataObjects that wants to say that it dependents on SiteTree for something would need to add its owner page to the SiteTreeLink relation (as opposed to itself). That would include
RedirectorPage
andVirtualPage
. Might need to include some sort of qualifier column here. e.g: Redirect, Content block, Content Link, Virtual Page.SiteTree would need some sort of hook or method that can call its children to ask them if they have a dependencies on certain pages (there's already something similar there). When saving a content block, it will probably need to trigger that hook on the parent page. We would also need to cascade delete SiteTreeLink when a SiteTree is archived.
Refactor SiteTreeLink and move the polymorphic relation to a many-many relation
That's similar to the previous solution, however we track each individual object attached to our SiteTreeLink via a polymorphic many-many relation.
e.g.:
SiteTreeLink: Linked: Page A Parent: Page B DirectDependents: - Page B itself beause it has a link in its WYSIWYG to Page A. - ContentBlock B that is own by Page B because it has a WYSIWYG link to Page A.
Each possible DirectDependents is responsible for creating the SiteTreeLink if necessary and adding/removing a reference to itself from DirectDependents. We'll know a SiteTreeLink is no longer needed when all its direct dependents are gone.
This "put your head in the sand" non-solution
The two previous solutions imply some API change that will probably break semver. They also imply having some sort of migration task to update the existing data.
I don't think there's any way to keep the current functionality, make it performant and not break API.
The best thing I can think of is remove AbsoluteLink from the GridField or save it on SiteTreeLink (which would involved keeping it up-to-date somehow, which would not be very consistent).
We could add some sort of AJAX action that only fetch the link on click somehow, if we want to allow CMS authors to access the AbsoluteLink.
Hi @chillu, should @maxime-rainville's points, mainly the
The DependentPage GridField edit links is broken because it assume every entry will be a SiteTree.
be extracted into a separate issue? Or changing the type of this issue to a bug?
That specific thing is a bug causing 500 error anytime a CMS user visits the Dependent pages tab in SS4 where a page is linked from a content block (as it's trying to fetch a subsite details for blocks that are not using the subsite extension — default CWP recipe).
Not sure I came across the subsite issue before. That's going to be a different problem.
Do you want to raise a separate issue.
@michalkleiner Yep separate issue for broken edit links please
Hello @robbieaverill
Getting this issue on CWP now, was there a solution to this?
Thanks
Not as of yet, as far as I know
This got me again on a CWP project, grrr.
New issue for the subsites problem: https://github.com/silverstripe/silverstripe-cms/issues/2554
Version: SS 4.3.x-dev
The fact that
SiteTree::DependentPages()
generates an ArrayList means that it's executed whenever it runs. This is a major problem for high volume websites, particularly because this is called every time you callgetCMSFields()
on a page.I really think we need to modify this functionality so it can return a lazily executed DataList instead.