Closed blueo closed 7 months ago
@blueo Since we have two PRs for this module, I wonder which one should cross the finish line first. That might mean the other one will have to rebase, as I also used tests/Fake/PageFake.php in my PR.
I think I'll update the title for this. After a bit of discussion with @ssmarco, it looks like there is a better way to solve this. The logic for passing a live object to the indexer should live with the DataObjectDocument but (unlike changing the _unserialize function) we need the context of whether an object is being added to or removed from an index. This is to accommodate removing deleted items: they need to present the 'old' version of a data object, whereas when adding to the index we want to ensure an item is live.
A possible fix is via the indexer, e.g.

    if ($document instanceof DataObjectDocumentInterface) {
        // Make sure we get the Live version of the DataObject before indexing the document
        Versioned::withVersionedMode(static function () use ($document): void {
            Versioned::set_stage(Versioned::LIVE);
            $dataObject = $document->getDataObject();
            $liveDataObject = DataObject::get($dataObject->ClassName)->byID($dataObject->ID);
            $document->setDataObject($liveDataObject);
        });
    }

could be added prior to adding a document in src/Service/Indexer.php.
Another is to update the onAddToSearchIndexes function, which is called prior to adding a document to the index.
@blueo I have retested this manually and it is working as expected.
This change tries to provide a consistent way of ensuring a DataObject can only be 'Live' when it is added to the index, while it may be in another state when it needs to be removed. It does this by using the BEFORE_ADD event to re-fetch the DataObject in 'live' mode. This means a non-live DataObject can be given to DataObjectDocument for use when removing it or otherwise checking whether an object should be indexed. However, when adding to an index, it will always use a live object.
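The re-fetch flow described above can be sketched roughly as follows. This is a self-contained illustration, not the module's code: FakeStore and FakeDocument are hypothetical stand-ins for the versioned stages and DataObjectDocument, and the constant only mirrors the BEFORE_ADD name. What should happen when no live row exists is not covered in this discussion, so the sketch simply leaves the record untouched in that case.

```php
<?php

// Hypothetical stand-in for the two stages of a versioned table.
final class FakeStore
{
    /** @var array<string, array<int, array<string, mixed>>> rows keyed by stage, then ID */
    public array $rows = ['Stage' => [], 'Live' => []];

    public function fetch(string $stage, int $id): ?array
    {
        return $this->rows[$stage][$id] ?? null;
    }
}

// Hypothetical stand-in for DataObjectDocument.
final class FakeDocument
{
    public const BEFORE_ADD = 'before_add';

    private FakeStore $store;
    private array $record;

    public function __construct(FakeStore $store, array $record)
    {
        $this->store = $store;
        $this->record = $record;
    }

    public function getRecord(): array
    {
        return $this->record;
    }

    // On BEFORE_ADD, swap whatever record we were given for the live one.
    // Any other event leaves the record as it was handed in.
    public function onAddToSearchIndexes(string $event): void
    {
        if ($event !== self::BEFORE_ADD) {
            return;
        }
        $live = $this->store->fetch('Live', $this->record['ID']);
        if ($live !== null) {
            // Assumption: with no live row, the record is left untouched.
            $this->record = $live;
        }
    }
}

$store = new FakeStore();
$store->rows['Stage'][1] = ['ID' => 1, 'Title' => 'Draft title'];
$store->rows['Live'][1] = ['ID' => 1, 'Title' => 'Published title'];

// The document is built from the draft row, as a queued job might do.
$document = new FakeDocument($store, $store->rows['Stage'][1]);
$document->onAddToSearchIndexes(FakeDocument::BEFORE_ADD);
echo $document->getRecord()['Title'], PHP_EOL; // prints "Published title"
```

In the module itself the re-fetch would presumably go through Versioned::withVersionedMode and Versioned::set_stage, as in the indexer snippet earlier in this thread.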
Key Changes
- shouldIndex has been changed to use isPublished instead of isLiveVersion. This means you can pass a DataObject that is not live and checks will still allow it to be indexed if there is a 'live' version available. There is a current edge case where you could have a published object but the 'draft' version is the most recent - in this case a dataobject will be removed from the index even though it has a live version.
- onAddToSearchIndexes has had logic added to the BEFORE_ADD event to refetch a document in 'live' mode. This ensures only published content will go to an index, as this event is called immediately prior to processNode in the indexer.
Notes
Initially I thought this would allow indexing of draft content. Queues are normally run via a "dev task" which is run in the DevelopmentAdmin context. This controller will set the reading mode to DRAFT. The upshot of this is that an Index job will unserialise a DataObject in DRAFT reading mode (saved as a Data Query param) and the subsequent toArray call could get draft content. This is often seen as a link indexed with ?stage=Stage parameters. However, because shouldIndex was checking that it was the Live version, this content would always have been the same as the published object.