Open bwaidelich opened 1 month ago
Imagine there are three nodes a, b and c in the live (black) and user workspace (blue):
graph TD
root --> a
root --> b
root --> c
root --> a
root --> b
root --> c
linkStyle 3,4,5 stroke:blue;
In live, someone modifies a (so a copy a' is created and the edge is moved), removes b (so the edge is removed) and adds d (so a new node d and the corresponding edge are added). The resulting hierarchy relations are (bold = new edges, dashed = removed edges):
graph TD
root --> a
root --> a'
root --> b
root --> c
root --> d
root --> a
root --> b
root --> c
linkStyle 0,2 stroke-dasharray: 2;
linkStyle 1,4 stroke-width: 2px;
linkStyle 5,6,7 stroke:blue;
To sync the user workspace, we first find and remove all edges that are exclusive to the user workspace:
DELETE
FROM
  cr_default_p_graph_hierarchyrelation
WHERE
  (dimensionspacepointhash, parentnodeanchor, childnodeanchor, contentstreamid)
  IN (
    SELECT
      dimensionspacepointhash, parentnodeanchor, childnodeanchor, contentstreamid
    FROM
      cr_default_p_graph_hierarchyrelation h_source
    WHERE
      h_source.contentstreamid = '<user-cs-id>'
      AND NOT EXISTS (
        SELECT
          h_target.*
        FROM
          cr_default_p_graph_hierarchyrelation h_target
        WHERE
          h_target.contentstreamid = '<live-cs-id>'
          AND h_target.dimensionspacepointhash = h_source.dimensionspacepointhash
          AND h_target.parentnodeanchor = h_source.parentnodeanchor
          AND h_target.childnodeanchor = h_source.childnodeanchor
      )
  );
and get the resulting graph:
graph TD
root --> a'
root --> c
root --> d
root --> a
root --> b
root --> c
linkStyle 3,4 stroke:blue, stroke-width: 2px, stroke-dasharray: 2;
linkStyle 5 stroke:blue;
...and then copy all edges that are exclusive to the live workspace:
INSERT INTO cr_default_p_graph_hierarchyrelation
  (position, dimensionspacepointhash, parentnodeanchor, childnodeanchor, contentstreamid, subtreetags)
SELECT
  position, dimensionspacepointhash, parentnodeanchor, childnodeanchor, '<user-cs-id>' contentstreamid, subtreetags
FROM
  cr_default_p_graph_hierarchyrelation h_target
WHERE
  h_target.contentstreamid = '<live-cs-id>'
  AND NOT EXISTS (
    SELECT
      h_source.*
    FROM
      cr_default_p_graph_hierarchyrelation h_source
    WHERE
      h_source.contentstreamid = '<user-cs-id>'
      AND h_source.dimensionspacepointhash = h_target.dimensionspacepointhash
      AND h_source.parentnodeanchor = h_target.parentnodeanchor
      AND h_source.childnodeanchor = h_target.childnodeanchor
  );
which leads to the graph:
graph TD
root --> a'
root --> c
root --> d
root --> a'
root --> c
root --> d
a
b
linkStyle 3,5 stroke:blue, stroke-width: 2px;
linkStyle 4 stroke:blue;
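The two statements above boil down to two set differences. As a rough illustration (not the projection's actual code), here is the same example replayed with plain Python sets. Edges are reduced to hypothetical `(parent, child)` tuples; the dimension space point hash is left out, and position and subtree tags are ignored just as they are in the comparison columns of the queries above. The `patch_edges` helper is made up for this sketch.

```python
# Edges as (parent, child) tuples; a simplification of the hierarchy relation rows.
live_edges = {("root", "a'"), ("root", "c"), ("root", "d")}
user_edges = {("root", "a"), ("root", "b"), ("root", "c")}


def patch_edges(source, target):
    """Return (to_remove, to_add) so that patching `source` yields `target`."""
    to_remove = source - target  # edges exclusive to the user content stream
    to_add = target - source     # edges exclusive to the live content stream
    return to_remove, to_add


to_remove, to_add = patch_edges(user_edges, live_edges)

# Step 1 corresponds to the DELETE, step 2 to the INSERT ... SELECT.
synced_user_edges = (user_edges - to_remove) | to_add
```

Only the edges for a, b, a' and d are touched here; the shared edge to c is neither deleted nor copied.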
TL;DR: Today we always copy all edges when bringing a workspace in sync, even if it does not contain any changes. It might be a worthwhile performance improvement to only "patch" the existing content stream.
Imagine the following scenario:
With this simplified graph:
(black = live content stream, blue = user content stream)
When node a is modified in the live workspace, a copy is created and the edges are moved like this: (bold = new edges)
As a result, the user workspace is out of date and needs to be synced. Today this is done by creating a new content stream, i.e. copying all edges from the target (live) workspace:
(red = edges for the new user content stream)
In reality this means that loads of edges are now the same in all three content streams.
The resulting events are
Suggestion
Instead of starting a new content stream with the ContentStreamWasForked event we could publish some new event (e.g. ContentStreamWasSynced, which contains the new versionOfSourceContentStream) and then "only" add/remove the edges that are affected:
And the resulting events:
Of course this can only work if the content stream itself does not contain any changes.
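To make that precondition concrete, the decision could look roughly like the sketch below. Everything in it is an assumption for illustration: the event classes are simplified Python stand-ins, the payload of the proposed `ContentStreamWasSynced` event is guessed from the description above, and `sync_event` is a hypothetical helper, not existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentStreamWasForked:
    # Simplified stand-in for the existing event.
    new_content_stream_id: str
    source_content_stream_id: str
    version_of_source_content_stream: int


@dataclass(frozen=True)
class ContentStreamWasSynced:
    # Proposed event; field names are assumptions.
    content_stream_id: str
    source_content_stream_id: str
    new_version_of_source_content_stream: int


def sync_event(user_cs_id, live_cs_id, live_version, user_has_changes, new_cs_id):
    """Pick the event to publish when the user workspace is out of date."""
    if user_has_changes:
        # Pending changes: keep today's behaviour and fork a new content stream.
        return ContentStreamWasForked(new_cs_id, live_cs_id, live_version)
    # No pending changes: the existing content stream can be patched in place.
    return ContentStreamWasSynced(user_cs_id, live_cs_id, live_version)
```

For example, `sync_event("user-cs", "live-cs", 5, False, "user-cs-2")` yields a `ContentStreamWasSynced` for the existing `"user-cs"`, while passing `True` falls back to a fork into `"user-cs-2"`.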
Considerations
Most/all places that currently react to ContentStreamWasForked events (currently that is the ContentGraph-, ContentStream- and AssetUsage-projection) need to also handle the new ContentStreamWasSynced (or similar) event. Furthermore, the WorkspaceWasRebased event is no longer published in these cases, so the Neos content cache flusher probably needs to handle the new event, too.

The most complex part is probably the actual performance optimization to create only the missing edges. In a first implementation we could simplify this by always removing and re-creating all edges, which would allow the main performance gain to be added later in a non-breaking manner.
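A minimal sketch of that simplified first implementation, assuming a toy in-memory projection instead of the real hierarchy relation table. The class and method names are hypothetical and edges are again reduced to `(parent, child)` tuples.

```python
from collections import defaultdict


class ToyGraphProjection:
    """In-memory stand-in for the hierarchy relation table (hypothetical)."""

    def __init__(self):
        # content stream id -> set of (parent, child) edges
        self.edges = defaultdict(set)

    def when_content_stream_was_forked(self, new_cs_id, source_cs_id):
        # Existing behaviour: copy all edges into a new content stream.
        self.edges[new_cs_id] = set(self.edges[source_cs_id])

    def when_content_stream_was_synced(self, cs_id, source_cs_id):
        # Simplified first implementation: remove all edges of the existing
        # content stream and re-create them from the source.
        self.edges[cs_id].clear()
        self.edges[cs_id].update(self.edges[source_cs_id])


projection = ToyGraphProjection()
projection.edges["live"] = {("root", "a"), ("root", "b"), ("root", "c")}
projection.when_content_stream_was_forked("user", "live")

# Live changes as in the example: a is modified, b removed, d added.
projection.edges["live"] = {("root", "a'"), ("root", "c"), ("root", "d")}
projection.when_content_stream_was_synced("user", "live")
```

The end state is the same as with a real patch, so the handler body could later be swapped for the add/remove-only variant without changing the event itself.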
Related: #4388