DependencyNotifyingDomFacade is not returning all nodes

JoernT commented 2 years ago

DependencyNotifyingDomFacade should report all dependent nodes for a given context and expression, but it fails to report the @selected node in the following example:

<fx-fore>
    <fx-model>
        <fx-instance src="facets.xml">
            <facets>
                <facet dimension="type" count="25" selected="false">poem</facet>
                <facet dimension="type" count="21" selected="false">prose</facet>
            </facets>
        </fx-instance>
        <fx-instance id="vars">
            <data>
                <running>false</running>
            </data>
        </fx-instance>

        <fx-bind ref="facet" relevant="instance('vars')/running != 'true' or @selected='true'">
        </fx-bind>
    </fx-model>
</fx-fore>

As a result the dependencyGraph is missing nodes: it creates a vertex for each 'facet' node and one for the 'running' node, but none for the '@selected' nodes.

JoernT commented 2 years ago

The problem here seems to be the 'or' in the expression: when it is changed to 'and', the additional '@selected' nodes are added.
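
A minimal sketch of why evaluation-based tracking can miss the node, assuming the evaluator short-circuits 'or' once the left operand is true. The `createTracker` helper and the flat `data` object are hypothetical stand-ins for illustration, not the actual DependencyNotifyingDomFacade API:

```javascript
// Hypothetical stand-in for a dependency-recording facade: every read is logged.
function createTracker(data) {
  const touched = [];
  return {
    touched,
    read(name) {
      touched.push(name);
      return data[name];
    },
  };
}

// Values taken from the example instances above.
const data = { running: 'false', '@selected': 'false' };

// Mirrors: instance('vars')/running != 'true' or @selected='true'
const withOr = createTracker(data);
const orResult =
  withOr.read('running') !== 'true' || withOr.read('@selected') === 'true';

// Same operands joined by 'and'
const withAnd = createTracker(data);
const andResult =
  withAnd.read('running') !== 'true' && withAnd.read('@selected') === 'true';

console.log(withOr.touched);  // [ 'running' ]
console.log(withAnd.touched); // [ 'running', '@selected' ]
```

With 'or', the left operand is already true, so the right operand is never read and never recorded. With 'and', the right operand has to be read to decide the result.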

JoernT commented 2 years ago

This hits a problem with the approach. To get the referenced nodes we have to evaluate the expression, though we would rather have a static analysis. So we pay the cost of building the graph even before a recalculation happens.

DrRataplan commented 2 years ago

I have tried something out with using more static analysis on XPaths. That would at least give us fewer moments where we need to recalculate the dependencies. Instead of always having to recalculate (assuming the dependencies have been touched), we'll only have to recalculate when the document structure is changed. This should shave off a tremendous amount of time.

Feel free to check out my small spike at https://github.com/DrRataplan/xpath-dep-scan.

I'll spend some time soon to see how we can improve the data model and the granularity. Thinking of something like

{
  type: "attribute",
  localName: "attrName",
  namespaceURI: null,
  ownerElement: ...
} | 
{
  type: "data",
  element: ...
}

And I also need to spend some time on branches within predicates. If we're going to allow @first or @second, @*[self::first or self::second] is just waiting to throw a tantrum.

JoernT commented 2 years ago

the static analysis is needed during rebuild - just creating the recalculation main graph. Ideally without the need to evaluate at that stage. For that ideally we would get all referenced nodes in an expression regardless of where they occur (whether inside a function, an 'if' or predicate) - not sure if that's possible.

if a one-time evaluation is needed we can probably skip or use the results in first recalculate? (just a thought)

During recalculate there are 2 situations:

we have no changed nodes so the full main graph needs to be recalculated
we got changed nodes (already have that list) and will create a subgraph of the maingraph containing all changed nodes plus their dependencies which in turn are re-calculated. This makes sure that only 'touched' branches of the data are recalculated. During subgraph creation we can make sure that nodes are only added once to the graph (in case of overlapping graphs).
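
The second situation could be sketched roughly like this. It assumes the main graph is available as a plain `Map` from each node to the nodes that depend on it. That shape and the `buildSubgraph` name are hypothetical and only for illustration:

```javascript
// mainGraph: node -> list of nodes that depend on it (hypothetical shape).
// Returns every changed node plus all of its transitive dependants, each once.
function buildSubgraph(mainGraph, changed) {
  const visited = new Set();
  const queue = [...changed];
  while (queue.length > 0) {
    const node = queue.shift();
    if (visited.has(node)) continue; // overlapping graphs: add each node only once
    visited.add(node);
    for (const dependant of mainGraph.get(node) || []) {
      queue.push(dependant);
    }
  }
  return visited;
}

const mainGraph = new Map([
  ['a', ['b', 'c']],
  ['b', ['d']],
  ['c', ['d']],
  ['d', []],
  ['e', ['f']],
  ['f', []],
]);

const sub = buildSubgraph(mainGraph, ['a', 'b']);
console.log([...sub]); // [ 'a', 'b', 'c', 'd' ]
```

The untouched branch ('e' and 'f') stays out of the subgraph. The sketch only collects the node set; ordering the nodes for recalculation is left out.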

I'll spend some time soon to see how we can improve the data model and the granularity. Thinking of something like

{ type: "attribute", localName: "attrName", namespaceURI: null, ownerElement: ... } | { type: "data", element: ... }

What exactly are you referring to with "data model and granularity"?

JoernT commented 2 years ago

ah - will have to test your code - thanks for that. I see some mods and the tests look good. Will check it out.

DrRataplan commented 2 years ago

the static analysis is needed during rebuild - just creating the recalculation main graph. Ideally without the need to evaluate at that stage. For that ideally we would get all referenced nodes in an expression regardless of where they occur (whether inside a function, an 'if' or predicate) - not sure if that's possible.

That is possible. I did something with extracting all sub-paths from the expression, evaluating them one by one to get hold of the instances of the nodes the expression may reference. There are still some caveats, like FLWOR expressions which may start binding different contexts. Those are more difficult but not impossible. And they're more rare anyway.
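
A rough sketch of the "extract all sub-paths" idea, assuming the expression is already available as a tree. The AST shape below is hand-built and hypothetical; it is not the output of any real XPath parser:

```javascript
// Hypothetical, hand-built AST for:
//   instance('vars')/running != 'true' or @selected = 'true'
const ast = {
  type: 'or',
  children: [
    {
      type: 'compare',
      op: '!=',
      children: [
        { type: 'path', text: "instance('vars')/running" },
        { type: 'literal', value: 'true' },
      ],
    },
    {
      type: 'compare',
      op: '=',
      children: [
        { type: 'path', text: '@selected' },
        { type: 'literal', value: 'true' },
      ],
    },
  ],
};

// Collect every path node, regardless of which operator, function or
// predicate it sits in. Nothing is evaluated here, so nothing is skipped.
function collectPaths(node, out = []) {
  if (node.type === 'path') out.push(node.text);
  for (const child of node.children || []) collectPaths(child, out);
  return out;
}

const paths = collectPaths(ast);
console.log(paths); // [ "instance('vars')/running", '@selected' ]
```

Each collected path would then be evaluated on its own against the context to find the concrete nodes it touches. Constructs that rebind the context, such as FLWOR expressions, are not covered by this sketch.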

if a one-time evaluation is needed we can probably skip or use the results in first recalculate? (just a thought)

That is a possible solution, and it is the most straight-forward one. It's the way to get the minimal dependency chain. The behaviour will be equal to the current one though; it just saves us an evaluation cycle but the dependencies will be very minimal.

Noting this down also for myself because the XForms spec is still foreign to me:

Reading up on the spec (https://www.w3.org/community/xformsusers/wiki/XForms_2.0#Dependencies), this actually makes sense. Even though the or operator is not mentioned, the spec does say this: "Element capital is not considered to be referenced by this expression. Although a test for capital appears in the expression, the evaluation of the expression did not proceed due to the rejection of country by the filter expression." I'd argue that the or expression behaves the same way as a filter: the @selected is not a dependency.

It is, however, a reference as described here: "An expression references a node of instance data if the node is selected during the evaluation of the expression, even if it is subsequently excluded from further participation in the expression evaluation (for example by a filter expression). (The selection mechanism is defined in the expression module.)" These references are used in the confusingly named depList in https://www.w3.org/community/xformsusers/wiki/XForms_2.0#Creating_the_Master_Dependency_Directed_Graph. To get hold of the referenced nodes we should take all the separate paths and determine which nodes they would touch when executed. Evaluating them is the easiest way to do that :wink:. Doing it this way, we can reuse the dependency graph for longer, until the 'structure of the document changes': when any referenced node has a substantial change, the referenced nodes need to be recomputed. In the spec example that would happen when one changes the name attribute of the second continent: new elements in that continent element may start to match filters, so they start being referenced. In short: if any referenced node changed, the list of referenced nodes must be recomputed. But if anything unrelated changed, there is no need to recompute, EVEN THOUGH THE RESULT (aka the value of the node that is returned at the end) MIGHT HAVE CHANGED.
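
The reuse rule from the paragraph above could look roughly like this. Plain objects stand in for DOM nodes, and `needsRescan` is a hypothetical name for illustration:

```javascript
// Hypothetical helper: decide whether the cached list of referenced nodes
// is stale. It is stale only if one of the referenced nodes itself changed.
function needsRescan(referencedNodes, changedNodes) {
  for (const node of changedNodes) {
    if (referencedNodes.has(node)) return true;
  }
  return false;
}

// Plain objects standing in for instance data nodes.
const continentName = { label: 'name attribute of the second continent' };
const unrelated = { label: 'some node no expression references' };
const referenced = new Set([continentName]);

console.log(needsRescan(referenced, [unrelated]));     // false
console.log(needsRescan(referenced, [continentName])); // true
```

A change to an unrelated node keeps the cached references valid, even if the values computed from them have to be recalculated.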

During recalculate there are 2 situations:

we have no changed nodes so the full main graph needs to be recalculated

we got changed nodes (already have that list) and will create a subgraph of the maingraph containing all changed nodes plus their dependencies which in turn are re-calculated. This makes sure that only 'touched' branches of the data are recalculated. During subgraph creation we can make sure that nodes are only added once to the graph (in case of overlapping graphs).

That is clear. Having changed no nodes is supposed to be a rare event right? That is why it pays off to keep hold of that list of referenced nodes.

I'll spend some time soon to see how we can improve the data model and the granularity. Thinking of something like
{
  type: "attribute",
  localName: "attrName",
  namespaceURI: null,
  ownerElement: ...
} |
{
  type: "data",
  element: ...
}
What exactly are you referring to with "data model and granularity"?

I was thinking of making the dependency list granular: holding absent attributes in it and distinguishing between depending on an element's attributes, its child list, or just its data. But reading the spec more thoroughly I see this is not needed at all. Just a Set of Nodes is fine.

JoernT commented 2 years ago

@DrRataplan I'm actually getting quite close to a first, more complete implementation. I've set up another array in the model to hold changed ModelItems.

When it comes to recalculate and the array is non-empty, it will be used to build the subgraph. Using the depGraph API it's easy to get the dependants. It's just a matter of creating a subgraph and adding the list of dependants to it for each changed modelItem. depGraph will already ignore duplicates when using addNode().

Created a simple demo to ensure only the touched subgraph would be used, without actually switching the processing, but I guess it's just another hour or two to get it going.

Jinntec / Fore

DependencyNotifyingDomFacade is not returning all nodes #82
