orbeon / orbeon-forms

Orbeon Forms is an open source web forms solution. It includes an XForms engine, the Form Builder web-based form editor, and the Form Runner runtime.
http://www.orbeon.com/
GNU Lesser General Public License v2.1

Improved XForms dependency engine #1192

Open ebruchez opened 11 years ago

ebruchez commented 11 years ago

Rationale

Until now, our XForms dependency engine has worked in a pull fashion: we go through all the binds, MIPs, control bindings, and values, and for each one we ask the dependency engine whether the associated XPath expression should be evaluated.

A much, much better way would be to be able to say: here is a new input in the system, such as a value entered by the user. Now, tell me everything I need to update given this new input. This is very much like spreadsheets, of course. And it is also the original intent of the XForms dependency algorithm ("Recalculation Sequence Algorithm").
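As a rough illustration of the push idea (not Orbeon's implementation), the sketch below takes a changed input, collects everything that transitively depends on it, and orders only that subgraph topologically. This is similar in spirit to the spreadsheet-style recalculation mentioned above. The function name `affected_in_order` and the node names are hypothetical.

```python
from collections import deque


def affected_in_order(dependents, changed):
    """Return the nodes to recompute after `changed` is updated, in dependency order.

    `dependents` maps a node to the nodes whose expressions read it.
    """
    # 1. Collect everything reachable from the changed node.
    reachable = set()
    stack = [changed]
    while stack:
        node = stack.pop()
        for dep in dependents.get(node, ()):
            if dep not in reachable:
                reachable.add(dep)
                stack.append(dep)

    # 2. Topologically sort the reachable subgraph (Kahn's algorithm).
    indegree = {n: 0 for n in reachable}
    for n in reachable:
        for dep in dependents.get(n, ()):
            if dep in indegree:
                indegree[dep] += 1

    queue = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for dep in sorted(dependents.get(n, ())):
            if dep in indegree:
                indegree[dep] -= 1
                if indegree[dep] == 0:
                    queue.append(dep)

    # Nodes left unsorted can only be part of a cycle.
    if len(order) != len(reachable):
        raise ValueError("circular dependency among affected nodes")
    return order


# Hypothetical form: subtotal reads price and quantity, and so on.
DEPENDENTS = {
    "price": ["subtotal"],
    "quantity": ["subtotal"],
    "subtotal": ["tax", "total"],
    "tax": ["total"],
}

print(affected_in_order(DEPENDENTS, "price"))  # ['subtotal', 'tax', 'total']
print(affected_in_order(DEPENDENTS, "tax"))    # ['total']
```

Only the expressions in the returned list need to be evaluated, which is the contrast with asking about every bind, MIP, and control on each pass.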

The gist

The idea suggested here would be to go further, and to blur the boundaries between:

We would like a single dependency graph handling the updates across those boundaries.

Now, the question is how to achieve this. The XForms specification is in a sense very precise about model dependencies, but on the other hand is very imprecise about how exactly to calculate them. For example, conditionals are a problem. Also, it doesn't take into account inter-model dependencies, or dependencies of the view on models.

One hope that we have would be to be able to use static analysis to infer enough information to be useful. What this would do is establish a kind of partial order, as well as limiting the number of expressions to evaluate.

(Clearly, using some kind of dynamic analysis instead of, or in addition to, static analysis would provide more information than static analysis alone. But it is also harder to specify and to implement. There are issues such as the order in which to analyze expressions initially, and how to deal with conditionals. Also, we have yet to be convinced that it brings enough benefits. If it turns out that it does, it can be done in a second phase.)

We already have a mechanism to create projections of XPath expressions. So the idea at this point is to evaluate whether we could go far enough with it.

Next steps

A clear first step is to do the following: take the collection of XPath expressions that we already analyze in the model and view, and attempt to produce an all-encompassing graph. Then we must look at what kind of graphs are produced with forms that we or our customers have, and obtain a confirmation that this is the right way to go.

If so, the next step would be to use the graph at runtime to perform updates.

Scenarios handled

TODO: computation in grid

ebruchez commented 10 years ago

A few more thoughts:

ebruchez commented 10 years ago

2014-02-19: Brainstormed circular dependencies in the context of static analysis. The thinking is that, hopefully, in most useful scenarios, we can use an algorithm that performs updates until they converge.

Examples:

There is some thinking that, ideally, the system should still try to encompass models and view.
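The "update until convergence" idea from the 2014-02-19 note can be sketched as a bounded fixed-point loop. This is only an illustration under simple assumptions: calculations are plain functions over a dictionary of values, and the name `recalculate_until_stable`, the field names, and the `max_passes` limit are all hypothetical.

```python
def recalculate_until_stable(values, calculations, max_passes=20):
    """Re-run all calculations until no value changes, or give up.

    `calculations` maps a field name to a function of the current values.
    Returns the stable values and the number of passes that were needed.
    """
    values = dict(values)  # work on a copy
    for passes in range(1, max_passes + 1):
        changed = False
        for name, compute in calculations.items():
            new_value = compute(values)
            if values.get(name) != new_value:
                values[name] = new_value
                changed = True
        if not changed:
            return values, passes
    raise RuntimeError(f"no convergence after {max_passes} passes")


# A circular pair that does converge: total reads tax, tax reads total.
calculations = {
    "total": lambda v: v["net"] + v["tax"],
    "tax": lambda v: v["total"] // 10,
}
stable, passes = recalculate_until_stable(
    {"net": 100, "tax": 0, "total": 0}, calculations
)
print(stable, passes)  # {'net': 100, 'tax': 11, 'total': 111} 4
```

A cycle such as `a = b + 1`, `b = a + 1` never stabilizes, so the pass limit is what turns a non-converging cycle into a reportable error rather than an endless loop.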

ebruchez commented 9 years ago

See also forum thread.

ebruchez commented 9 years ago

See also private discussion. Here the user wants to have a button whose visibility (or readonly-ness) depends on whether the form is valid.

ebruchez commented 8 years ago

One thing to consider too: functions which depend on the environment. Right now, e.g., a calculate with say current-dateTime() or calling a Java function is not updated "continually". Should something be done in such cases, like running such functions?

ebruchez commented 8 years ago

+1 from user to allow cycles between recalculation and revalidation.

ebruchez commented 7 years ago

2017-02-09: Did some quick thinking again with @avernet about how we could maybe do this not statically but dynamically, and handle dynamic dependencies. Not sure whether we reached a significant conclusion.

ebruchez commented 5 years ago

In Form Runner, for the form proper, we have more specific constraints:

On the other hand:

This said, under the stricter assumptions of bind and variables, determining dependencies should be much easier and not require path maps.

ebruchez commented 5 years ago

Separately, we recently had questions about having readonliness depend on validity (here and here) and readonliness depending on readonliness (here).

We could expand the evaluation order we have currently in the model with calculations to include MIPs and the use of MIP functions, especially as recalculate/revalidate is now (#1773) seen as a single operation.

So with this, a "node" can have:

In addition, a node's value, which can be calculated, must be determined before its validity.

This yields a larger graph, which allows us to determine a more complete execution order for everything in the model.

The UI should tell the user if there is a circular dependency.
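To report a circular dependency to the user, the larger graph has to be searched for a cycle, and the cycle itself is the useful message. The sketch below assumes graph nodes are `(bind name, property)` pairs, with the rule that a node's validity depends on its value. The function `find_cycle` and the bind names are hypothetical, and this is a plain depth-first search rather than anything taken from the Orbeon code base.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of nodes, or None if there is none.

    `depends_on` maps a node to the nodes it must be evaluated after.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in depends_on.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge: the cycle is the path from nxt to here.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(depends_on):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


graph = {
    ("a", "valid"): [("a", "value")],     # validity comes after the value
    ("b", "readonly"): [("a", "valid")],  # readonly depends on validity
    ("a", "value"): [("b", "readonly")],  # calculate reads b's readonly MIP
}
cycle = find_cycle(graph)
if cycle:
    print(" -> ".join(f"{bind}.{prop}" for bind, prop in cycle))
    # a.valid -> a.value -> b.readonly -> a.valid
```

Removing the last edge makes the graph acyclic, in which case `find_cycle` returns `None` and a complete evaluation order exists.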

ebruchez commented 5 years ago

There is still the question of whether this should/could handle repetitions as described in this comment.

ebruchez commented 5 years ago

Specific question right now: would creating a dependency graph using variables which encompasses not only calculations but MIPs be a realistic and useful thing to do as a first step?

ebruchez commented 5 years ago

We also would like to handle section templates better, including, eventually, having a single dependency graph for the form including section templates.

But, after discussion with @avernet, implementing the above (dependency graph which would include MIPs) would still be valuable as a first step.

avernet commented 5 years ago

+1 from customer to allow the readonlyness of a control to depend on the validity of another control

ebruchez commented 4 years ago

+1 from customer to depend on relevance.

ebruchez commented 4 years ago

+1 from customer

ebruchez commented 3 years ago

Working on general form processing performance, we might want to look at a first step where we optimize the evaluation of control bindings and values, ignoring, for now, optimizations in the model. The idea is this:

Open questions:

ebruchez commented 3 years ago

BindingUpdater does some tricky things, including:

ebruchez commented 3 years ago

Quick experiment with large form:

ebruchez commented 3 years ago

When we set or refresh bindings, we don't only "set the binding". We also:

The relevance can come from:

Also:

This means that our graph should be modified:

ebruchez commented 3 years ago

A single value change in the data can cause:

In addition, of course:

ebruchez commented 3 years ago

Our approach now is to create a unified graph for all models and controls (in a given part analysis). Once we have the graph, we can decide how to use it, including whether to use it in the models or the view only; whether to support inter-model dependencies; etc.

The graph itself, once created, is immutable.

(Note that in Form Builder, the part analysis for the form being edited at this time doesn't need the graph as it doesn't perform recalculations, evaluation of MIPs, etc. At least, not "in the large". We can see later whether some optimizations are possible there as well.)

The graph has the following requirements:

We keep the model's rebuild/recalculate-revalidate cycle. There can be multiple such cycles before a UI refresh.

As we do now, we keep track of changes (as paths) that pertain to UI updates between refreshes.

ebruchez commented 3 years ago

Within models, the algorithm will be as follows:

Within the view:

ebruchez commented 3 years ago

Still struggling with representing the graph, but the idea now is to have different types of nodes:

Reminder:

ebruchez commented 3 years ago

Extension attributes on controls (which can be AVTs):

We only need to consider as properties in the dependency system those that are AVTs.
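As a small illustration of that filter, the check below decides whether an attribute value is an AVT at all. It assumes XSLT-style AVT syntax, where `{` opens an embedded expression and `{{` is an escaped literal brace. That escaping rule is an assumption here rather than something stated in this thread, and the function name `is_avt` is hypothetical.

```python
def is_avt(value):
    """Return True if the attribute value contains an embedded expression.

    Assumes XSLT-style AVTs: "{expr}" is an expression, "{{" is a literal "{".
    """
    i = 0
    while i < len(value):
        if value[i] == "{":
            if value[i + 1:i + 2] == "{":
                i += 2  # escaped brace, skip both characters
                continue
            return True
        i += 1
    return False


print(is_avt("width: {$w}px"))  # True
print(is_avt("plain-value"))    # False
print(is_avt("{{literal}}"))    # False
```

Attributes for which this returns `False` are constant, so they would never need a node in the dependency system.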

ebruchez commented 3 years ago

We need to represent the inheritance of required and readonly.

The answer we need (for now) is "given a change to this value, which controls must update their relevant or readonly property?"

(In a first step the model will just work as usual, without leveraging the dependency graph. So it will store relevant values the good old way.)

It might be that an intermediary bind doesn't have known dependencies:

<xf:bind ref="instance()" relevant="@foo = 42" id="fr-form-bind">
    <xf:bind ref=".//bar/baz" id="foo-baz-bind"/>
</xf:bind>

<xf:input bind="foo-baz-bind" id="control-via-bind"/>

<xf:input ref="instance()//bar/baz" id="control-via-ref-1"/>

<xf:input ref="instance()/bar/baz" id="control-via-ref-2"/>

foo-baz-bind and control-via-ref-1 will have unknown bind and MIP dependencies, so they will work.

control-via-ref-2 will have known binding dependencies. How do we make sure that we can re-evaluate its MIPs? It seems that it might not be possible without handling the notion of a descendant axis? Do we care or is this too far-fetched to handle for now?

With the current system, we refresh all controls during a refresh. With the proposed new system, we'd want to try to avoid that.

We could detect inheritance from two sources:

In this case though, control-via-ref-2 has no enclosing control, so we just can't know that it needs to refresh its MIPs.

The answer might just be to require avoiding this kind of scenario for now. Could we detect them?

ebruchez commented 3 years ago

After discussion, the case of control-via-ref-2 can be handled by looking at paths and subpaths. That's the correct way to handle this for controls bound via ref. For controls bound via bind, we can do it the same way (inferring the path from the Node.Bind) or take a shortcut by looking at the hierarchy of binds (if we make the assumption that bind nesting mirrors data nesting, which is good practice but not mandatory).
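A minimal sketch of the paths-and-subpaths idea, assuming every binding has already been reduced to a simple child-axis path from the instance root (so no `//` and no predicates). Paths are tuples of element names, the empty tuple stands for the root that carries the `relevant` MIP in the example above, and `other-control` is a hypothetical extra control added for contrast.

```python
def is_ancestor_or_self(ancestor, path):
    """True if `ancestor` is a prefix of `path`, step by step."""
    return path[:len(ancestor)] == ancestor


def controls_inheriting(mip_path, control_paths):
    """Ids of controls bound at or below the node carrying an inherited MIP."""
    return sorted(
        control_id
        for control_id, path in control_paths.items()
        if is_ancestor_or_self(mip_path, path)
    )


control_paths = {
    "control-via-ref-2": ("bar", "baz"),  # ref="instance()/bar/baz"
    "other-control": ("qux",),            # hypothetical sibling control
}

# A relevant MIP on the instance root affects everything below it.
print(controls_inheriting((), control_paths))
# ['control-via-ref-2', 'other-control']

# A MIP on /bar only reaches controls bound under bar.
print(controls_inheriting(("bar",), control_paths))
# ['control-via-ref-2']
```

Comparing tuples of steps rather than raw strings avoids false matches such as `bar` being treated as an ancestor of `barn`.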

We also need to look at the nesting within the view.

ebruchez commented 3 years ago

xf:switch/xf:case and xf:toggle need some special handling. There are 4 cases:

fr:section: