HotDrink / hotdrink

JavaScript MVVM library with support for multi-way dependencies and generic, rich UI behaviors.

http://hotdrink.github.io/hotdrink/

Multi-view validators #24

Open gfoust opened 12 years ago

gfoust commented 12 years ago

Special considerations arise when dealing with validators that make use of multiple values. A simple example would be two text inputs which are used to enter numbers, the first of which must be larger than the second. To clarify discussion I'll call them A and B.

As an aside here, I'll note that there are a couple of other options for dealing with this problem. For example, a constraint could be used so that when one of these was edited the other was automatically updated to enforce the relationship. Another option would be to use a precondition, which would allow the user to enter numbers which did not preserve the relationship, but would flag an error and prevent execution of commands. Validators would be used in the case when we did not want to allow values which did not preserve the relationship into the model at all.

I'll suggest three possibilities for validating such a relationship:

1) Each input is assigned its own validator, which compares the value coming from the view with the other input's value as it is stored in the model. This is easy to implement, but introduces problems when one value fails to validate.
   - If A fails to validate, the view of A is now out of sync with A in the model. When we then go to validate B, the validator compares it with the value in the model, but the user sees the value in the view.
   - If A fails to validate, and B is then edited and does validate, A needs to be re-validated because there is a possibility it is now valid. However, A will not be validated again until it is re-edited.
2) A single validator is used which takes the values of both views, validates them, and then either sets both or rejects both.
   - The validator will always compare the values as they are shown in the views, matching the user's expectations.
   - If A fails to validate and B is then edited, the validator will set both A and B if they uphold the relationship.
   - Updating one updates the other, meaning their priorities are linked. This is undesirable because priority then no longer reflects the order of editing.
   - Specification becomes tricky; we need to specify that the validation function requires the value from another view, which means we need some reference to the view and its read function; these are not readily available. We also need to specify the order in which the values are passed to the function.
   - It also gets tricky if multiple views are bound to the same variable. Say A is bound to two inputs, and someone edits B. Which view of A do we validate against?
3) We use a binding variable and specify the validation as a constraint.
   - As above, the validator always compares the values displayed in the views.
   - As above, either both are accepted or both are rejected.
   - Since view values are always inserted into a variable, priority values always reflect the order of editing.
   - Specification becomes easy, since we do not need to specify the validation separately; it has already been defined as a constraint.
   - Still works if multiple views are bound to a single variable: they can be bound to the same binding variable. Any edit would automatically be reflected in all views, but would still only make it into the model if it was valid.
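To make the difference between options 1 and 3 concrete, here is a minimal sketch in plain JavaScript. It does not use any HotDrink API; `perInputEdit`, `bindingEdit`, and the object shapes are hypothetical names chosen only for illustration, and the rule checked is "A must be larger than B".

```javascript
// Option 1 (illustrative): a per-input validator compares the edited value
// with the other value as it is stored in the model.
function perInputEdit(model, name, value) {
  const a = name === "a" ? value : model.a;
  const b = name === "b" ? value : model.b;
  if (!(a > b)) return false;   // rejected: the view now disagrees with the model
  model[name] = value;
  return true;
}

// Option 3 (illustrative): every edit is accepted into a binding variable,
// and a single constraint method moves both values into the model or neither.
function bindingEdit(state, name, value) {
  state.binding[name] = value;  // always succeeds
  const { a, b } = state.binding;
  if (!(a > b)) return false;
  state.model.a = a;
  state.model.b = b;
  return true;
}

// The user types A = 3 (invalid against B = 5), then B = 1.
const m1 = { a: 10, b: 5 };
perInputEdit(m1, "a", 3);       // rejected
perInputEdit(m1, "b", 1);       // accepted, but compared against the stale a = 10
// m1 is { a: 10, b: 1 } while the views show A = 3, B = 1.

const s = { binding: { a: 10, b: 5 }, model: { a: 10, b: 5 } };
bindingEdit(s, "a", 3);         // rejected, but 3 is remembered in the binding variable
bindingEdit(s, "b", 1);         // accepted: the model becomes { a: 3, b: 1 }
```

In the per-input version the model keeps A = 10 even though the pair shown in the views (3, 1) is valid, which is the re-validation problem described under option 1.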

jaakkojarvi commented 12 years ago

On Jun 13, 2012, at 12:35 AM, Gabriel Foust wrote:

> Each input is assigned its own validator which compares the value coming from the view with the other's value as it is stored in the model. This is easy to implement, but introduces problems when one value fails to validate.
>
> If A fails to validate, now the view of A is out of sync with A in the model. Now when we go to validate B, the validator is comparing it with the value in the model, but the user sees the value in the view.
>
> If A fails to validate, then B is edited and does validate, A needs to be re-validated because there is a possibility it is now valid. However A will not be validated again until it is re-edited.

Agreed. The above choice is problematic in many ways.

> A single validator is used which takes the value of both views, validates them, and then either sets both or rejects both.
>
> The validator will always compare the values as they are shown in the views, matching the user's expectations.
>
> If A fails to validate, then B is edited, the validator will set both A and B if they uphold the relationship.
>
> Updating one updates the other - meaning their priorities are linked. This is undesirable because now priority no longer reflects the order of editing.

Is it undesirable? (I am not sure how priorities should behave in these kinds of cases.)

> Specification becomes tricky; we need to specify that the validation function requires the value from another view, which means we need some reference to the view and its read function; these are not readily available. We also need to specify the order in which the values are passed to the function.

Order of which values?

> It also gets tricky if multiple views are bound to the same variable. Say A is bound to two inputs, and someone edits B. Which view of A do we validate against?

Maybe instead we should think this way: if a validation function needs to validate two (or more) variables simultaneously, then there must be a single view object for those two (or more) variables. The view object may then contain/hold references to several actual concrete widgets, but nevertheless, there is a single view object bound to those two (or more) variables.

In any case, we should not think that view == widget. A view can be any programmatic entity that communicates with the viewmodel according to the agreed-upon protocol.

> We use a binding variable and specify the validation as a constraint.

Not sure what this means exactly. Do you mean that there is an "always succeeds validation" from each view to a new variable in the view model, and then validation becomes part of the viewmodel as one of its constraints?

If so, then how does one signal failed validation?

> As above, the validator always compares the values displayed in the views.
>
> As above, and either both are accepted or both are rejected.
>
> Since view values are always inserted into a variable, priority values always reflect the order of editing.
>
> Specification becomes easy, since we do not need to specify the validation; it has already been defined as a constraint.
>
> Still works if multiple views are bound to a single variable -- they can be bound to the same binding variable. Any edits would automatically be reflected in all views, but would still only make it in to the model if it was valid.

gfoust commented 12 years ago

> Is it undesirable? (I am not sure how priorities should behave in these kinds of cases.)

We had some discussion of this on issue 14. The conclusion I reached (and no one has disagreed so far) is that priorities should reflect the order of editing, not the order in which values are actually inserted into the model. So: edit → touch, ¬edit → ¬touch
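As a small illustration of the "edit → touch" rule, the sketch below models priorities as a plain array with the most recently edited variable first. The `touch` helper and the array representation are assumptions made for this example, not how HotDrink stores priorities.

```javascript
// Move the named variable to the front of the priority list (most recent first).
function touch(priorities, name) {
  return [name, ...priorities.filter((n) => n !== name)];
}

// The user edits B; a two-view validator then writes both A and B.
const before = ["a", "b"];
const afterEdit = touch(before, "b");           // ["b", "a"]: only the edited variable moves
const afterLinkedWrite = touch(afterEdit, "a"); // ["a", "b"]: touching A as well hides that B was edited last
```

The second call shows the linked-priority problem: once the validator's write to A counts as a touch, the list no longer records the order of editing.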

> Order of which values?

I just meant that if the validator takes multiple values (i.e. multiple parameters) then we need to specify the views which those values are coming from and the order in which they are passed (i.e. parameter order). Small detail.

> Maybe instead we should think this way: if a validation function needs to validate two (or more) variables simultaneously, then there must be a single view object for those two (or more) variables.

Yes, that would work; but it seems to introduce a lot of extra work -- in order to validate we have to create a whole new view.

> > We use a binding variable and specify the validation as a constraint.
>
> Not sure what this means exactly. Do you mean that there is an "always succeeds validation" from each view to a new variable in the view model, and then validation becomes part of the viewmodel as one of its constraints?

Yes, that is what I meant.

> If so, then how does one signal failed validation?

The validation method would take the "always succeeds" variable as input and would output to two variables: the validated variable and the associated error variable. The error variables may be bound to some other part of the UI to display the message.
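A rough sketch of such a validation method, written as a plain function rather than against any HotDrink API. The name `validateRange`, the shape of the arguments, and the error text are all hypothetical; the rule is again "A must be larger than B".

```javascript
// Input: the binding ("always succeeds") values plus the previously validated
// values. Output: new values for the validated variables and the error variable.
function validateRange(binding, previous) {
  if (binding.a > binding.b) {
    return { a: binding.a, b: binding.b, error: "" };
  }
  // Validation failed: keep the old validated values and report the problem.
  return { a: previous.a, b: previous.b, error: "A must be larger than B" };
}

const validated = { a: 10, b: 5 };
const ok = validateRange({ a: 7, b: 2 }, validated);   // accepted, error is ""
const bad = validateRange({ a: 3, b: 5 }, validated);  // rejected, validated values kept
```

The `error` field plays the role of the error variable, which could be bound to a message area in the UI.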

gfoust commented 12 years ago

Here's another scenario to consider: In the hotel example we had three inputs—start date, end date, and number of nights—connected with a constraint. One of the feedback remarks we got was that we were not validating because you could enter a negative value for number of nights. So let's consider how we might validate this.

The first thing to note here is that it's not enough to simply put a validator on number of nights, because it's possible that the negative value was not entered directly but came from invalid start and end dates. We could also place validators on start and end date (or use a multi-value validator as described above) so that start date must come before end date, but then we would lose some of the helpful behavior of the constraint.

To see how the validators would interfere with the constraint, consider a scenario where the form initially has start date (6/13/2012), end date (6/15/2012), and nights (2). Now say I want to set nights (3) and start date (7/1/2012). After making both of those edits, start date comes after end date and so will fail to validate. But if only we would let it into the model, the constraint would go into effect fixing the problem. We could get really complicated and let the validator know about the priority order so it could decide whether the constraint will fix the problem or not, but I think a much better solution is to do some sort of validation after the constraint has been enforced.
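The scenario can be traced with a small stand-in for the constraint. This is only a sketch: the `enforce` function, the "recompute the least recently edited variable" rule, and the assumed initial priority order are simplifications for illustration, not HotDrink's solver. Dates are UTC timestamps to keep the day arithmetic exact.

```javascript
const DAY = 24 * 60 * 60 * 1000;
const utc = (year, month, day) => Date.UTC(year, month - 1, day);

// Recompute the least recently edited variable from the other two.
function enforce(values, priorities) {
  const target = priorities[priorities.length - 1];
  const v = { ...values };
  if (target === "end") v.end = v.start + v.nights * DAY;
  else if (target === "start") v.start = v.end - v.nights * DAY;
  else v.nights = Math.round((v.end - v.start) / DAY);
  return v;
}

let values = { start: utc(2012, 6, 13), end: utc(2012, 6, 15), nights: 2 };
let priorities = ["start", "nights", "end"]; // assumed initial order, most recent first

function edit(name, value) {
  values = { ...values, [name]: value };
  priorities = [name, ...priorities.filter((n) => n !== name)];
}

edit("nights", 3);
values = enforce(values, priorities);        // end becomes 6/16/2012

edit("start", utc(2012, 7, 1));
// A "start must come before end" validator would reject the edit here:
const rejectedByValidator = values.start > values.end;

values = enforce(values, priorities);        // end becomes 7/4/2012, resolving the conflict
```

Letting the value in and then enforcing the constraint produces a consistent stay of three nights starting 7/1/2012, which is the behavior a pre-model validator would have blocked.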

One way to do this would be with a precondition. A precondition does not interfere with the working of the model, it simply notes after-the-fact that there was a problem. So a precondition would allow nights to be negative but would flag it with an error.

But suppose this hotel example is part of a much larger travel budget form. Suppose the number of nights is multiplied by some nightly rate to get a total cost, and that total cost is used in several other calculations in the form. If a negative number of nights is entered we get a negative total cost, which just makes the rest of the form nonsense. It seems desirable to me to be able to specify that if number of nights is negative then it should not be used in calculations for the rest of the form.

In other words, I would like to be able to validate number of nights within the model - after it has been calculated by the start-date/end-date/number-of-nights constraint, but before it is used by any other methods. If validation is simply a constraint between two variables then this is easy to do.

jaakkojarvi commented 12 years ago

On Jun 13, 2012, at 4:57 PM, Gabriel Foust wrote:

> > Maybe instead we should think this way: if a validation function needs to validate two (or more) variables simultaneously, then there must be a single view object for those two (or more) variables.
>
> Yes, that would work; but it seems to introduce a lot of extra work -- in order to validate we have to create a whole new view.

It would be more work, but maybe it is worthwhile to think about what this work would be. Are there commonalities, some generic behavior, structures, etc. to tease out?

> > > We use a binding variable and specify the validation as a constraint.
> >
> > Not sure what this means exactly. Do you mean that there is an "always succeeds validation" from each view to a new variable in the view model, and then validation becomes part of the viewmodel as one of its constraints?
>
> Yes, that is what I meant.
>
> > If so, then how does one signal failed validation?
>
> The validation method would take the "always succeeds" variable as input and would output to two variables: the validated variable and the associated error variable. The error variables may be bound to some other part of the UI to display the message.

I think the issue then becomes that there will be garbage values in the model, and the rest of the model will need to deal with those.

gfoust commented 12 years ago

> It would be more work, but maybe it is worthwhile to think about what this work would be. Are there commonalities, some generic behavior, structures, etc. to tease out?

OK, sure. So, I'll give it a go:

The tricky thing about binding to two variables is that most of the code we've written assumes there's only one value being worked with; for example, the read operation is supposed to take a view and return a value; what do we do if our view holds two values? So we'd probably have to make two separate read operations -- one for each variable -- and then subscribe them to editing events on both widgets. That means the validator function would be executed twice: once for each read. And it would still mean that when one was edited (successfully) the priority of both would be updated.

Another option (suggested by John) would be to create a third variable which holds an object constructed from the other two values (by means of a constraint). Now we can just bind to that third variable and read/write both values at the same time. Admittedly, this is not technically binding to two variables, but I think it's a much neater solution and avoids many of the issues generated by the other solution. (We still have the issue of just having a single priority for all variables.)

I guess perhaps this could be generalized into a generic composite view type which took multiple views (with their associated read/write operations) and bound to a variable which held an object. The read operation for the composite type would perform a read on the views and assemble the results into an object. The write operation would take the fields of the object and perform a write on the corresponding views.
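A minimal sketch of what such a composite view might look like, assuming each view is simply an object with `read()` and `write(value)` functions. That interface, `compositeView`, and the `fakeInput` stand-in for a widget are all assumptions for illustration and are not taken from the HotDrink code.

```javascript
// Build one view out of several named views.
function compositeView(views) {
  return {
    // Read every sub-view and assemble the results into one object.
    read() {
      const result = {};
      for (const [field, view] of Object.entries(views)) {
        result[field] = view.read();
      }
      return result;
    },
    // Write each field of the object to the corresponding sub-view.
    write(obj) {
      for (const [field, view] of Object.entries(views)) {
        if (field in obj) view.write(obj[field]);
      }
    },
  };
}

// Stand-in for a numeric text input.
function fakeInput(initial) {
  let text = String(initial);
  return {
    read: () => Number(text),
    write: (value) => { text = String(value); },
  };
}

const range = compositeView({ a: fakeInput(10), b: fakeInput(5) });
const first = range.read();      // { a: 10, b: 5 }
range.write({ a: 7, b: 2 });
const second = range.read();     // { a: 7, b: 2 }
```

A validator attached to this composite view would receive both values in a single read, at the cost already noted: the object-valued variable carries one priority for both fields.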

gfoust commented 12 years ago

> I think the issue then becomes that there will be garbage values in the model, and the rest of the model will need to deal with those.

OK... So, my impression is that you feel these extra variables sort of clutter up the model and get in the way. I'd like to consider that for a moment.

In terms of efficiency, I would think these extra variables and constraints are adding a minimal amount of extra work for the system. There are a few extra methods to be considered when solving, but the code in the methods would need to be executed at some point anyway. And the extra work results in improved functionality.

In terms of programmer design, I can definitely see how the extra variables are undesirable: the programmer should be able to focus on just the data model and completely ignore any details about the user interface. But we can support that through modular design. It's perfectly possible to create the core model on its own, then come back later and add any variables necessary for validation. For example, you could write a function that takes the core model as a parameter and adds to it; or perhaps even have the model be a class and then make a subclass that has the binding variables added. Even in ADAM (if we ever go back to that) we can add language support for modular design so that the underlying data model can be defined on its own.

The only other aspect I can think of is debugging. It's possible that debugging the model might be worse (since there's a little more work going on), but I don't think it would be much worse. (And we really don't have any support for debugging a model anyway.)

Are there other ways in which having extra variables is going to interfere with the model?

jaakkojarvi commented 12 years ago

On Jun 14, 2012, at 1:39 AM, Gabriel Foust wrote:

> > I think the issue then becomes that there will be garbage values in the model, and the rest of the model will need to deal with those.
>
> OK... So, my impression is that you feel these extra variables sort of clutter up the model and get in the way. I'd like to consider that for a moment.
>
> In terms of efficiency, I would think these extra variables and constraints are adding a minimal amount of extra work for the system. There are a few extra methods to be considered when solving, but the code in the methods would need to be executed at some point anyway. And the extra work results in improved functionality.
>
> In terms of programmer design, I can definitely see how the extra variables are undesirable: the programmer should be able to focus on just the data model and completely ignore any details about the user interface. But we can support that through modular design. It's perfectly possible to create the core model on its own, then come back later and add any variables necessary for validation. For example, you could write a function that takes the core model as a parameter and adds to it; or perhaps even have the model be a class and then make a subclass that has the binding variables added. Even in ADAM (if we ever go back to that) we can add language support for modular design so that the underlying data model can be defined on its own.
>
> The only other aspect I can think of is debugging. It's possible that debugging the model might be worse (since there's a little more work going on), but I don't think it would be much worse. (And we really don't have any support for debugging a model anyway.)
>
> Are there other ways in which having extra variables is going to interfere with the model?

Actually, I don't really consider any of the above concerns to be real concerns :) My point is simply this: if a validator is a method in some constraint of the constraint system, then the validator is evaluated as part of the system's evaluation phase. If the validator fails, what happens to the rest of the evaluation phase? Does the evaluation proceed or is it stopped somehow? In the former case, all methods need to be able to deal with values that failed validation (this is what I meant by garbage values). In the latter case, we need to add some new mechanisms to the evaluation phase of the constraint system.

In general, I'm not opposed to putting validators into the view model as a matter of principle, but I don't think we should rush into it. If we do want to put validators into the view model, the above question(s) need to be resolved somehow. And once we have done that, validators in the viewmodel might not look so clean anymore; it might be cleaner to keep them out. We don't know yet.

jaakkojarvi commented 12 years ago

I see where you are going with this. Indeed, one may easily end up duplicating the code of the methods in the code of the validators, if a validator needs to predict what the model would produce for some variable if the view value currently being edited were fed into the model.

Recapitulating the two options of dealing with erroneous values:

1) let bad values into the model, and flag some bad values with preconditions
2) do not let bad values into the model at all

Both have their uses and should thus be supported.

I think 1) is non-problematic and we already support it. 2) is non-problematic in the very simple cases (validator depends on only one variable), but problematic in more complex situations (multi-variable, dependency on values that the model computes).

On Jun 13, 2012, at 5:20 PM, Gabriel Foust wrote:

> In other words, I would like to be able to validate number of nights within the model - after it has been calculated by the start-date/end-date/number-of-nights constraint, but before it is used by any other methods. If validation is simply a constraint between two variables then this is easy to do.

I don't yet have a clear picture of how this solves the complex cases of 2). Do you have a more concrete idea of the semantics of the evaluator if the constraint system has these "validator methods"?

One possibility for dealing with 2) might be to utilize undo (or model copying):

1) feed a value to be validated into the model
2) copy the model to a new instance
3) eval the new instance
4) eval preconditions
5) if any "hard" precondition fails, give an error message and revert to the original model
6) otherwise continue with the new model
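The copy-and-evaluate idea could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposal for the real evaluator: the model is a flat object, `evaluate` stands in for running the constraint system, preconditions are `{ check, message }` pairs, and the value is fed into the copy rather than into the original so that "revert" is simply keeping the original object.

```javascript
// Try an edit on a copy of the model and keep it only if every hard
// precondition holds after evaluation.
function tryEdit(model, name, value, evaluate, hardPreconditions) {
  // Feed the value into a copy, leaving the original untouched.
  const candidate = { ...model, [name]: value };
  // Evaluate the copy.
  const evaluated = evaluate(candidate);
  // Evaluate the preconditions against the evaluated copy.
  const errors = hardPreconditions
    .filter((p) => !p.check(evaluated))
    .map((p) => p.message);
  // Revert (return the original) on failure, otherwise continue with the copy.
  return errors.length > 0
    ? { model, errors }
    : { model: evaluated, errors: [] };
}

// Hypothetical one-method "constraint system" and one hard precondition.
const evaluate = (m) => ({ ...m, total: m.nights * m.rate });
const hard = [{ check: (m) => m.nights > 0, message: "nights must be positive" }];

const original = { nights: 2, rate: 100, total: 200 };
const rejected = tryEdit(original, "nights", -1, evaluate, hard); // original model kept
const accepted = tryEdit(original, "nights", 3, evaluate, hard);  // total becomes 300
```

A shallow copy is enough here only because the example model is flat; a real model with nested state would need a deeper copy or an undo log.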

thejohnfreeman commented 12 years ago

After reading this discussion, I can see a few different times at which validation is desired:

1) Before the value enters the model: The validator checks for violations of the most basic assumptions, i.e. the assumptions common to every method, subscriber, or other observer. Example: converting a value from view type to model type (such a conversion might be expensive and could fail).
2) Before the value is used by a method: The validator checks a method's precondition.

   Back when Michael was with us, we discussed propagating these preconditions up until they could be turned into validators outside the model. That may not be a good idea because such a validator may be impossible to determine or impossible to satisfy.

   Further, it may not be a good idea to put the validator "in the model" by attaching it to a variable because some method preconditions may not be needed in every evaluation. This can happen if a method does not always use a particular input or if the method is absent in the solution.

   If we want to support this protection, it might be best to just attach a predicate that guards the method.

   This case comes up often when discussing validation of multiple variables or computed variables.

   Example: precondition for a method.
3) Before the value is used by a subscriber: Example: precondition for a command.

Perhaps we should support all of these uses, and they may require different facilities.
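One way to picture "a predicate that guards the method" is a small wrapper like the one below. The `guarded` helper is hypothetical, and so is its failure behavior: the comment above does not say what should happen when the guard fails, so this sketch simply keeps the previous output.

```javascript
// Wrap a method so it only runs when its precondition holds.
// On failure the previous output is kept (an assumption made for this sketch).
function guarded(predicate, method) {
  return (inputs, previousOutput) =>
    predicate(inputs) ? method(inputs) : previousOutput;
}

const totalCost = guarded(
  ({ nights }) => nights > 0,           // the method's precondition
  ({ nights, rate }) => nights * rate   // the method itself
);

const fresh = totalCost({ nights: 2, rate: 100 }, 0);    // precondition holds: 200
const kept = totalCost({ nights: -1, rate: 100 }, 200);  // precondition fails: previous 200 kept
```

Because the predicate travels with the method, it is only checked when that method is actually part of the solution, which is the concern raised about attaching validators to variables.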

gfoust commented 12 years ago

> I don't yet have a clear picture how this solves the complex cases of 2) Do you have a more concrete idea of the semantics of the evaluator if the constraint system has these "validator methods".

This does not actually address 2). The scenario I'm describing is this: We have our three variables—start-date, end-date, and number-of-nights—just as in the previous example. However, we have a second number-of-nights variable—number-of-nights-validated. We have a constraint which copies the value of number-of-nights to number-of-nights-validated, but only if number-of-nights is greater than zero; otherwise it just leaves number-of-nights-validated alone. This way start-date, end-date, and number-of-nights continue to work in the same intuitive fashion as before; however, if we come up with a negative value for number-of-nights, that value will not be propagated throughout the rest of the form.
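In code, the scenario might look like the sketch below. It is a hand-rolled illustration, not HotDrink syntax: `update`, the state object, and the `rate`/`total` fields are invented here to show where the validated copy sits between the date constraint and the rest of the form.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const utc = (year, month, day) => Date.UTC(year, month - 1, day);

// One evaluation pass: nights is always recomputed from the dates, but only
// a positive value is copied into nightsValidated, which feeds the total.
function update(state, start, end) {
  const nights = Math.round((end - start) / DAY_MS);
  const nightsValidated = nights > 0 ? nights : state.nightsValidated;
  return {
    rate: state.rate,
    nights,
    nightsValidated,
    total: nightsValidated * state.rate,
  };
}

const initial = { rate: 100, nights: 2, nightsValidated: 2, total: 200 };
const good = update(initial, utc(2012, 6, 13), utc(2012, 6, 16)); // 3 nights, total 300
const bad = update(good, utc(2012, 6, 15), utc(2012, 6, 13));     // nights is -2, total stays 300
```

The unvalidated `nights` still shows the negative value to the user, while everything downstream reads `nightsValidated`.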

This is similar to, but not quite the same as, the concept of breaking a variable down to an input variable and output variable with a validation constraint in between. The difference is that I am not assuming one of these is used for input only and the other is used for output only; they are two separate variables, each with its own purpose, and it's up to the programmer to make sure he uses the correct one.

The point I was making earlier was that if validation was really just a helper function that created a method in the graph, we could use the same mechanism for this type of validation as for the binding validation.

thejohnfreeman commented 12 years ago

> they are two separate variables, each with its own purpose, and it's up to the programmer to make sure he uses the correct one.

This will lead to problems. The validator is in a constraint that writes the validated variable. Any other constraint that writes the unvalidated variable must not read the validated variable, or else a cycle appears. Thus, any other constraint that reads the validated variable must never write either variable, or else a cycle appears. I don't think that will be generally useful.

In general, I don't want to place a burden on the programmer to figure out which methods should use the unvalidated variable and which should use the validated variable.

jaakkojarvi commented 12 years ago

Regarding John's item #2 "Before the value is used by a method: The validator checks a method's precondition.":

Ways to deal with this:

1) Avoid it. This means propagating preconditions "up" to the boundary of the view and viewmodel, and hence guaranteeing that the method's preconditions will always hold.

As John said, this may not always be possible, or it may be too restrictive, or too complex (all flows that can write to the inputs of the method have to be considered)

Or maybe it is possible, but it may require duplicating a lot of the computation that the viewmodel does in the validator.

2) Loosen the preconditions, so that the method can deal with the "invalid" values. In practice, this would probably mean

2.1) throwing an exception - then the evaluation phase is aborted and probably not in a good state

2.2) making outputs of the method have some error value ("error" or blank or something like that) - then
   all methods reading from those outputs will have to be prepared to accept such error values and 
   propagate them accordingly

3) Some kind of transactions

A validator "lets values in" to the view model and the model computes; if some method's (checked) preconditions fail, the error is reported to the validator of the variable being edited and the transaction is rejected. If there are no errors, the transaction is committed and all (changed) variables get their new values.

I haven't thought through the transaction approach very thoroughly, but maybe it is worth thinking about a bit more.
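For comparison, option 2.2 (error values as method outputs) could be sketched like this. The `ERROR` sentinel, the `propagate` wrapper, and the example methods are hypothetical; the point is only that every method downstream of a failed check has to pass the error along.

```javascript
// A distinguished error value that methods pass along instead of computing.
const ERROR = Symbol("error");

// Wrap a method so that any ERROR input yields an ERROR output.
const propagate = (fn) => (...args) =>
  args.includes(ERROR) ? ERROR : fn(...args);

// A method with a precondition: it produces ERROR instead of a bad value.
const checkNights = (nights) => (nights > 0 ? nights : ERROR);

const totalCost = propagate((nights, rate) => nights * rate);
const budgetLeft = propagate((total, budget) => budget - total);

const fine = budgetLeft(totalCost(checkNights(2), 100), 500);    // 300
const broken = budgetLeft(totalCost(checkNights(-1), 100), 500); // ERROR
```

With a wrapper like this the propagation is mechanical, but views bound to the affected variables would still need some way to display the error value.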