Comcast / rulio

Rulio
Apache License 2.0
336 stars 59 forks source link

Default binding syntax: a path to optional fields #68

Closed decibelcooper closed 3 years ago

decibelcooper commented 3 years ago

My team often comes across the scenario where we receive an event, or even a fact to pattern match against, that has optional, omitempty fields. In these cases, we are forced to bind the parent map and dive into it in a javascript code block. This often becomes very cumbersome and greatly increases the rule configuration complexity.

We would like to propose a syntax for bindings in patterns for specifying default values to fall back to in the event that the field is missing, and we seek feedback from @jsccast. This is work that we can carry out, provided that you agree with this kind of approach, and you do not see a major technical barrier to implementing this.

Default binding syntax

The idea here is to introduce another special character that signals the start of a default JSON value: |.

example:

{
    "pattern": {
        "data": {
            "code": "?code|\"ABC\"",
            "amount": "?amount|0"
        }
    }
}

In the above example, if code is omitted, the binding will be a string with the value "ABC". If amount is omitted, the binding will be a number with the value 0.

Default value propagation

Additionally, we propose that if data is omitted in example 2, the pattern will still match with the default bindings for code and amount, since they both have defaults. If either code or amount did not have a default, then a missing data field would result in no match.

jsccast commented 3 years ago

Re default binding syntax

Sounds excellent.

Some background: Comcast/sheens uses very similar match code, which supports optional variables, which are denoted with an extra leading ? (as in ??maybe). This code was originally copied from Rulio's match.go. Test cases serve as some documentation.

What would you think about using = instead of | to convey more of a default value rather than an alternative? Obviously not a big deal.

Re default value propagation

I'm a little worried about possible ambiguity or at least user confusion there. I wonder if we'd want to introduce yet more magic syntax to indicate that a literal could somehow be optional. I guess ¿data would be too weird! Thoughts?

decibelcooper commented 3 years ago

What would you think about using = instead of | to convey more of a default value rather than an alternative? Obviously not a big deal.

Yes, I agree that = could convey a more appropriate meaning. This is what I will plan on.

I'm a little worried about possible ambiguity or at least user confusion there. I wonder if we'd want to introduce yet more magic syntax to indicate that a literal could somehow be optional. I guess ¿data would be too weird! Thoughts?

First off, I'll say that I don't believe there is ambiguity. Certainly though, I am wary of user confusion as well. We had also thought about explicit defaults for maps in various scenarios, but in contrast to the syntax you linked in the Comcast/sheens repo, we were thinking along the lines of "assigning defaults leads to optionality", with the thought that it should be known a-priori (at the time of writing the rule) what bindings will be available downstream. My point is that if any field is optional without defining a default, the only possibility is to either not bind it at all, or choose some type-specific defaults for everything.

To your point (I think), while bearing in mind the "defaults for optionality" approach, we had also tossed around something like the following...

{
    "pattern": {
        "data={\"code\":\"ABC\",\"amount\":0}": {
            "code": "?code",
            "amount=0": "?amount"
        }
    }
}

The above syntax would suggest that if the data field is missing, then it should be assigned a particular value before pattern matching. The same goes for the amount field. This approach could be a more powerful and more explicit way of defining "defaults for optionality". E.g., it allows the user to specify that if the data field exists, the code field is required. However, the combination of redundant (and therefore less maintainable) default information and the added complexity (in my opinion) makes this slightly less desirable to me.

All that said, the core of what we want is just to have a more convenient approach to accommodating optional fields. We would be happy to go in whatever direction you think is best.

jsccast commented 3 years ago

Yeah, I agree that that redundant data approach isn't ideal. (I also don't like encoding data as a JSON string, but I suppose that's another topic.)

Warning: Mostly just rambling thoughts in this comment.

Re your original second proposal: Makes me a little nervous for the code to have to use non-local information to determine what to do. In particular, the code would have to walk the value to see if that value is eligible to be used for a match when the key (data) is missing.

Currently, the rule (for a map pattern and map message is):

If the pattern contains a literal key, then the message must also have that key, and the message's value at that key is matched against the pattern's value at that key.

If the pattern contains a variable key (e.g., ?foo), then any message key is considered as a match for that pattern entry. Note that currently a pattern can only have a single variable key in a map.

I tried to draft the amendment for your proposal, but I got stuck a little.

What's confusing me here is my desire to annotate a map key-value entry as opposed to a value. Hence my wondering about optional/default key literal like ¿data or data? or whatever. Somewhere I have notes for the pattern matching type system ...

Speaking of types, your proposal I think would amount to a sort of type inference: If the value has some property (like all map values having defaults?), then that key-value entry is itself optional. Rather than do that type inference, I was thinking of tagging the key (¿data or data?) to indicate the type and behavior of the entry. An alternative is to be explicit about the type of the value. Example: {"%type":"value","%value":{...},"%optional":true} (!). Often liberating to forget about surface syntax for a moment. Offering explicit metadata like that would also provide a mechanism that avoids JSON strings for default values ({"%type":"variable","%name":"code","%default":"ABC"}).

Perhaps what's missing is optional literals. We have optional variables, and a pattern map key can be an optional variable. It can't be an optional literal string. But now I'm just back to ¿data or data? or whatever.

Just some food for thought. I'll follow up with a more helpful comment soon, I hope.

jsccast commented 3 years ago

FWIW I tend to lean towards pragmatism, driven by users, in these situations. Maybe do a PR that includes a switch to enable your proposals? Similar to AllowPropertyVariables? AllowDefaultValues and maybe AllowDefaultProperties? If the code can remain relatively clean, then having it both ways sound reasonable on balance.

BTW, Sheens finally packaged up matching configuration. We might want to do the same here (or even use Sheen's match package directly). Just a thought for the future.

decibelcooper commented 3 years ago

@jsccast thanks for your thoughts! I think the direction you are going is very promising. I look forward to something more, but in the meantime I may prototype a little to brainstorm and also get a head start.

Edit: oops, I wrote this not realizing there was an update with a newer comment. I will make a PR!

jsccast commented 3 years ago

Prototyping sounds like just the right thing here. Thanks in advance for any experiments in that direction.