cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0

Idempotent cue/spec cue/format intention of spec, impl expectations #161

Closed cueckoo closed 3 years ago

cueckoo commented 3 years ago

Originally opened by @rudolph9 in https://github.com/cuelang/cue/issues/161

Cue Idempotence

Theoretical

Are the following interpretations of cue spec correct (in a theoretical sense)?

1) Unification of a set of cuelang expressions in the same order, one time, will yield the same result.
2) Unification of a set of cuelang expressions in any order, one time, will yield the same result.
3) Unification of a set of cuelang expressions in the same order, any number of times, will yield the same result.
4) Unification of a set of cuelang expressions in any order, any number of times, will yield the same result.

1) Disjunction of a set of cuelang expressions in the same order, one time, will yield the same result.
2) Disjunction of a set of cuelang expressions in any order, one time, will yield the same result.
3) Disjunction of a set of cuelang expressions in the same order, any number of times, will yield the same result.
4) Disjunction of a set of cuelang expressions in any order, any number of times, will yield the same result.

Actual Impl

1) Is it reasonable to expect the following: given a []byte, an Instance produced by passing that []byte to Unmarshal, and the []byte produced by calling Marshal on that Instance, the result will be identical to the original []byte?
2) What is considered an operation? Runtime has a note that states "Any operation that involves two Values or Instances should originate from the same Runtime."
3) Does this include Merge on two instances?
4) Suppose two instances were Marshaled and then Unmarshaled: does the Marshal need to be from the same Runtime in order to Unmarshal and merge them together?
5) Is it accurate to think of Merge as the Unification of the emit value (both in the explicit case and in the implicit case of all top-level values)?
6) Is it accurate to think of a Value as what the spec refers to as an expression?
7) Could you help me understand how Value and ast.Expr relate to one another? As far as I understand, ast.Expr is the position in the file where what the spec refers to as an expression is written, and Value is the evaluated result, but I feel like I'm missing something.
8) Given a set of []byte and the set of Instances produced by passing each []byte to Unmarshal, would you expect Merge and a subsequent call to Marshal always to return the same []byte:

cueckoo commented 3 years ago

Original reply by @mpvl in https://github.com/cuelang/cue/issues/161#issuecomment-544195451

Theoretical: Yes, unification and disjunction are associative, commutative and idempotent, at least w.r.t. the theoretical outcome. The order of fields in structs may differ for different evaluation orders, but the order of fields in structs is semantically irrelevant (there is a proposal to at least guarantee topological consistency). There is a bug currently, though, where certain builtin constraints break commutativity as they are not monotonic and CUE doesn't handle this yet (this is certainly possible, though).
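To make the three properties concrete, here is a minimal Go sketch. It is a toy model, not CUE's implementation: it assumes a value can be approximated by the finite set of integers that satisfy it, with unification modeled as set intersection and disjunction as set union. The names `Set`, `NewSet`, `Unify`, `Disjoin` and `Equal` are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// Set is a toy stand-in for a value: the finite set of integers that
// satisfy a constraint. The empty set plays the role of bottom.
// This is an assumption made for illustration, not how CUE stores values.
type Set map[int]bool

// NewSet builds a Set from the given elements.
func NewSet(xs ...int) Set {
	s := Set{}
	for _, x := range xs {
		s[x] = true
	}
	return s
}

// Unify models unification as set intersection.
func Unify(a, b Set) Set {
	out := Set{}
	for x := range a {
		if b[x] {
			out[x] = true
		}
	}
	return out
}

// Disjoin models disjunction as set union.
func Disjoin(a, b Set) Set {
	out := Set{}
	for x := range a {
		out[x] = true
	}
	for x := range b {
		out[x] = true
	}
	return out
}

// Equal reports whether two sets hold exactly the same elements.
func Equal(a, b Set) bool {
	if len(a) != len(b) {
		return false
	}
	for x := range a {
		if !b[x] {
			return false
		}
	}
	return true
}

func main() {
	a, b, c := NewSet(1, 2, 3, 4), NewSet(3, 4, 5), NewSet(4, 5, 6)
	ops := []struct {
		name string
		f    func(Set, Set) Set
	}{
		{"unify", Unify},
		{"disjoin", Disjoin},
	}
	for _, op := range ops {
		// Order of operands does not matter.
		fmt.Println(op.name, "commutative:", Equal(op.f(a, b), op.f(b, a)))
		// Grouping of operands does not matter.
		fmt.Println(op.name, "associative:", Equal(op.f(op.f(a, b), c), op.f(a, op.f(b, c))))
		// Applying a value to itself changes nothing.
		fmt.Println(op.name, "idempotent:", Equal(op.f(a, a), a))
	}
}
```

Together the three properties mean that the result depends only on which operands are combined, not on their order, grouping, or repetition, which is what the eight statements in the question ask about.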

Actual Impl:

1) No, not in general. I assume that with []byte you mean a CUE program. There are all kinds of reasons why this may not be the same on a round trip. For instance, CUE is not required to preserve the order of fields (although implementing a best-effort topological sort would be nice).
2) Anything. So even when merging two Instances they should originate from the same Runtime. This restriction is an implementation detail and may be lifted.
3) Yes.
4) No, Instances do not have to originate from the same Runtime for them to be Unmarshaled by a single one.
5) Sort of. Merge is like Unify, but with templates pre-expanded. IOW, it unifies the fully evaluated values (semantically; the implementation is more lazy).
6) A reasonable description. I would describe a Value as the value associated with a path in the configuration tree. It is true, though, that Value gives access to both the evaluated value and the raw expression.
7) ast.Expr is the result of a parse tree. It represents syntax. Value represents semantics: the value at the position in the configuration tree, which is the result of partial or full evaluation. Note that this is often not true for ast.Expr. The expressions related to a particular point in the configuration tree may be scattered across multiple locations in the source and may be the result of a more general constraint being applied (e.g. foo <Name>: constraint). Evaluation computes the expression for a Value. The consumer of Value may analyze this as is or request the completion of the evaluation.
8) Not sure under which conditions you mean that, but generally no. Given the exact same bytes in the same order, using the same binary, yes, although even then no guarantee of that can be given. I'm assuming that the bytes you are referring to here are the binary encodings of a marshaled CUE representation, which (currently) are CUE programs with additional information. The data format is currently proprietary and may change over time.

Even CUE programs themselves may change. Even with an order guarantee for fields, CUE will still have to invent variable names for variables that become shadowed as a result of computation, and the algorithm for that may change over time. But the semantics of the data, at least, should not change.
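The distinction between byte-identical output and semantically identical data can be sketched in Go. This example does not use the CUE API: it assumes JSON (via the standard `encoding/json` package) as a stand-in for a serialized configuration, and `SemanticallyEqual` is a hypothetical helper written for this illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"reflect"
)

// SemanticallyEqual reports whether two JSON documents decode to the same
// value, ignoring field order and whitespace. It is a hypothetical helper,
// not part of any CUE package.
func SemanticallyEqual(a, b []byte) (bool, error) {
	var va, vb interface{}
	if err := json.Unmarshal(a, &va); err != nil {
		return false, err
	}
	if err := json.Unmarshal(b, &vb); err != nil {
		return false, err
	}
	return reflect.DeepEqual(va, vb), nil
}

func main() {
	// Same data, different field order.
	x := []byte(`{"name": "svc", "port": 8080}`)
	y := []byte(`{"port": 8080, "name": "svc"}`)

	same, err := SemanticallyEqual(x, y)
	if err != nil {
		panic(err)
	}
	fmt.Println("byte-identical:", bytes.Equal(x, y))
	fmt.Println("semantically equal:", same)
}
```

A round-trip test written this way compares decoded values rather than raw bytes, so it keeps passing if field order or generated names change between versions.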

cueckoo commented 3 years ago

Original reply by @rudolph9 in https://github.com/cuelang/cue/issues/161#issuecomment-544460092

> There is a bug currently, though, where certain builtin constraints break commutativity as they are not monotonic and CUE doesn't handle this yet (this is certainly possible, though).

Is there a ticket for this bug? Or could you provide me with an example of when this bug occurs?

cueckoo commented 3 years ago

Original reply by @mpvl in https://github.com/cuelang/cue/issues/161#issuecomment-560166561

Addressed by #78, for instance.

Closing, as there seems to be no action left associated with this.