cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0
4.94k stars 279 forks source link

Concrete disjunction doesn't unify #1442

Open odipar opened 2 years ago

odipar commented 2 years ago

What version of CUE are you using (cue version)? cue version v0.4.1-beta.6 darwin/amd64 (same with playground: v0.4.0)

Does this issue reproduce with the latest release? Not sure

What did you do? cue eval the following

// A and B are assumed to be 'semantically' equivalent
A: { x: number, y: (1 | 2) + x }
B: { x: number, y: 1 + x } | { x: number, y: 2 + x }

// A valid combination for x and y that unifies with A and B
C: { x: 1, y: 3 }

// Disjunction (1 | 2) isn't resolved (while x and y are concrete)?
D: A & C

// Disjunction gets resolved
E: B & C

// Disjunction isn't resolved
F: A & B

// Concrete disjunction in y keeps 'blocking' y to become concrete 
H: F & C

What did you get?

// A and B are assumed to be 'semantically' equivalent
A: {
    x: number
    y: (1 | 2) + x
}
B: {
    x: number
    y: 1 + x
} | {
    x: number
    y: 2 + x
}

// A valid combination for x and y that unifies with A and B
C: {
    x: 1
    y: 3
}

// Disjunction (1 | 2) isn't resolved (while x and y are concrete)?
D: {
    x: 1
    y: (1 | 2)+x & 3
}

// Disjunction gets resolved
E: {
    x: 1
    y: 3
}

// Disjunction isn't resolved
F: {
    x: number
    y: (1 | 2)+x & 1+x
} | {
    x: number
    y: (1 | 2)+x & 2+x
}

// Concrete disjunction in y keeps 'blocking' y to become concrete 
H: {
    x: 1
    y: (1 | 2)+x & 3 & 2+x
}

What did you expect to see?

I would expect the concrete disjunction y to unify to a concrete value

mpvl commented 2 years ago
// A and B are assumed to be 'semantically' equivalent
A: { x: number, y: (1 | 2) + x }
B: { x: number, y: 1 + x } | { x: number, y: 2 + x }

These are not semantically equivalent. Firstly, CUE does not do structural unification (like some logical programming languages), so an expression like (1 | 2) + x is inherently not concrete, as (1|2) will never resolve. Nor does CUE distribute any operators except & over disjunctions. So (1|2)+x does not get rewritten to 1+x|2+x.

Furthermore, whenever CUE sees an expression like (x + 1) & 3, it can assume the value must be 3, while treating x+1 as a constraint. These constraints are punted until later evaluation. This evaluation strategy allows cycles to be broken as in

a: b + 1
b: a - 1

This is why E resolves fine: only one of the disjuncts resolves correctly after evaluation and the other one gets eliminated.

For D, however, the expression CUE encounters is (1 | 2)+x & 3, meaning it punts (1|2)+x for later evaluation. This will later evaluate to (1|2) + 1. However, as said, as CUE does not do structural unification, and does not distribute mathematical operators over disjunctions, this will not get rewritten to 2 | 3.

For H it is really the same story, just a bit more elaborate.

What CUE probably should do here is complain that (1 | 2) + x is an unsolvable equation. In this particular case it could even be a compile error. But even if it dynamically resolves to the above expression, it should really treat this as a "permanent" error and not something that could still become complete later.

In other words, it is working as intended, but CUE could be more aggressive in 1) detecting that it is an unsolvable situation. 2) even more aggressively detect at compile time (or early evaluation) such situations that arise from merely concrete values values.

Note that in general, CUE tries to stay away as much as possible from fancy rewrites as possible as this makes the language considerably more complicated.

odipar commented 2 years ago

Thank you for looking into this issue. I'm learning more and more about CUE's semantics day by day.

And no fancy rewrites: I'm totally fine with that.

Indeed, some more aggressive reporting would keep everybody honest. Some additional documentation about which operators distribute over other operators (or not) would be welcome (only & distributes over other operators it seems).

mpvl commented 2 years ago

@odipar it is documented in the spec, but that is neither to be considered light reading nor necessarily the best way to learn a language.

We do plan to write something like "Effective CUE", akin to "Effective Go" where such things are expressed in a more accessible format.

myitcv commented 2 years ago

Linking #1013 for the "Effective CUE" reference.