cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0
5.03k stars 287 forks source link

cue: Value.Expr loses default information #1710

Open myitcv opened 2 years ago

myitcv commented 2 years ago

What version of CUE are you using (cue version)?

$ cue version
v0.4.4-0.20220507115855-1ed6195cd8d0

Does this issue reproduce with the latest release?

Yes

What did you do?

#tidy
go mod tidy

# run
go run .
cmp stdout stdout.golden

-- go.mod --
module mod.com

go 1.18

require cuelang.org/go v0.4.4-0.20220507115855-1ed6195cd8d0

require (
    github.com/cockroachdb/apd/v2 v2.0.1 // indirect
    github.com/google/uuid v1.2.0 // indirect
    github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de // indirect
    github.com/pkg/errors v0.8.1 // indirect
    golang.org/x/net v0.0.0-20220425223048-2871e0cb64e4 // indirect
    golang.org/x/text v0.3.7 // indirect
    gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b // indirect
)
-- main.go --
package main

import (
    "fmt"

    "cuelang.org/go/cue"
    "cuelang.org/go/cue/cuecontext"
)

func main() {
    ctx := cuecontext.New()
    v := ctx.CompileString(`x: *3 | int`)
    x := v.LookupPath(cue.ParsePath("x"))
    fmt.Printf("x: %v\n", x)
    var i interface{}
    if err := x.Decode(&i); err != nil {
        panic(err)
    }
    fmt.Printf("i: %v\n", i)
    op, args := x.Expr()
    fmt.Printf("noop: %v, args: %v\n", op == cue.NoOp, args)
}
-- stdout.golden --
x: *3 | int
i: 3
noop: false, args: [3 int]

What did you expect to see?

A passing test (although the exact format of the output is not the point of this issue).

What did you see instead?

#tidy (0.025s)
# run (0.259s)
> go run .
[stdout]
x: *3 | int
i: 3
noop: true, args: [int]

> cmp stdout stdout.golden
--- stdout
+++ stdout.golden
@@ -1,3 +1,3 @@
 x: *3 | int
 i: 3
-noop: true, args: [int]
+noop: false, args: [3 int]

FAIL: /tmp/testscript2139768976/repro.txt/script.txt:6: stdout and stdout.golden differ

It doesn't seem right that the expression returned by cue.Value.Expr() cannot explain the concrete value we decode from the same cue.Value.

sdboyer commented 2 years ago

i had thought it was a bug initially, too. But from comments like this one, i drew the conclusion that it's more of a suboptimal compromise taken on the way to a better, eventual goal. (And the way defaults are peeled off here suggested to me it's not an accident, at least?)

I did come to the conclusion that this is actually very helpful behavior - tbqh, i've no idea how i would tease defaults out from non-default values without it. But it certainly was surprising, and i very much hope that when it's replaced with something better, i can still do what i need to do.

Figuring all this out is one of the things that led me to write an expression tree printer. i plan to turn that into a standalone lib at some point soon, because no joke, after writing against the CUE Go API for more than a year, it's the thing that finally allowed me to work confidently with complex expressions encoded within cue.Value.

sdboyer commented 2 years ago

Oh, since this issue is expressing "wat" re: Value.Expr() behavior wrt defaults - i'll also throw in a note that list defaults are 😱 😱 .

I wrote this test and very sloppy analyzer to even begin to understand the varying behaviors (when is a list's default itself, or not itself? when is it considered open? when is it bottom-Kind()-ed?), and even then, i couldn't actually intuit my way around the combinations until i grouped the inputs by their corresponding properties.

i note this here because i had no idea what to actually write down in an issue apart from "omg list exprs/defaults make my head hurt, what even makes sense?", which seems thematically in line with this issue

myitcv commented 2 years ago

To my mind, it feels like a lossy and serious continuity problem if I can't see from an Expr() result how the value I extracted via Decode() was derived (the test case in the description).

But it certainly was surprising, and i very much hope that when it's replaced with something better, i can still do what i need to do.

Please can you clarify with an example exactly what it is you are depending on, behaviour-wise?

Figuring all this out is one of the things that led me to write an expression tree printer

Feels like this is worth raising a feature request, with a link to your code as a proof of concept/prototype?

I wrote this test and very sloppy analyzer to even begin to understand the varying behaviors

I think it would be worth extracting that test to some plain CUE code which we can annotate with comments and analyse the behaviour of via cmd/cue. I think you are actually running into some of the challenges associated with lists and structs and default values, but we should also confirm you're not running into default bugs.

sdboyer commented 2 years ago

To my mind, it feels like a lossy and serious continuity problem if I can't see from an Expr() result how the value I extracted via Decode() was derived (the test case in the description).

It's possible to figure it out, but not from Expr() alone. Call(s) to Default() are also required. But i certainly agree it's surprising behavior.

Please can you clarify with an example exactly what it is you are depending on, behaviour-wise?

Apologies, that was handwavy. The particular use case i had to cover was in cuetsy where references to values marked for export as a Typescript enum (via attribute) had to be handled differently depending on whether disjuncts introduced were members of the referenced disjunct. So:

A: "foo" | "bar" @cuetsy(kind="enum") // declares an enum for export
B: {
  Afield: A | *"bar" // allowed - specifies a default that's in the enum. 
  Abroken: A | *"baz" // not allowed, because "baz" isn't part of A's declaration of the enum
  Aor: A | "baz" // allowed right now, "baz" just becomes a string literal in a union type with the exported enum
} @cuetsy(kind="interface")

This test illustrates output for similar cases that follow the rules (my test harness is still pretty crap so i'm not covering error cases well yet)

I haven't finally and firmly landed on what the right approach is here, but clearly, any decision i make is predicated on being able to make precise distinctions between where in the expression tree a value comes from, whether it's a default, and whether that node in the tree is on the other side of a reference or not. That fine control - distinguishing between "baz" and *"baz" and which field those literals are declared in - is what i want to see preserved.

I probably have some other cases i've already forgotten about, too, but i think this one case illustrates the need well enough.

Feels like this is worth raising a feature request, with a link to your code as a proof of concept/prototype?

Sure! After trying many forms of this kind of dumper/debugger, this was really the first one that actually worked, in anger, to let me learn what was going on by doing. I'd love it if a wider audience could benefit from that.

I think it would be worth extracting that test to some plain CUE code which we can annotate with comments and analyse the behaviour of via cmd/cue.

👍

I think you are actually running into some of the challenges associated with lists and structs and default values, but we should also confirm you're not running into https://github.com/cue-lang/cue/issues/1317.

Definitely would be good to tease it apart. I did encounter some of ordering issues while writing that test, but i think the committed artifact was all independent of those.

myitcv commented 2 years ago
A: "foo" | "bar" @cuetsy(kind="enum") // declares an enum for export
B: {
  Afield: A | *"bar" // allowed - specifies a default that's in the enum. 
  Abroken: A | *"baz" // not allowed, because "baz" isn't part of A's declaration of the enum
  Aor: A | "baz" // allowed right now, "baz" just becomes a string literal in a union type with the exported enum
} @cuetsy(kind="interface")

Not sure I follow the example (could well be missing something).

For Afield - if you are looking to constrain by the type A, but then specify a default value that is part of the disjunction that is A, perhaps you meant to use a conjunction on the right hand side instead of a disjunction?

Using a disjunction introduces quite different semantics to Afield.

https://cuelang.org/play/?id=vuhg1OWJXtb#cue@export@cue

All the cases that are problems in that linked example result in "bad".

Apologies if I've grabbed totally the wrong end of the stick here though.

sdboyer commented 2 years ago

For Afield - if you are looking to constrain by the type A, but then specify a default value that is part of the disjunction that is A, perhaps you meant to use a conjunction on the right hand side instead of a disjunction?

Using a disjunction introduces quite different semantics to Afield.

Hah! It totally does retain the disjunction. I may have tried that long ago, then just dismissed it out of hand as a possibility because, as your playground link notes, just naively looking at eval output of ("foo" | "bar") & *"foo" manifests conjunct'd default "foo". I never thought to check whether the disjunction remains - that "bar" is still an acceptable value.

So yes, you're right, a conjunct is the semantics i actually want there. That'll certainly help both simplify implementation and allows for a cleaner expression of author intent. 🎉 🎉

I'll have to think if i have another case that clearly illustrates my need. I suspect that when i go in and update the cuetsy implementation to accommodate this helpful guidance you've provided here, i'll encounter another example.

myitcv commented 2 years ago

It totally does retain the disjunction

This is also a large part of my thinking/mindset behind why Expr() should be faithful/precise in what it returns.

just naively looking at eval output of ("foo" | "bar") & *"foo" manifests conjunct'd default "foo"

Manifestation is literally one of the hardest things that I've come across since I started working with CUE. As @mpvl will also attest, it's also one of the biggest challenges in #822.

So yes, you're right, a conjunct is the semantics i actually want there. That'll certainly help both simplify implementation and allows for a cleaner expression of author intent. 🎉 🎉

:+1:

I'll have to think if i have another case that clearly illustrates my need. I suspect that when i go in and update the cuetsy implementation to accommodate this helpful guidance you've provided here, i'll encounter another example.

Sounds great. Really happy to explore these sorts of things as GitHub discussions/similar.