embedding a definition at the top level seems to disregard closedness rules

mvdan commented 5 days ago

$ cue version
cue version v0.11.0-alpha.3.0.20241008133719-86fdd970dd65+dirty

go version devel go1.24-356ba0f065 2024-10-07 20:52:38 +0000
      -buildmode exe
       -compiler gc
  DefaultGODEBUG asynctimerchan=1,gotypesalias=0,httpservecontentkeepheaders=1,multipathtcp=0,randcrash=0,randseednop=0,tls3des=1,tlskyber=0,x509keypairleaf=0,x509negativeserial=1
     CGO_ENABLED 1
          GOARCH amd64
            GOOS linux
         GOAMD64 v3
             vcs git
    vcs.revision 86fdd970dd65df005c76875eb3ec9060a04428b5
        vcs.time 2024-10-08T13:37:19Z
    vcs.modified true
cue.lang.version v0.11.0

env CUE_EXPERIMENT=evalv3=0
! exec cue export input.cue

env CUE_EXPERIMENT=evalv3=1
! exec cue export input.cue

-- input.cue --
#Schema: {
    known: string
}
#Schema

{
    known: "foo"
    unknown: "bar"
}

I expect the testscript above to succeed - unknown should not be an allowed field given that #Schema is a definition, and so it is a closed struct. However, both the old and new evaluator unexpectedly succeed:

> env CUE_EXPERIMENT=evalv3=0
> ! exec cue export input.cue
[stdout]
{
    "known": "foo",
    "unknown": "bar"
}
FAIL: test.cue:2: unexpected command success
> env CUE_EXPERIMENT=evalv3=1
> ! exec cue export input.cue
[stdout]
{
    "known": "foo",
    "unknown": "bar"
}
FAIL: test.cue:5: unexpected command success

This is definitely wrong, particularly since placing the values under a field foo makes it work again:

#Schema: {
    known: string
}
foo: #Schema

foo: {
    known: "foo"
    unknown: "bar"
}

It is worth noting that the original example also unexpectedly succeeds when using a list:

#Schema: {
    known: string
}
[...#Schema]

[{
    known: "foo"
    unknown: "bar"
}]

cuematthew commented 5 days ago

I think this is working as intended.

In the struct case, you are embedding two structs (one being the #Schema definition) into the file top-level struct/value. This works fine. If you add some explicit brackets, it becomes clearer why it passes:

{
  #Schema: {
    known: string
  }
  #Schema

  known: "foo"
  unknown: "bar"
}

As normal, when embedding a definition, yes, it closes the enclosing struct, but it doesn't reject any other fields that are statically there - that's an important difference between embedding a definition, and unifying with a definition.

With the list case, I'm less sure. I think the list case is equivalent to this:

  #Schema: {
    x: string
  }

  [string]: #Schema // same as saying "every element of the list must unify with #Schema"
  known: {x: "hi", y: 7} // but this does fail

mvdan commented 5 days ago

You're completely right about the struct case, my apologies. Let's focus on the list case, because that more closely resembles what the original user was doing, I just reduced the example a bit too much. Then our main testscript is this one:

env CUE_EXPERIMENT=evalv3=0
! exec cue export input.cue

env CUE_EXPERIMENT=evalv3=1
! exec cue export input.cue

-- input.cue --
#Schema: {
    known: string
}
[...#Schema]

[{
    known: "foo"
    unknown: "bar"
}]

Which should succeed, but fails, as both the old and new evaluator accept the input. From my reading of the spec, they should fail, but I'm not sure.

cue-lang / cue

embedding a definition at the top level seems to disregard closedness rules #3487