cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0
5.09k stars 290 forks source link

cmd/cue: get go should allow go structs which implement special interfaces to be force-imported #2466

Open uhthomas opened 1 year ago

uhthomas commented 1 year ago

What version of CUE are you using (cue version)?

❯ paru -Q cue
cue 0.5.0-1

Does this issue reproduce with the latest stable release?

Yes.

What did you do?

I want to import https://pkg.go.dev/github.com/VictoriaMetrics/operator/api@v0.0.0-20230626142506-8950675e8bb1/victoriametrics/v1beta1 for use as a schema definition with Kubernetes manifests, but I can't because of this rule:

Rules of Converting Go types to CUE

Go structs are converted to cue structs adhering to the following conventions:

       ...

        - a type that implements MarshalJSON, UnmarshalJSON, MarshalYAML, or
          UnmarshalYAML is translated to top (_) to indicate it may be any
          value. For some Go core types for which the implementation of these
          methods is known, like time.Time, the type may be more specific.

What did you expect to see?

I would like to generate the CUE anyway.

What did you see instead?

They implement UnmarshalJSON in a few places, which generates top values.

// VMClusterSpec defines the desired state of VMCluster
// +k8s:openapi-gen=true
#VMClusterSpec: _
uhthomas commented 1 year ago

I would also like a way to disable this completely. There are some instances where types implement custom json marshalers, but have the same API.

https://pkg.go.dev/github.com/cilium/cilium@v1.14.2/pkg/policy/api#EndpointSelector

mvdan commented 1 year ago

I would also like a way to disable this completely. There are some instances where types implement custom json marshalers, but have the same API.

Interesting - why does that happen, do you think?

Your patch in https://github.com/cue-lang/cue/pull/2642 seems like an okay fix to me. I had written that TODO as a reminder to circle back to this at some point - my understanding is that the heuristic should require both marshal and unmarshal methods, like your patch now does, but I want to check with Marcel first to see if there's any good reason for not doing that.

I'll get back to you on this. Apologies if it takes a week or more, since Marcel is currently in a bit of a focus mode to finish his v0.7 performance work, and I'd rather not distract him with entirely different issues.

uhthomas commented 1 year ago

@mvdan Sounds good, thank you for your help.

Custom JSON marshalling is sometimes used for validation or sanitising - it looks like the example I gave above wants to marshal the the embedded struct directly if it can, or does some simple transformation otherwise. Regardless, there is no clear reason to use top here.

https://github.com/cilium/cilium/blob/a67489466c715854c4c472c939d32d35df3fe748/pkg/policy/api/selector.go#L88

mvdan commented 1 year ago

That sort of makes sense, thanks.

One potential counter-argument to no longer treating "only marshaler" or "only unmarshaler" types as top is that perhaps the code really only needs to decode, and the shape of the encoded value has nothing to do with the Go type. For example, a Go type representing a user configuration provided as input, so it's only ever JSON unmarshaled, and never marshaled. And then with an UnmarshalJSON to transform the shape - for example if the user's input JSON is an array of key-values and not an object.

I think such a case is perhaps a bit strange, but cue get go treating it as top seems right. Or at least it seems like the safe thing to do, since not treating it as top means we are choosing the wrong shape.

mvdan commented 11 months ago

I briefly spoke to @mpvl and he confirmed my suspicion from the previous comment: that the heuristic defaults to treating a type as top if it has either the marshal or unmarshal method since it might only be needed for encoding in one way. I don't think we should change the heuristic there, because falling back to top is safer than falling back to assuming the shape is exactly the same.

That said, this should be the default for the heuristic, and we should support overriding that behavior somehow. That reminds me a bit of the discussion in https://github.com/cue-lang/cue/issues/2678, where the heuristic tries to guess if a field should be required or optional, but the user should be able to override the heuristic if they know better.

And also, this means that my TODO should be deleted either way :) I wrote it with the assumption that maybe we should change the heuristic's behavior, but per the above, we probably should not.

AntoineThebaud commented 11 months ago

I also have a case where I face this limitation in translating golang model to cue defs: Because some of my types define UnmarshalJSON & UnmarshalYAML to run some additional checks (no custom transformation here), I get a top-value generated instead of the full def I expect.

Is there any plan to provide a way to work around this default "cautious" behavior?