cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0
4.94k stars 279 forks source link

cue: Path APIs are verbose and clunky for common scenarios #3159

Open mvdan opened 2 months ago

mvdan commented 2 months ago

Background

https://cue-review.googlesource.com/c/cue/+/7441 introduced the concept of the cue.Path type, which is made up of a sequence of selectors. Notably, a selector is typed, and can be one of multiple specific implementations, like:

New APIs were introduced that use paths, like:

func (v Value) LookupPath(p Path) Value

This is in contrast to the old and deprecated APIs such as:

func (v Value) Lookup(path ...string) Value

The commit message above explains why this API was problematic:

1) Minimize the possibility of errors when converting strings to either identifiers or quoted strings. For example, what should the Go val.Lookup("3") do? Should it be equivalent to cue.Str("3"), treating it as a label string, or cue.Index(3), treating it as a list index? 2) Extendability to support CUE's planned query extensions (https://github.com/cue-lang/cue/issues/165).

For the two reasons above, I think this was a good change to make in terms of the base API. However, it did make many common use cases significantly more verbose. For example, from a recent selection in https://review.gerrithub.io/c/cue-lang/cue/+/1194951, we can see that

doc, _ := v.Lookup("info", "title").String()

became the much longer and more complex

doc, _ := v.LookupPath(cue.MakePath(cue.Str("info"), cue.Str("title"))).String()

Alternative: cue.ParsePath

One alternative is using cue.ParsePath as follows:

doc, _ := v.LookupPath(cue.ParsePath("info.title")).String()

Which is simpler and shorter, but as a parse API, it can fail on invalid syntax, and it is noticeably more complex and expensive. I think parsing paths should be used primarily when the path is an arbitrary input string, such as given via the command line or as a URL parameter, and not when the path itself is fairly static in the code.

I also think path parsing suffers from similar pitfalls mentioned before; what should cue.ParsePath("3") do?

Proposal 1: Lookup(...Selector)

In my limited experience with the "path" APIs so far, I've come to realise that in most cases, I don't care about the path value. Nearly every time I am making a path, it only exists so I can use it immediately after and throw it away. In a few times I might make a path and reuse it multiple times, in which case calling the "make" API is fine, but that's usually not the case.

In that spirit, I suggest that we add "inline path" APIs which do the cue.MakePath logic by themselves, reusing the deprecated and hidden APIs like Value.Lookup:

func (v Value) Lookup(selectors ...Selector) Value

This way, our earlier example could be rewritten as:

doc, _ := v.Lookup(cue.Str("info"), cue.Str("title")).String()

I also imagine it could avoid "making" the path entirely, instead using the selectors directly, avoiding any allocations for the path itself.

Note that this solution still keeps the new cue.Str("info") instead of the older "info", due to the problems already listed with the original API.

Note that this solution takes ...Selector rather than Selector as a parameter, so that the example above doesn't have to do two lookups in sequence:

doc, _ := v.Lookup(cue.Str("info")).Lookup(cue.Str("title")).String()

Finally, note that MakePath would remain in place, for the case where one wants to make a path and reuse it many times with e.g. Value.LookupPath, perhaps even as a global variable. ParsePath would also remain in place.

Proposal 2: LookupStr(string)

A variation of the proposal above is to provide single-selector method shortcuts for each kind of selector, or at least for the common ones. For example, v.LookupStr("foo") would be short for v.LookupPath(cue.MakePath(cue.Str("foo"))), and our example from earlier would be:

doc, _ := v.LookupStr("info").LookupStr("title").String()

I personally lean against this variant of the proposal, as it would multiply the number of methods needed for shortcuts: LookupStr, LookupDef, LookupIndex, etc.

mvdan commented 2 months ago

Note that if we were to go down the Proposal 1 route, which is my preference, we would need to delete the existing Lookup method, and probably leave it as deleted for a full release cycle, before re-adding it with a new signature and behavior. For example, we could delete the method in v0.10.0, and re-add it in v0.11.0.

Alternatively, since existing valid calls like v.Lookup("foo") cannot possibly compile with the changed signature from ...string to ...Signature, we could repurpose the same signature as a single breaking change in e.g. v0.10.0, without first having a full release cycle with the breaking change of only deleting the func. I would personally opt for this, as long as we have consensus about the new API.

myitcv commented 2 months ago

In that spirit, I suggest that we add "inline path" APIs which do the cue.MakePath logic by themselves, reusing the deprecated and hidden APIs like Value.Lookup:

~If my memory serves me, this was the original plan: to drop the deprecated cue.Value.Lookup, and then later rename cue.Value.LookupPath -> Lookup.~

Apologies: I didn't read your suggestion closely enough.

So with proposal 1 we would end up with these two methods, correct?

func (v Value) Lookup(selectors ...Selector) Value
func (v Value) LookupPath(path Path) Value
mvdan commented 2 months ago

Correct. The signature for LookupPath is fine for cases where one does want to make use of a parsed or long-lived path, but for most cases, I want to use an "inline" path that I don't want to keep or reuse, hence the Lookup signature with ...Selector.

mpvl commented 1 month ago

Paul's comment of the original intent still stands. We first need a way to remove Lookup first, before introducing it as something else. This is quite arduous, as there are probably still many uses of Lookup around. The easiest would be to do a big cleanup as part of a v2 API. An alternative, yet, is to come up with a different name. Either way, the transition is a big part of the proposal and should be addressed.

I also think path parsing suffers from similar pitfalls mentioned before; what should cue.ParsePath("3") do?

I don't think this is nearly as ambiguous. The word Parse clearly indicates it should be an integer index. The confusion with Lookup("3") arises from the fact that the method does not specify whether the label is parsed or taken as is.

, I've come to realise that in most cases, I don't care about the path value.

Paths can be return values as well, in which case having to piece them out as selectors is tedious.

Personally, I think it would be even nicer to have a Lookup function that allows both, so Lookup(path ...X), where X can be a Path or Selector, to be applied consecutively. That way you don't have to jump through hoops if you have a field within a path to look up, which is not uncommon. We could even allow strings and ints, possibly, to make it even more convenient, if we can make it clear that this always signifies a regular field or integer index.

To avoid solving the transition issue now, the signature could be func Lookup(v Value, path ...any) Value, func (v Value) Get(path ...any) Value, or func (v Value) LookupAll(path ...any) Value or something like that.

The body would be:

func Lookup(v Value, path ...any) Value {
    for _, x := range path {
        if !v.Exists() {
            return v // or generate an error if needed.
        }
        switch x := x.(type) {
        case int:
            v = v.LookupPath(MakePath(Index(x)))
        case string:
            v = v.LookupPath(MakePath(Str(x)))
        case Selector:
            v = v.LookupPath(MakePath(x))
        case Path:
            v = v.LookupPath(x)
        }
    }
    return v
}

(In practice this would use the lower-level lookup methods within the loop, of course.)

This allows neat things, like combing parsed and non-parsed component:

type Config struct {
    root cue.Value
    offset cue.Path
}

func (c *Config) selectUser(field string) Value {
    Lookup(c.root, c.offset, field)

    // or, if we do not allow strings: 
    Lookup(c.root, c.offset, cue.Str(field))
}

Note that this variant of the proposal would allow deprecating LookupPath as well.

mvdan commented 1 month ago

I'm fine with whatever API choice we make as long as the common scenarios become simple again. I did not originally suggest a func taking parameters like ...any or sum types because I assumed that would cause the same pitfalls that made us deprecate the original Lookup method.

I agree that the first step here is likely to fully phase out the existing Lookup method first, which I describe in https://github.com/cue-lang/cue/issues/3159#issuecomment-2120482183. I would do this even if we don't plan to reuse this name for a new method, because the existing Lookup method has pitfalls and is really slow.

Another transition option is to swap func (Value) Lookup(...string) Value with func (Value) Lookup(...any) Value with your last suggested semantics in a single release, being very clear about it in the release notes. It is technically a breaking change in the Go API sense, but in practice, we could make it continue to work for virtually every existing function call. Or, at the very least, make ambiguous or incorrect arguments like Lookup("#foo") fail loudly so the user can switch to Lookup(cue.Def("foo")).

I would personally advocate for such a "swap" transition because, if we can pull it off without breaking most users, it would be far less painful for them than having to switch methods for every single one of their calls, even the trivial ones like v.Lookup("some_field"). Plus, it lets us keep Lookup as a nice method rather than as a global func or some other method name like LookupAll.

mpvl commented 1 month ago

I agree that the first step here is likely to fully phase out the existing Lookup method first, which I describe in

Ah. Missed that. I think that phasing out within the current API is not quite an option. I am not a fan of breaking API backward compatibility in such a hard way, even if we do not have a stability guarantee yet. We have a lot to clean up, so starting a v2 where things are done right makes more sense.

Or, at the very least, make ambiguous or incorrect arguments like Lookup("#foo") fail loudly so the user can switch to Lookup(cue.Def("foo")).

However appealing that would be, it is also weird and dangerous. Consider the following code:

func Foo(v Value, field string) Value {
    return v.Lookup(field)
}

It is legitimate to pass "#foo". But now the implementer would have to remember to either always use cue.Str or handle this case specifically.

I agree with you that the "swap" transition is nice, as I also think it will prevents breaking changes for the most part.

I think, with good documentation, we can make it clear what it means.

Also, we get to try if this is a viable API before fixing this in a v2 of the API.

Once we have this, we could also consider a go fix that wraps arguments in a cue.Str, cue.Index, if possible and if we decide to transition away from allowing this.

mvdan commented 1 month ago

I agree with you that the "swap" transition is nice, as I also think it will prevents breaking changes for the most part.

Pleasantly surprised that we agree on this :) I just worry about forcing nearly every Go API user to rewrite Lookup calls if the end result, a few releases down the line, would look almost exactly the same in most cases.

Also, we get to try if this is a viable API before fixing this in a v2 of the API.

For what it's worth, I prefer to tackle these issues as if v2 were not an option, to see what we can do in a matter of weeks and not years. v2, or even just stabilizing as a v1, is a big step that will force us to rethink core parts of the API like Context and Value. I don't think we need to hold up smaller bits of UX polish until v2 unless we absolutely must. We can certainly leave "cleanup TODOs" as part of this design though, such as for example removing LookupPath.

Once we have this, we could also consider a go fix that wraps arguments in a cue.Str, cue.Index, if possible and if we decide to transition away from allowing this.

There is no such "customizable go fix" today, but we could hack something together.