carvel-dev / ytt

YAML templating tool that works on YAML structure instead of text
https://carvel.dev/ytt
Apache License 2.0

Add a -o jsonpath option and `jsonpath` library module. #672

Open GrahamDumpleton opened 2 years ago

GrahamDumpleton commented 2 years ago

Describe the problem/challenge you have

The usual use case for ytt is to generate a set of YAML documents. If working in the Kubernetes space this would often be fed into kubectl or kapp.

During development and testing, one sometimes wants to extract a subset of resource objects from the generated YAML documents, or to generate summary details of what was generated. For example, only pass through the Kubernetes Secret resources, or show a list of the resource types, names, and the namespaces they would be created in.
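As an illustration of the two queries mentioned above, here is a minimal Python sketch that operates on documents already decoded into plain dictionaries. The sample resources and the helper names `only_kind` and `summarize` are hypothetical and are not part of ytt; this only shows the kind of result being asked for.

```python
# Hypothetical sample of decoded YAML documents (not real ytt output).
docs = [
    {"apiVersion": "v1", "kind": "Secret",
     "metadata": {"name": "db-creds", "namespace": "apps"}},
    {"apiVersion": "v1", "kind": "ConfigMap",
     "metadata": {"name": "settings", "namespace": "apps"}},
    {"apiVersion": "v1", "kind": "Namespace",
     "metadata": {"name": "apps"}},
]


def only_kind(documents, kind):
    """Pass through only the documents of the given kind."""
    return [d for d in documents if d.get("kind") == kind]


def summarize(documents):
    """Return (kind, name, namespace) per document; namespace may be None."""
    return [
        (d.get("kind"), d["metadata"]["name"], d["metadata"].get("namespace"))
        for d in documents
    ]


secrets = only_kind(docs, "Secret")
summary = summarize(docs)
```

Cluster-scoped resources such as the Namespace above carry no namespace, so the summary uses `.get()` and reports `None` for that field rather than failing.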

One can do this by piping the output of ytt into a separate tool such as yq, or into jq if using the -o json option, but it would be convenient if there was an inbuilt -o jsonpath option which behaved like the same option in kubectl. This would especially be of benefit to Kubernetes developers and admins who are familiar with the kubectl -o jsonpath behaviour, and they wouldn't need to reach for a different tool like yq or jq for many tasks.

Describe the solution you'd like

Make use of the Kubernetes go-client implementation for jsonpath as documented in:

Now when we talk about kubectl there are two cases to consider. The first is that you query a single Kubernetes resource object. In this case you get back just that resource on which the jsonpath query is applied.

When you don't specify a resource by name and give just the type, you will get back a pseudo resource type List of form:

{
    "apiVersion": "v1",
    "items": [],
    "kind": "List",
    "metadata": {
        "resourceVersion": ""
    }
}

This will be used even if there was only one actual resource in the result.

It is proposed that for ytt the -o jsonpath option should always return a dictionary which follows this form. This could just be:

{
    "items": []
}

but the other boilerplate, such as apiVersion and the List kind, could be retained, with the apiVersion being a Carvel-specific one, in the same way that imgpkg and kapp use pseudo resources that have no equivalent in Kubernetes itself.

By returning a dictionary object all the time, even when there is only one resource, there is consistency: users who know there is only one object can use .items[0], and otherwise they process all items in the list. If someone had to add special checks for a single object versus multiple objects, it would be too confusing and hard to write the jsonpath queries/transformations.
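A minimal Python sketch of that "always a list" shape, assuming the documents are already decoded into dictionaries. The function name `wrap_as_list` and the sample resources are hypothetical; no apiVersion is included because the proposal leaves that choice open.

```python
def wrap_as_list(documents):
    """Always wrap results in a dictionary with an "items" list."""
    return {"items": list(documents)}


single = wrap_as_list([{"kind": "Secret", "metadata": {"name": "db-creds"}}])
several = wrap_as_list([
    {"kind": "Secret", "metadata": {"name": "db-creds"}},
    {"kind": "ConfigMap", "metadata": {"name": "settings"}},
])
empty = wrap_as_list([])

# One known object: index directly, the equivalent of .items[0].
first_name = single["items"][0]["metadata"]["name"]

# Any number of objects: iterate, the equivalent of .items[*].
all_names = [item["metadata"]["name"] for item in several["items"]]
```

Because the wrapper has the same shape for zero, one, or many documents, the same query works in every case without checking the result type first.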

At the same time as adding the -o jsonpath option I would expect to also see a jsonpath module added to ytt that would make available the same jsonpath library support for use in Starlark code of YAML templates, Starlark code files etc.

The only thing I don't know though is whether the jsonpath implementation used by the go-client is actually a full implementation of jsonpath or is a cut down version.

The hope is to have a more complete implementation of a jsonpath library, such as the one described in:

which implements:

including arithmetic and binary comparison operators.

It doesn't need the separate parse step that library has, as that could be dealt with internally. The idea is that it would return Starlark objects for literals, lists, dictionaries, etc., as appropriate for the expression being evaluated, and not just a concatenated string result like kubectl -o jsonpath.

In effect, what I would expect to be able to do is:

objects = library.get(data.values.resource.type).with_data_values(data.values, plain=True).eval()
result = jsonpath.eval(objects, ".items[*].metadata.name")

which would return a list object type in this case.
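To make "returns native objects rather than a concatenated string" concrete, here is a small Python sketch of an evaluator for a tiny path subset (`.key`, `[N]`, and `[*]` on lists). It is an assumption-laden illustration only: `eval_path` is a hypothetical name, it is not the go-client implementation, and it covers nowhere near the full JSONPath syntax.

```python
import re

# Matches one step of the supported subset: .key, [*], or [N].
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_-]*)|\[(\*)\]|\[(\d+)\]")


def eval_path(obj, path):
    """Evaluate a tiny JSONPath-like subset and return native objects.

    Returns a list if the path contains a wildcard, otherwise the single
    selected value. The wildcard is only meant for lists.
    """
    matches = [obj]
    fanned_out = False
    pos = 0
    while pos < len(path):
        m = _STEP.match(path, pos)
        if m is None:
            raise ValueError("unsupported path syntax at offset %d" % pos)
        key, star, index = m.groups()
        if key is not None:
            # .key: descend into a dictionary for every current match
            matches = [node[key] for node in matches]
        elif star is not None:
            # [*]: fan out over every element of every matched list
            matches = [child for node in matches for child in node]
            fanned_out = True
        else:
            # [N]: pick one element from every matched list
            matches = [node[int(index)] for node in matches]
        pos = m.end()
    return matches if fanned_out else matches[0]


objects = {"items": [
    {"metadata": {"name": "db-creds"}},
    {"metadata": {"name": "settings"}},
]}
names = eval_path(objects, ".items[*].metadata.name")
first = eval_path(objects, ".items[0].metadata.name")
```

Here `names` is a list of strings and `first` is a single string, so the caller can keep working with the values directly instead of parsing text output.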

It is actually a jsonpath library module, rather than -o jsonpath, that I am really after. I can see benefits in being able to use jsonpath syntax to perform queries and transformations against a set of objects obtained from YAML/JSON decoding of data.read(), or from the result of executing eval() on a library object. Some things could be done much more easily with jsonpath than with hand-crafted Starlark code, and it could perhaps also be useful directly in YAML templates to construct something on the fly from data.values. I just don't know whether the go-client implementation could be used for this purpose as a Starlark module, that is, whether it can do everything the original JSONPath spec defines and return results as native objects of the appropriate type.

If I had to choose, I would therefore prefer a full-featured jsonpath library module over -o jsonpath. I thought that by also suggesting -o jsonpath it might be seen as a more viable option, given that it would align better with what is possible with kubectl, thus killing two birds with one stone.

Anything else you would like to add:

If a jsonpath library module is possible, it would need to treat a YAML document set like a list, a YAML fragment and a struct as a dictionary etc. In other words, it should be pretty transparent and not require converting stuff to native objects first.

(Also discussed in the #carvel Slack channel)


Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible" 👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help working on this issue.

GrahamDumpleton commented 2 years ago

BTW, I could be wrongly assuming that jsonpath can do more than it can. It is the transformations one can do in jq and yq to mutate structures into different shapes that I am ultimately interested in and think would be useful. It is quite possible jsonpath can't do that.

That said, some jsonpath libraries have means to pass in functions to apply to data selected by the jsonpath expression, eg.,

So a library module with that sort of flexibility may make what I have in mind possible.

pivotaljohn commented 2 years ago

Whew! There's a lot in here. Thanks for the detailed and thoughtful write-up, @GrahamDumpleton.

We hear two requests with overlapping functionality:

  1. Provide a command-line flag, -o jsonpath="" that reshapes the output from ytt to a projection described by a JSON Path expression.
    1. further, please return the ultimate result as a Kubernetes List resource.
  2. Provide a "ytt Standard Library" module that can transform a Starlark List of objects into a shape described by a JSON Path expression.

First off: did we get that right?

If we did, there are two categories of concern, here:

Elaborating on each point...

In-Charter for ytt?

As you noted, for the first request (command-line flag) this would be a complete subset of the functionality provided by yq/jq. These tools do a pretty bang-up job for plucking and reshaping these hierarchical documents.

ytt philosophically started as a structure-based (in contrast to an expression-based) YAML processing tool. ytt templates and overlays contain intended edits within the structure. Note that the closest thing ytt currently has to a path-expression syntax is Starlark/Python dot expressions (although a JSON Path-like one can be implemented within Starlark, if so desired).

So, we grant there's a fuzzy zone between the scope of these tools. But the opportunity cost is too great to also reimplement yq when yq does fit the bill. (You're not asking to implement all of yq, we don't mean to employ hyperbole, just noting the direction such a move would be going towards).

Is there something in the environment that makes composing yq into the mix infeasible/problematic/less-than-ideal?

Simplicity & Robustness Through Orthogonality

Another design principle at play within Carvel tools, in general, is an attempt to keep the full set of a tool's features as orthogonal to each other as possible: that is, minimize the overlap in behavior of any two features and maximize their composability.

One corollary of that principle is to seek first to enhance existing mechanisms before adding a new one. So, where Starlark/Python comprehensions cover a subset of what one can accomplish with JSON Path, we would want first to explore whether some carefully crafted idiom or syntactic sugar might make it easier (and less error prone) to tackle the more complex cases (I'm thinking of a Document that contains lists — one would use a comprehension of comprehensions in that case).
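For reference, a sketch of what that comprehension-based approach looks like today, written in Python; as far as we know the same comprehension syntax is accepted by Starlark. The sample documents are hypothetical, and the JSON Path expression in the comment is only a rough equivalent.

```python
# Hypothetical decoded documents, each containing a nested list.
docs = [
    {"kind": "Pod", "metadata": {"name": "web"},
     "spec": {"containers": [{"name": "app", "image": "nginx"},
                             {"name": "sidecar", "image": "envoy"}]}},
    {"kind": "Pod", "metadata": {"name": "job"},
     "spec": {"containers": [{"name": "run", "image": "busybox"}]}},
]

# Flat result: one comprehension with two "for" clauses.
# Roughly what .items[*].spec.containers[*].image would select.
images = [c["image"] for d in docs for c in d["spec"]["containers"]]

# Grouped result: a comprehension of comprehensions, one list per document.
images_by_doc = [[c["image"] for c in d["spec"]["containers"]] for d in docs]
```

The flat form is compact but loses which document each value came from; the nested form keeps that grouping at the cost of readability, which is the kind of case where an idiom or some syntactic sugar might help.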