Add support for processing Kubernetes list from "kubectl"

kohtala commented 2 years ago

What did you do?

kubectl get -A all -o yaml > k8s.yml
go get github.com/patrickdappollonio/kubectl-slice
~/go/bin/kubectl-slice -f k8s.yml -o .

It installed v1.1.0.

What did you expect to see?

The list items be split to individual files.

What did you see instead?

One list-.yml file equal to the k8s.yml.

patrickdappollonio commented 2 years ago

Hey @kohtala appreciate the report!

This is an interesting one. Mainly because the output is not a "multi YAML" but instead a Kubernetes resource of type List. We could split it too but I think we might have to introduce either a check-before-processing or a flag to make it work.

If you have any preference on that, happy to hear it! It's not a hard or difficult feature to implement at all!

kohtala commented 2 years ago

Thanks. The List is not any kind of resource. I searched for documentation and found some in the API Concepts

Keep in mind that the Kubernetes API does not have a kind named List.

kind: List is a client-side, internal implementation detail for processing collections that might be of different kinds of object. Avoid depending on kind: List in automation or other code.

So I think it should be safe to perform this processing by default. I believe there also is never a name property. It is best if the README.md were clear it handles by default both kinds of lists.

A flag to disable the feature might become necessary, if someone finds they need to split lists out of a multi YAML. But I believe this to be rare. But I've been wrong before.

patrickdappollonio commented 2 years ago

Yes, sorry, I used the "Kubernetes object" very loosely. It is a definition/spec in the kubectl app so it's easier to word it that way, although inaccurate.

My main concern is also outlined in what you shared, especifically:

Avoid depending on kind: List in automation or other code.

What this means is that the implementation might change. It could help us with the direction though, perhaps the way to go is having something like:

kubectl-slice -f k8s.yaml --as-list -o .

Where the --as-list would process it.

My concern to not auto-add this feature is because we are parsing multi-yamls, so something like this:

file1: true
---
file2: true
---
file3: false

And the List kind does not match this input, so it generates some sort of a side consequence (as in, now you have to educate and say "btw! remember that tool that splits multi-yamls? well you can also feed in a List and it'll work too!).

Open to hear your thoughts around it! You mention at the end to have the feature opt-out (as in, process List like if it was any other multi-yaml) while I suggest it as opt-in with --as-list. Curious to know your use case!

kohtala commented 2 years ago

I understood the reference

Avoid depending on kind: List in automation or other code.

to refer to object validation and such automation. For example helm should not be used to generate kind Lists because kubectl may not be involved to validate and apply the List to cluster so there is no-one to understand List.

Looking at the issues from where the documentation came (https://github.com/kubernetes/website/issues/25944), I believe kubectl just happened to use List instead of a multi YAML for kubectl get to produce one file the kubectl apply or delete can take in. Unfortunately I found no clear documentation on the file format in the kubectl reference.

Since this program is already called kubectl-slice and it works invoked with kubectl slice, it would seem natural it supports the kubectl List without additional options just like the kubectl apply and delete do.

patrickdappollonio commented 2 years ago

Hey @kohtala,

The only reason this program starts with kubectl- is because it's a requirement to be used/installed as a plugin from krew. Otherwise, like many of my other apps, it could be called just slice 😄

(Side note, the original app was even called split, and when we worked on #1 is when the app got renamed from split to kubectl-slice, because split already exists in the Krew index)

You mention "it would seem natural it supports the kubectl List without additional options" but if we look at the original intention of the software, which is to parse multi-YAML files (files separated by --- concatenated into a single file). A list isn't one of them, it's just an object wrapper at that point.

Adding this would mean the current people using this software would now magically receive the ability to process List which they might not be expecting, and it opens a bigger window of details that I'm not sure we'll get there. For example, consider I have the following YAML:

---
first: true
---
apiVersion: v1
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
items:
- apiVersion: v1
  kind: Service
  metadata:
    name: kubernetes
    namespace: default
- apiVersion: v1
  kind: Service
  metadata:
    name: kube-dns
    namespace: kube-system

... Do we process the second file as a list and we get in return, say, 3 files instead of 2 just because there's a list inside?

This is why I think we might need more feedback on this. As before, if you had a concrete use case scenario, please let me know, it would definitely help shape the direction of kubectl-slice.

patrickdappollonio commented 2 years ago

After watching the video from the SIG APIMachinery linked from https://github.com/kubernetes/website/issues/25944 it's even more evident this seems a client-side only, specific to kubectl and not valid in other resources -- there are talks about Helm, automation software like ArgoCD and Flux, and others -- so I would still be inclined to support it, just under a flag, something like:

kubectl get pods -o yaml | kubectl slice --from-list

Where the --from-list will disable multi-YAML parsing and, instead, enable kind: List processing. That way, if they ever decide not to move forward with this type of list, we can quickly remove the code and no side consequences would happen to the app moving forward.

Still, would love to know what everyone else thinks.

cbmdfc commented 2 years ago

been keeping an eye on this discussion, and I think this should be a feature flag rather than built-in behavior... I wouldn't want the CLI to magically change how it parses resources

either that or release a new major version and announce it as kind of a breaking change but not really?

your last comment @patrickdappollonio makes sense, kubectl-slice --from-list makes lots of sense/ I would want this feature to be in because I use lists too, but the feature flag doesn't seem that bothersome

douglasmakey commented 2 years ago

I agree that this should be a feature flag rather than the default behaviour. For me, keeping the CLI behaviour works with multi-yaml is more intuitive and familiar.

kohtala commented 2 years ago

I am a stranger to this project and definitely do not feel like I would need to tell you what to do. There must be lots of uses I do not know about. As someone that just found this utility, the naming got me thinking it had more to do with kubectl than you seem to think it has. I do not know if other users would find the option as much strange.

I tried to see if kubectl passes some information to the plugin it is being invoked through it. I did not find it, but if the plugin has some way of knowing whether it was invoked through kubectl, I think that could be used to change the behavior to match kubectl apply etc.

If there is a change to behaviour, a major version is a good point.

patrickdappollonio commented 2 years ago

I think based on the previous responses, it's better to leave it off as a feature flag.

I tried to see if kubectl passes some information to the plugin it is being invoked through it. I did not find it, but if the plugin has some way of knowing whether it was invoked through kubectl, I think that could be used to change the behavior to match kubectl apply etc.

I think this wouldn't work. If your easiest way to install the app is through Krew, then you really don't need kubectl to run it, yet you must use it because of krew. At that point, if we detect if the plugin was invoked through Kubectl, chances are the answer will almost be "always" 😛

If there is a change to behaviour, a major version is a good point.

I would still be afraid of implementing it without a flag. It seems to me the flag is a better option overall and matches the other people's thoughts.

I am a stranger to this project and definitely do not feel like I would need to tell you what to do.

I appreciate the thought! Really do 😄 the reason why we're asking you for your feedback is not to tell me what to do, but instead, explain your requirement and your use case. Yes, at the end of the day it might be my decision, but I would rather leave it to the collective what would work best for everyone.

You happen to run into a behaviour which is part of the process and even a misunderstanding for Kubernetes too, I wouldn't blame you for consider it standard behaviour.

I would love to go forward, I've needed this feature myself as well, and as such, I would love to implement it, but based on everyone else's feedback, I would put it under a feature flag.

kohtala commented 2 years ago

My use case was really in the original post. I just get bunch of objects from a Kubernetes cluster, but need to have them in separate files. This time I was running trivy config on them to see if it reports configuration issues for a cluster I had running and it does not scan into a List. I could make an issue for them as well, but thought this could be a nice addition to this tool.

A feature flag is fine. If there is a change of mind later, an opposite flag can be added when the major version is changed.

What I ment with the telling you what to do was that I can't go into the solution in detail like if you should use a flag or not. I find some people make up their mind and insist on it. I do not. I try to offer the background that I have, which is not much. You have my blessing to go forward on which ever way you decide to do it based on your wider and deeper background.

Thank you.

patrickdappollonio commented 1 year ago

Hello everyone (and @kohtala)!

Sorry for the long pause. I had a recommendation from a coworker who proposed that the list-to-multi-yaml could be a subcommand rather than part of the expected behaviour of this program.

For example, consider the following:

kubectl get pods -o yaml | kubectl slice list-to-yaml -f - -o all.yaml

This would read the list and parse it as a multi-YAML file. Then you could pipe its output to normal kubectl slice as usual:

kubectl slice -f all.yaml

Or everything done in a single shot with UNIX pipes:

kubectl get pods -o yaml | kubectl slice list-to-yaml -f - --stdout | kubectl slice -f - --stdout

This way:

It's an opt-in and not a mandatory behaviour.
People who really need the feature can just get familiar with the subcommand without getting familiar with the rest of the tool
If Kubernetes ever decides to get rid of this behaviour (since it's expected not to be built on top of this) then we just remove the subcommand.

Let me know what you all think!

kohtala commented 1 year ago

Since Kubernetes API does not have a kind named List, I think a single document YAML (as opposed to multi document YAML) with this kubectl specific kind does not need this complexity.

But if it is how you feel you like it done, please do so. This does have the advantage, that there is a generic list-to-yaml tool should one be useful for any other tasks.

patrickdappollonio commented 1 year ago

True, but again, adding it to the main software part (the slicing) without a commitment from the Kubernetes org to have it be consistent seems wrong.

Making it into a separate subcommand means you can just strip off that part of the command instead and/or produce fixes for just that part, without touching the main part of the software.

cbmdfc commented 1 year ago

100% agree with you @patrickdappollonio subcommand is good

beyondcloud-co commented 4 months ago

A practical solution with a simple pipe to yq

kubectl get pods -o yaml | yq .items[] -y | kubectl-slice -o /tmp/pods

patrickdappollonio commented 4 months ago

This is accurate, but I see the value of not bringing multiple tools/libraries and keeping a pipeline slim.

All that said, looking back at the thread, I'm still unsure if we should proceed as a subcommand or as a feature flag. I have in writing that a handful of folks prefer the flag, but I can also see the value of having a standalone subcommand (which would entice people to download the app for generic purposes while at the same time serving the original goal; on the other end, I don't want to integrate a feature that it isn't "standard" into a "standard" tool).

Will think about this a bit more (and if any of you have any opinion/preference that could skew the balance, happy to hear it too!) and come back once I have a plan to move forward.

We recently added support for directory handling and that was, too, via a feature flag, and it seems well received, maybe this could vouch for a feature flag too.

patrickdappollonio / kubectl-slice