pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0

Passing generated source files to other targets? #17042

Open tgolsson opened 1 year ago

tgolsson commented 1 year ago

Hello!

I asked about this on Slack but got no response, so I'm trying here instead. I'm working on a plugin for Kustomize/Kubectl, two Kubernetes tools. For those who haven't used them before, kustomize is a tool for converting multiple yaml files into one large manifest that can be fed into kubectl, which talks to the cluster.

The way I envision this working is that you first declare a target which is the "kustomize" library. This declares what files you're including, which might be other .yaml files, but also shell scripts, config files, etc. This is consumed by kustomize to generate a big .yaml file with multiple docs. I then want to feed this to my kubectl target, which contains the "how" for how to run it - cluster, etc.

All in all, it'd be something like this:

kustomize(
    name="backend",
    root_src="kustomization.yaml",
    sources=["deployment.yaml", "service.yaml"],
)

kubectl(
    name="home-cluster",
    source="WHAT_TO_PUT_HERE",
    ...
)

Since the source is generated, I'm having a hard time convincing pants that I want to use this file as the only source in another command. I want to write something like:

kubectl(
    name="home-cluster",
    source=":backend",
    ...
)

But this causes Pants to error out because :backend isn't a file. Similarly, using the name of the generated file doesn't work either - it doesn't exist as a real source file.

Am I going about this wrongly? Does pants work differently? I've found I've hit this issue also with pex_binary (generates ${name}.pex but I have to manually convert that to use it in a docker file...). Should I just use a StringField with the expected output name and hope it doesn't change name in the future?

Edit: Another note that indicates that maybe I should be able to do this is that my generator is from KustomizeSources -> KubernetesSource. And my kubectl uses a single KubernetesSource...

tgolsson commented 1 year ago

(Also sorry for using "Bug", it was either that or "Feature request")

benjyw commented 1 year ago

You're on the right track! Your kubectl target would point to your kustomize target via its dependencies field, rather than sources.

Then in your rule, when acting on a given kubectl target you can request "all the kustomize targets in this target's transitive dependencies", grab those targets' sources, run kustomize on them in a Process, fish out the resulting merged yaml from the Process outputs, and then use that as input to your kubectl process.
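The flow described above can be modelled in plain Python to show its shape. This is only a sketch under stated assumptions: `Target`, the `kind` strings, and `transitive_closure` are hypothetical stand-ins rather than Pants types, and the step that would run kustomize in a Process is left out. A real rule would ask the Pants engine for the transitive targets instead of walking a dict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a Pants target (not the real Pants class)."""

    address: str
    kind: str  # e.g. "kustomize" or "kubectl"
    sources: tuple = ()
    dependencies: tuple = ()  # addresses of other targets


def transitive_closure(root, targets_by_address):
    """Return every target reachable from root via dependencies, excluding root."""
    seen = {}
    stack = list(root.dependencies)
    while stack:
        address = stack.pop()
        if address in seen:
            continue
        dep = targets_by_address[address]
        seen[address] = dep
        stack.extend(dep.dependencies)
    return list(seen.values())


def kustomize_sources_for(root, targets_by_address):
    """Collect the sources of all kustomize targets in root's transitive deps."""
    return sorted(
        source
        for dep in transitive_closure(root, targets_by_address)
        if dep.kind == "kustomize"
        for source in dep.sources
    )


backend = Target(
    address=":backend",
    kind="kustomize",
    sources=("kustomization.yaml", "deployment.yaml", "service.yaml"),
)
home_cluster = Target(
    address=":home-cluster",
    kind="kubectl",
    dependencies=(":backend",),  # points at the kustomize target, not at a file
)
targets = {t.address: t for t in (backend, home_cluster)}

print(kustomize_sources_for(home_cluster, targets))
```

The point of the model is that the kubectl target names another target in `dependencies` and never names a file; the file list is derived when the rule runs.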

tgolsson commented 1 year ago

@benjyw Thank you!

Ok; so I can't rely on codegen here? It felt to me like that was the natural process. If I wanted to make a python_library rule around protobuf sources, surely the python rule isn't responsible for running protoc? It just wants to use some set of python_source no matter whether they're from codegen or on disk.

benjyw commented 1 year ago

Ah, I wasn't modeling this in my brain as a codegen problem, but now that you mention it, that does make sense!

In that interpretation, you implement two completely separate things:

  1. Support for running kubectl on a manifest
  2. Support for codegenning a manifest by running kustomize

The nice thing about this is that 1. works with both codegenned manifests as well as directly written ones.

benjyw commented 1 year ago

Presumably you've seen https://www.pantsbuild.org/docs/plugins-codegen ?

I doubt you can have dep inference from the kubectl to the kustomize since there is no import statement or anything to hang it on, so you'll still need that explicit dep.

Now in your rule you'd say "give me all the kubernetes manifests sources from my transitive deps" (and those can be either direct sources or generated sources)
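As a rough illustration of "direct sources or generated sources", here is a toy model in plain Python. Everything in it is an assumption made for the example: the `Sources` class, the `GENERATORS` registry, the `fake_kustomize_build` placeholder, and the file names are hypothetical, and only the names `KustomizeSources` and `KubernetesSource` come from this thread. It does not use the Pants codegen API described in the linked docs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sources:
    """Hypothetical stand-in for a sources field; `kind` names the source type."""

    kind: str  # e.g. "KustomizeSources" or "KubernetesSource"
    files: tuple


def fake_kustomize_build(files):
    # Placeholder for running kustomize; a real generator would execute the tool.
    return ("generated/manifest.yaml",)


# Hypothetical codegen registry: input source type -> (output source type, generator).
GENERATORS = {"KustomizeSources": ("KubernetesSource", fake_kustomize_build)}


def manifests_from(deps, wanted="KubernetesSource"):
    """Return files of the wanted type from deps, hand-written or generated."""
    result = []
    for sources in deps:
        if sources.kind == wanted:
            result.extend(sources.files)  # directly written manifest
        elif sources.kind in GENERATORS:
            output_kind, generate = GENERATORS[sources.kind]
            if output_kind == wanted:
                result.extend(generate(sources.files))  # codegenned manifest
    return result


deps = [
    Sources("KubernetesSource", ("namespace.yaml",)),
    Sources("KustomizeSources", ("kustomization.yaml", "deployment.yaml")),
    Sources("ShellSources", ("deploy.sh",)),
]
print(manifests_from(deps))
```

The consumer only asks for one source type; whether a file was written by hand or produced by a generator is invisible to it, which is the separation between the two pieces of support listed earlier.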

tgolsson commented 1 year ago

OK; so I think we're arriving at what I have:

https://github.com/tgolsson/pants-backend-k8s/blob/main/pants-plugins/backends/kubernetes/target_types.py#L88

However; let's say there's a user error and they manage to get two .yamls in their sandbox... I still need a way to figure out which one is the file the user wants to run on.

benjyw commented 1 year ago

Yeah, this is where your use-case differs from "classic" codegen. In the normal codegen use-case you don't "act on" generated code, you place it in the sandbox to be imported as needed by the using code.

I think you might have to error in that case?
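One way to "error in that case" is a small guard that runs before kubectl is invoked. The sketch below is plain Python with hypothetical names (`AmbiguousManifestError`, `single_manifest`); it assumes manifests can be recognised by a `.yaml` or `.yml` suffix, which may not hold for every setup.

```python
class AmbiguousManifestError(Exception):
    """Raised when a kubectl target does not resolve to exactly one manifest."""


def single_manifest(address, files):
    """Return the only YAML manifest in files, or raise with a readable message."""
    manifests = sorted(f for f in files if f.endswith((".yaml", ".yml")))
    if len(manifests) != 1:
        found = ", ".join(manifests) or "none"
        raise AmbiguousManifestError(
            f"{address} must resolve to exactly one manifest, "
            f"found {len(manifests)}: {found}"
        )
    return manifests[0]


# Non-manifest files in the sandbox are ignored; one manifest is accepted.
print(single_manifest(":home-cluster", ["manifest.yaml", "deploy.sh"]))
```

Listing the offending files in the message gives the user something concrete to fix, for example an extra dependency that pulled in a second manifest.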

tgolsson commented 1 year ago

Makes sense; yeah. I'll have a think. It's not a hugely complicated thing to resolve the paths manually. I already do something similar to resolve some/path:bin as a label to some/path/bin.pex... Would just like it if I didn't have to :P https://github.com/tgolsson/pants-backend-k8s/blob/main/pants-plugins/backends/kustomize/codegen.py#L123
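For readers unfamiliar with the manual resolution mentioned here, a minimal sketch of mapping a label like some/path:bin to some/path/bin.pex might look as follows. The function name `pex_output_path` is hypothetical and this is not the code at the linked file; it assumes the artifact is named after the target and written next to it.

```python
import posixpath


def pex_output_path(label, suffix=".pex"):
    """Map a label such as 'some/path:bin' to 'some/path/bin.pex'.

    Assumes the artifact is named after the target and lives in the
    target's directory; that is a convention, not a guarantee.
    """
    directory, sep, name = label.partition(":")
    if not sep or not name:
        raise ValueError(f"expected a label of the form 'dir:name', got {label!r}")
    return posixpath.join(directory, name + suffix)


print(pex_output_path("some/path:bin"))
```

The weakness is the one raised in the original question: if the output naming convention changes, every hard-coded mapping like this silently breaks.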

tgolsson commented 1 year ago

Just thinking out loud, I wonder if this pattern would be clearer if there was something like a SingleDependencyField. I had a look but it doesn't look so straightforward since Dependencies is a sequence...

benjyw commented 1 year ago

Even that wouldn't help you though, because that dependency could have multiple dependencies. You don't really control the transitive closure...
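The objection can be seen on a toy graph. In this plain-Python sketch the addresses and the `GRAPH` dict are hypothetical: the kubectl target has exactly one direct dependency, yet its transitive closure still contains several targets, any of which could contribute a manifest.

```python
# Hypothetical dependency graph: address -> direct dependencies.
GRAPH = {
    ":home-cluster": [":backend"],  # exactly one direct dependency
    ":backend": [":base", ":overlays"],  # but it has dependencies of its own
    ":base": [],
    ":overlays": [":base"],
}


def closure(address, graph):
    """Return all addresses reachable from address, excluding address itself."""
    reached = set()
    pending = list(graph[address])
    while pending:
        current = pending.pop()
        if current not in reached:
            reached.add(current)
            pending.extend(graph[current])
    return reached


direct = GRAPH[":home-cluster"]
print(len(direct), sorted(closure(":home-cluster", GRAPH)))
```

Restricting the field to a single entry limits only the first hop, so a check on the resolved files is still needed.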

thejcannon commented 1 year ago

> Yeah, this is where your use-case differs from "classic" codegen. In the normal codegen use-case you don't "act on" generated code, you place it in the sandbox to be imported as needed by the using code.

FWIW I think we're starting to see this assumption break down more and more. "Codegen" could be downloading a script/generating one, etc...