New Starlark fn does not support custom parameters nor imperative run

yhrn commented 3 years ago

Apologies if this is premature feedback - I realize that this is yet to be released officially. Feel free to close if this is already planned for.

I tried out the new Starlark fn and it's pretty limited compared to the current built in alpha functionality:

There is no way to pass parameters to the Starlark code, the function requires a custom config that only takes a single parameter source
Because it requires a custom config it can't be run imperatively.

What I would like is to be able to do something like:

kpt fn run . --image gcr.io/kpt-fn/starlark:unstable -- source="$(cat my-func.star)" some-param=xyz

Less important, but it would also be nice to be able to use a custom config structure that gets passed into the starlark function (in ctx.resource_list["functionConfig"]) like you can today with the Starlark runtime

mikebz commented 3 years ago

@mengqiy heads up

mengqiy commented 3 years ago

@yhrn Thanks a lot for your feedbacks!

There is no way to pass parameters to the Starlark code, the function requires a custom config that only takes a single parameter source

This is planed to be solved in https://github.com/GoogleContainerTools/kpt/issues/1560.

Because it requires a custom config it can't be run imperatively.

To support it, we will need to support ConfigMap as fn config. We will share the design with you when we have something.

mengqiy commented 3 years ago

This is becoming a blocker for Spotify: https://github.com/GoogleContainerTools/kpt/issues/1834#issuecomment-862089796 There are a few things planned for starlark function:

Support custom parameters: https://github.com/GoogleContainerTools/kpt/issues/1560
Support restructured results: https://github.com/GoogleContainerTools/kpt/issues/2201

Re. supporting imperative run, @frankfarzan I remember you have some thoughts. Can you please elaborate?

mengqiy commented 3 years ago

@yhrn Can you please explain a bit why imperative run is important to you? If the developer experience is the reason, we can provide a small tool to wrap the starlark script for you. The UX would be something like:

$ wrapStarlarkScript --input=path/to/source.star --output=path/to/starlark-fn-config.yaml
$ kpt fn eval --image gcr.io/kpt-fn/startlark:v0.1 --fn-config path/to/starlark-fn-config.yaml

Is that reasonable to you?

frankfarzan commented 3 years ago

wrapStarlarkScript can also be a preprocessor function:

# Expand StarlarkRun resource
$ kpt fn eval --image gcr.io/kpt-fn/generate-starlark-run --mount ...
# Run starklark function with expanded StarlarkRun (Or alternatively run starlark using eval)
$ kpt fn render

You only need to run the generate function when there's a change in the starlark script.

Alternatively, there can be a generic function that can expand KRM fields pointing to file contents which can be used for other functions like gatekeeper:

# Expand KRM fields referencing (via a setter-like comment) a file.
# This can only be run imperatively.
$ kpt fn eval --image gcr.io/kpt-fn/file-substitution --mount ...
# Run starlark function with expanded StarlarkRun
$ kpt fn render

This option is more generally applicable to executive configuration pattern:

https://kpt.dev/book/05-developing-functions/04-executable-configuration

Both of these can be done without changing kpt CLI, as they're purely optional porcelain.

If the use case is to run Starlark with eval and provide the star script using CLI arguments, you can also have starlark function effectively embed and invoke generate-starlark-run where the fnConfig is generated on the fly (or rather, there's no fnConfig in that case). Obviously, you can't use this with render. This approach doesn't make sense for gatekeeper since the policies are also used server-side, so you actually want the expanded KRM resource.

yhrn commented 3 years ago

Imperative run is important to us because we often do out-of-place rendering. Also, having a raw *.star file significantly improves the developer experience. I suppose the wrapper function idea works but I don't think it is very intuitive and it would also need to support passing parameters to the actual function.

Compare the suggested workaround to:

kpt fn run --enable-star --star-path fn.star -- foo=bar

mengqiy commented 3 years ago

I have a proposal here:

functionConfig

Add a new field called parameters in StarlarkRun functionConfig to allow users to provide custom parameters. parameters is a map from string to string.

An example StarlarkRun:

apiVersion: fn.kpt.dev/v1alpha1
kind: StarlarkRun
metadata:
  name: set-namespace-to-prod
parameters:
  namespace: prod
source: |
  def setnamespace(resources, namespace):
    for resource in resources:
      resource["metadata"]["namespace"] = namespace
  ns_value = ctx.resource_list["functionConfig"]["parameters"]["namespace"] # parameters can be accessed like this
  setnamespace(ctx.resource_list["items"], ns_value)

Developer Experience

First, user needs to install the starlark binary.

$ go get github.com/GoogleContainerTools/kpt-functions-catalog/functions/go/starlark@<the-desired-version>

Users are going to develop and iterate on this starlark script using the following command.

$ kpt fn eval --exec "starlark path/to/sourcefile.star"  -- param1=foo param2=bar

Running the above command is equivalent to the following:

$ kpt fn eval --image gcr.io/kpt-fn/starlark:v0.1 --fn-config starlark-fn-config.yaml

where starlark-fn-config.yaml is:

apiVersion: fn.kpt.dev/v1alpha1
kind: StarlarkRun
metadata:
  name: starlark-fn-config
parameters:
  param1: foo
  param2: bar
source: |
  # content of sourcefile.star
  starlark source comes here ...

To generate a StarlarkRun resource with source and parameters. Option 1: Use function gcr.io/kpt-fn/gen-starlark-run

$ kpt fn eval --image gcr.io/kpt-fn/gen-starlark-run:v0.1 --mount type=bind,src=/absolute/path/to/starlarksource,dst=/starlarksource -- param1=foo param2=bar

Option 2: Use function file-substitution. Details TBD.

Implementation Details

Changes to `starlark` binary

Currently the starlark binary (github.com/GoogleContainerTools/kpt-functions-catalog/functions/go/starlark) is only used in the starlark docker image. We are going to allow user to use it as a binary in development.

We need to make it go getable.
Allow the parameters field in StarlarkRun
It currently doesn't accept any arguments. We need to change it to accept one optional argument which is the path to a raw starlark source. And when the argument is provided, the starlark binary also accept a ConfigMap to provide parameters.

mengqiy commented 3 years ago

@yhrn WDYT?

bzub commented 3 years ago

Seems like the current implementation as well as the proposed enhancements/developer workflow are still comprised of key/value string pairs. Would it be a bad pattern if StarlarkRun used a ConfigMap for fnconfig, and reserved certain keys or key prefixes or suffixes for things like source (rename the key to main.star?), and all other data elements are used as "parameters" for the script? Could the function update its own config file if it finds a starlark script file on disk with a name that matches (main.star for example), and otherwise uses what is already in the fnconfig?

mengqiy commented 3 years ago

It sounds like you are suggesting something like this:

kpt fn eval --image gcr.io/kpt-fn/starlark:v0.1 -- source="$(cat my-func.star)" param1=foo param2=bar

ConfigMap as a functionConfig is designed to host simple key-value pairs only. If we want to do more than that, we should use CRD. It may be debatable that if reserve key source and make it accept a multi-line string is consider simple key-value pairs.

@droot @frankfarzan Thoughts?

yhrn commented 3 years ago

Regarding the proposal I think that it addresses the most important blocker from my PoW by supporting simple parameters. That said also think that this is still a significant regression compared to the built in Starlark runtime:

The developer UX is still significantly more complicated and there is a disconnect between what you run when iterating on your code and what you run "in prod". And this is IMO a pretty big deal - as a KRM advocate within our org I'm constantly faced with engineers being fatigued from learning all the new concepts and tools. The v0 Starlark functionality was a fantastic way of showing off kpt fn functionality and making it more accessible but I think that goes away with v1.
No support for complex fn params means that we can't use Starlark for writing a fn PoC using the intended API (client side CRD) and then later, possibly switch to a runtime more suitable for complex functions.

All in all, providing Starlark as a fn rather than a supported runtime just seems to give a lower fidelity experience.

mengqiy commented 3 years ago

We can support ConfigMap as the functionConfig unless I hear some major objection.

kpt fn eval --image starlark:v0.1 -- source=$(cat main.star) key1=val1 key2=val2

This should provide a simple enough developer experience. A developer can keeper iterate on it until the starlark script is ready.

After that the user may want to run the script with StarlarkRun as the functionConfig. Current imperative UX:

Edit myscript.star
Copy-paste it into myscript.yaml
Run kpt fn eval --image starlark:v0.1 --fn-config myscript.yaml

We can automate it a bit. Assume we have a function called include-file, it replaces fields using the content from a file. This function can be useful for stalark and gatekeeper. It can be yaml comment driven substitution. Then the workflow will be:

Edit myscript.star
Run kpt fn eval --image include-file --mount ...
Run kpt fn eval --image starlark:v0.1 --fn-config myscript.yaml

We can alternatively make starlark smarter. The starlark function will look the starlark script file in a specific location. If found, the starlark function will execute that script. A user can choose to mount the starlark file in the starlark container. The workflow will be:

Edit myscript.star
Run kpt fn eval --image starlark:v0.1 --mount ...

Current declarative UX:

Edit myscript.star
Copy-paste it into myscript.yaml
Run kpt fn render and the Kptfile points to myscript.yaml as the functionConfig.

We can automate it a bit: Assume include-file is same as mentioned above. The workflow will be:

Edit myscript.star
Run kpt fn eval --image include-file --mount ...
Run kpt fn render and the Kptfile points to myscript.yaml as the functionConfig.

@yhrn @bzub Thought about the above ideas?

yhrn commented 3 years ago

Being able to do kpt fn eval --image starlark:v0.1 -- source=$(cat main.star) key1=val1 key2=val2 would address my biggest remaining concern.

I guess the concern not addressed is that there is no way to pass a complex structure as configuration to the Starlark code. It means that you can no longer treat is as (mostly) an implementation detail if you choose to write your fn using Starlark or Go/Typescript, but maybe that is not a too common use case.

mengqiy commented 3 years ago

It's possible to support complex structure (which can be present by yaml) in params field in StarlarkRun. But we can't do it for ConfigMap. Otherwise, we are abusing ConfigMap.

yhrn commented 3 years ago

It's possible to support complex structure (which can be present by yaml) in params field in StarlarkRun. But we can't do it for ConfigMap. Otherwise, we are abusing ConfigMap.

Yes, that makes sense and I didn't mean that it should be supported for the ConfigMap case. It's not supported for the imperative case in v0 either so it wouldn't be worse from that perspective either.

If it's supported for the StarlarkRun case then I'm happy.

frankfarzan commented 3 years ago

Created this issue for the include-file function that's generally useful: https://github.com/GoogleContainerTools/kpt/issues/2350

mengqiy commented 3 years ago

Closed by GoogleContainerTools/kpt-functions-catalog#457

yhrn commented 3 years ago

@mengqiy shouldn't there be a release of the starlark kpt fn too before this is fully done? Right now the new functionality can only be accessed via the unstable tag.

bgrant0607 commented 3 years ago

With the code in a ConfigMap, is it possible to inline the Starlark code and parameters into the Kptfile?

mengqiy commented 3 years ago

@yhrn In case you missed it, the starlark:v0.2.0 function has been released.

mengqiy commented 3 years ago

With the code in a ConfigMap, is it possible to inline the Starlark code and parameters into the Kptfile?

@bgrant0607 It's possible. But I guess it's not recommended unless the starlark script is very short. Otherwise, it may make the Kptfile a little verbose.

kptdev / kpt