Function calls - Githubissues

jaredoconnell commented 1 year ago

This issue awaits approval of the proposal in the roundtable.

Please describe what you would like to see in this project

Function calls should be added to the expression language.

There are two types of functions that should be added to the expression language: preprocessor and runtime functions.

Runtime functions are the usual kind of function and can take input from steps.
Preprocessor functions can only take input from literals, or potentially from input variables. They will be used to enhance security.

The syntax for a standard runtime function call without parameters will be identifier(), where identifier is replaced with the name of the function. Parameters will be comma-separated expressions, including literals. Non-literal parameters will be evaluated before the function is executed.

Preprocessor functions change the syntax by adding an ! after the function name but before the parameter parentheses, for example identifier!().

The changes to the BNF (Backus–Naur form) are:

Adding function calls to the root of the expression: root ::= identifier | "$" | function_call | preprocessor_call
Adding function calls: function_call ::= identifier "(" parameter_list ")"
Adding preprocessor calls: preprocessor_call ::= identifier "!" "(" parameter_list_literal ")"
Adding parameter_list ::= "" | expression | expression "," parameter_list (This is recursive as shown. An empty list is valid.)
Adding parameter_list_literal ::= "" | value_literal | value_literal "," parameter_list_literal (This is recursive as shown. An empty list is valid.)
Changing sub-expressions to not require () since that will just be confusing: key ::= value_literal | "(" expression ")" -> key ::= root
Adding literals to root: root ::= value_literal | identifier | "$" | function_call | preprocessor_call
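
To make the call syntax concrete, here is a minimal Go sketch of a recognizer for the two call forms. It is not the project's parser: the Call type, ParseCall, and isIdentByte are hypothetical names, identifiers are assumed to be ASCII, string literals are assumed to be double-quoted without escapes, and parameters are kept as raw text rather than parsed into expressions. It also does not enforce that preprocessor parameters are literals, since value_literal is not defined here.

```go
package main

import (
	"fmt"
	"strings"
)

// Call is a hypothetical AST node for one parsed call.
type Call struct {
	Name         string
	Preprocessor bool     // true for the name!(...) form
	Args         []string // raw text of each top-level parameter
}

// isIdentByte reports whether c may appear in an identifier (ASCII assumed).
func isIdentByte(c byte, first bool) bool {
	letter := (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_'
	return letter || (!first && c >= '0' && c <= '9')
}

// ParseCall recognizes: identifier ["!"] "(" parameter_list ")".
// Parameters are split on top-level commas only, so nested calls and
// commas inside double-quoted strings stay within one parameter.
func ParseCall(src string) (Call, error) {
	src = strings.TrimSpace(src)
	i := 0
	for i < len(src) && isIdentByte(src[i], i == 0) {
		i++
	}
	if i == 0 {
		return Call{}, fmt.Errorf("expected identifier in %q", src)
	}
	call := Call{Name: src[:i]}
	if i < len(src) && src[i] == '!' {
		call.Preprocessor = true
		i++
	}
	if i >= len(src) || src[i] != '(' || src[len(src)-1] != ')' {
		return Call{}, fmt.Errorf("expected parenthesized parameters in %q", src)
	}
	body := src[i+1 : len(src)-1]
	if strings.TrimSpace(body) == "" {
		return call, nil // the empty parameter list is valid
	}
	depth, start, inString := 0, 0, false
	for j := 0; j < len(body); j++ {
		c := body[j]
		switch {
		case c == '"':
			inString = !inString
		case inString:
			// Characters inside a string literal never split parameters.
		case c == '(':
			depth++
		case c == ')':
			depth--
			if depth < 0 {
				return Call{}, fmt.Errorf("unbalanced ')' in %q", src)
			}
		case c == ',' && depth == 0:
			arg := strings.TrimSpace(body[start:j])
			if arg == "" {
				return Call{}, fmt.Errorf("empty parameter in %q", src)
			}
			call.Args = append(call.Args, arg)
			start = j + 1
		}
	}
	if inString || depth != 0 {
		return Call{}, fmt.Errorf("unterminated string or parenthesis in %q", src)
	}
	last := strings.TrimSpace(body[start:])
	if last == "" {
		return Call{}, fmt.Errorf("empty parameter in %q", src)
	}
	call.Args = append(call.Args, last)
	return call, nil
}

func main() {
	for _, src := range []string{`now()`, `file!("~/.kube/config")`, `outer(inner(a, b), "x,y")`} {
		call, err := ParseCall(src)
		fmt.Printf("%+v %v\n", call, err)
	}
}
```

A real implementation would recurse into each parameter with the expression parser instead of keeping raw text.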

Please describe your use case

The simplest use case is in place of an expression, so the value can be retrieved straight from the function.

Another use case is a function with an input of an expression. In this case, the inner expression would need to be evaluated before the function. The function would then be called with the input from the inner expression.

A more complex use case expands it further and allows an expression to be used as an input to a key in a map.
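
The evaluation order in these use cases can be sketched in Go as follows. This is an illustration only: Node, Literal, Ref, CallNode, MapKey, and toUpper are hypothetical names, not types from this project, and Ref is a simplified stand-in for a full expression such as a reference to step output. The point is that a call evaluates its parameters first, and that a call can supply the key for a map lookup.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical expression AST node.
type Node interface {
	Eval(data map[string]any) (any, error)
}

// Literal is a constant value.
type Literal struct{ Value any }

// Ref looks up a top-level key; it stands in for a full expression.
type Ref struct{ Key string }

// CallNode is a function call whose parameters are themselves expressions.
type CallNode struct {
	Fn   func(args []any) (any, error)
	Args []Node
}

// MapKey indexes the map produced by Target with the value produced by Key.
type MapKey struct {
	Target Node
	Key    Node
}

func (l Literal) Eval(_ map[string]any) (any, error) { return l.Value, nil }

func (r Ref) Eval(data map[string]any) (any, error) {
	v, ok := data[r.Key]
	if !ok {
		return nil, fmt.Errorf("unknown key %q", r.Key)
	}
	return v, nil
}

// Eval evaluates every parameter before invoking the function.
func (c CallNode) Eval(data map[string]any) (any, error) {
	args := make([]any, 0, len(c.Args))
	for _, a := range c.Args {
		v, err := a.Eval(data)
		if err != nil {
			return nil, err
		}
		args = append(args, v)
	}
	return c.Fn(args)
}

func (m MapKey) Eval(data map[string]any) (any, error) {
	target, err := m.Target.Eval(data)
	if err != nil {
		return nil, err
	}
	asMap, ok := target.(map[string]any)
	if !ok {
		return nil, fmt.Errorf("cannot index %T", target)
	}
	key, err := m.Key.Eval(data)
	if err != nil {
		return nil, err
	}
	keyStr, ok := key.(string)
	if !ok {
		return nil, fmt.Errorf("map key must be a string, got %T", key)
	}
	v, ok := asMap[keyStr]
	if !ok {
		return nil, fmt.Errorf("missing map key %q", keyStr)
	}
	return v, nil
}

// toUpper is a hypothetical one-parameter runtime function.
func toUpper(args []any) (any, error) {
	if len(args) != 1 {
		return nil, fmt.Errorf("toUpper expects 1 parameter, got %d", len(args))
	}
	s, ok := args[0].(string)
	if !ok {
		return nil, fmt.Errorf("toUpper expects a string, got %T", args[0])
	}
	return strings.ToUpper(s), nil
}

func main() {
	data := map[string]any{
		"name":     "greeting",
		"messages": map[string]any{"GREETING": "hello"},
	}
	// Roughly: messages[toUpper(name)]
	expr := MapKey{
		Target: Ref{Key: "messages"},
		Key:    CallNode{Fn: toUpper, Args: []Node{Ref{Key: "name"}}},
	}
	fmt.Println(expr.Eval(data))
}
```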

The preprocessor function calls will be used in cases where we want to run something before the rest of the workflow, for example accessing a file outside of the workflow working directory. That file would likely be accessed with a call like file!("~/.kube/config"), and we could then prompt the user at the beginning of the workflow rather than stalling the workflow at a later point.

Implementation details

A registry of functions will be created. The structures will be similar to the ones shown below.

First, we need to define a schema for the functions, which describes their inputs and outputs. It stores the parameter and return types.

type ArcaflowFunctionSchema[InputType any, OutputType any] struct {
    InputSchema schema.TypedType[InputType]
    OutputSchema schema.TypedType[OutputType]
}

Next, using that, the function interface will have a method that returns the schema, and a Run method that takes the input and the workflow context and returns the output or an error.

type ArcaflowFunction[InputType any, OutputType any] interface {
    Schema() ArcaflowFunctionSchema[InputType, OutputType]
    Run(input InputType, workflowContext map[string][]byte) (OutputType, error)
}

This data will be stored in a function registry. Individual functions will be defined in a sub-package of the registry package.
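
One possible shape for such a registry is sketched below. Everything in it is an assumption for illustration: Runner is a cut-down stand-in for ArcaflowFunction that omits Schema() so the example compiles without the SDK, and Registry, Register, Call, and strLen are hypothetical names. Because functions with different type parameters cannot share one map directly, the sketch wraps each function in a type-erased closure at registration time.

```go
package main

import "fmt"

// Runner is a simplified stand-in for the proposed ArcaflowFunction
// interface. Schema() is left out to keep the sketch self-contained.
type Runner[I any, O any] interface {
	Run(input I, workflowContext map[string][]byte) (O, error)
}

// erased is the type-erased form stored in the registry.
type erased func(input any, workflowContext map[string][]byte) (any, error)

// Registry maps function names to callable functions.
type Registry struct {
	functions map[string]erased
}

func NewRegistry() *Registry {
	return &Registry{functions: map[string]erased{}}
}

// Register adds fn under name. It is a top-level generic function
// because Go methods cannot declare their own type parameters.
func Register[I any, O any](r *Registry, name string, fn Runner[I, O]) error {
	if _, exists := r.functions[name]; exists {
		return fmt.Errorf("function %q is already registered", name)
	}
	r.functions[name] = func(input any, ctx map[string][]byte) (any, error) {
		typed, ok := input.(I)
		if !ok {
			return nil, fmt.Errorf("function %q: unexpected input type %T", name, input)
		}
		return fn.Run(typed, ctx)
	}
	return nil
}

// Call looks up a function by name and runs it.
func (r *Registry) Call(name string, input any, ctx map[string][]byte) (any, error) {
	fn, ok := r.functions[name]
	if !ok {
		return nil, fmt.Errorf("unknown function %q", name)
	}
	return fn(input, ctx)
}

// strLen is a hypothetical function that returns the length of a string.
type strLen struct{}

func (strLen) Run(input string, _ map[string][]byte) (int, error) {
	return len(input), nil
}

func main() {
	reg := NewRegistry()
	if err := Register[string, int](reg, "strLen", strLen{}); err != nil {
		panic(err)
	}
	fmt.Println(reg.Call("strLen", "hello", nil))
}
```

The input type check happens at call time here. With schemas attached to each function, the same check could instead be done before the workflow runs.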

Additional context

The functions available to call will be known before the expression is run. This makes type information available ahead of time, which would permit static type analysis.

ghost commented 1 year ago

@jaredoconnell another thought: the ArcaflowFunctionSchema declares its input schema. However, I think it would be better to change this to an interface, which looks like this:

type ArcaflowFunctionSchema interface {
    // InputSchemaHints gives one or more valid input types the function can accept. This is for documentation purposes only, but each hint, when passed to CanAcceptInput or OutputSchema, must produce a valid result.
    InputSchemaHints() []schema.Type
    // CanAcceptInput lets a function determine if it can accept a certain input.
    CanAcceptInput(schema.Type) bool
    // OutputSchema produces an output schema given an input schema.
    OutputSchema(schema.Type) (schema.Type, error)
}

Accordingly, we may change the ArcaflowFunction to not have generic types and accept an any as input and produce an any as output. This will make functions more flexible.
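
A rough sketch of what that could look like, under stated assumptions: TypeID is a hypothetical string-based stand-in for schema.Type so the example compiles on its own, and FunctionSchema, Function, and absFunction are illustrative names rather than anything in the SDK. The example function accepts either an integer or a float and reports an output type that matches its input, which is the kind of flexibility a fixed generic input type cannot express.

```go
package main

import "fmt"

// TypeID is a hypothetical stand-in for schema.Type.
type TypeID string

const (
	TypeString TypeID = "string"
	TypeInt    TypeID = "integer"
	TypeFloat  TypeID = "float"
)

// FunctionSchema mirrors the proposed interface, using TypeID.
type FunctionSchema interface {
	InputSchemaHints() []TypeID
	CanAcceptInput(TypeID) bool
	OutputSchema(TypeID) (TypeID, error)
}

// Function is the non-generic variant: any in, any out.
type Function interface {
	Schema() FunctionSchema
	Run(input any, workflowContext map[string][]byte) (any, error)
}

// absFunction returns the absolute value of an int64 or float64.
type absFunction struct{}

func (absFunction) InputSchemaHints() []TypeID {
	return []TypeID{TypeInt, TypeFloat}
}

func (absFunction) CanAcceptInput(t TypeID) bool {
	return t == TypeInt || t == TypeFloat
}

// OutputSchema returns the same type it was given, if that type is accepted.
func (a absFunction) OutputSchema(t TypeID) (TypeID, error) {
	if !a.CanAcceptInput(t) {
		return "", fmt.Errorf("abs cannot accept input of type %q", t)
	}
	return t, nil
}

func (a absFunction) Schema() FunctionSchema { return a }

func (absFunction) Run(input any, _ map[string][]byte) (any, error) {
	switch v := input.(type) {
	case int64:
		if v < 0 {
			return -v, nil
		}
		return v, nil
	case float64:
		if v < 0 {
			return -v, nil
		}
		return v, nil
	default:
		return nil, fmt.Errorf("abs: unsupported input type %T", input)
	}
}

func main() {
	var fn Function = absFunction{}
	// The static check can run before any data exists.
	outType, err := fn.Schema().OutputSchema(TypeInt)
	fmt.Println(outType, err)
	fmt.Println(fn.Run(int64(-3), nil))
}
```

The schema check can be run when the expression is parsed, and Run only needs to handle the inputs the schema already accepted.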

dustinblack commented 6 months ago

@jaredoconnell Isn't this closed with #16 ?

jaredoconnell commented 6 months ago

Yes. This is out of sync with JIRA.

arcalot / arcaflow-expressions

Function calls #1
