google / cel-java

Fast, portable, non-Turing complete expression evaluation with gradual typing (Java)
https://cel.dev
Apache License 2.0
162 stars 21 forks source link
cel expression expression-language expression-parser java

Common Expression Language for Java

The Common Expression Language (CEL) is a non-Turing complete language designed for simplicity, speed, safety, and portability. CEL's C-like syntax looks nearly identical to equivalent expressions in C++, Go, Java, and TypeScript.

// Check whether a resource name starts with a group name.
resource.name.startsWith("/groups/"+auth.claims.group)
// Determine whether the request is in the permitted time window.
request.time - resource.age < duration("24h")
// Check whether all resource names in a list match a given filter.
auth.claims.email_verified && resources.all(r, r.startsWith(auth.claims.email))

A CEL "program" is a single expression. The examples have been tagged as java, go, and typescript within the markdown to showcase the commonality of the syntax.

CEL is ideal for lightweight expression evaluation when a fully sandboxed scripting language is too resource intensive.



Quick Start

Install

CEL-Java is available in Maven Central Repository. Download the JARs here or add the following to your build dependencies:

Maven (pom.xml):

<dependency>
  <groupId>dev.cel</groupId>
  <artifactId>cel</artifactId>
  <version>0.8.0</version>
</dependency>

Gradle

implementation 'dev.cel:cel:0.8.0'

Then run this example:

import dev.cel.common.CelAbstractSyntaxTree;
import dev.cel.common.CelValidationException;
import dev.cel.common.types.SimpleType;
import dev.cel.compiler.CelCompiler;
import dev.cel.compiler.CelCompilerFactory;
import dev.cel.runtime.CelEvaluationException;
import dev.cel.runtime.CelRuntime;
import dev.cel.runtime.CelRuntimeFactory;
import java.util.Map;

public class HelloWorld {
  // Construct the compilation and runtime environments.
  // These instances are immutable and thus trivially thread-safe and amenable to caching.
  private static final CelCompiler CEL_COMPILER =
      CelCompilerFactory.standardCelCompilerBuilder().addVar("my_var", SimpleType.STRING).build();
  private static final CelRuntime CEL_RUNTIME =
      CelRuntimeFactory.standardCelRuntimeBuilder().build();

  public void run() throws CelValidationException, CelEvaluationException {
    // Compile the expression into an Abstract Syntax Tree.
    CelAbstractSyntaxTree ast = CEL_COMPILER.compile("my_var + '!'").getAst();

    // Plan an executable program instance.
    CelRuntime.Program program = CEL_RUNTIME.createProgram(ast);

    // Evaluate the program with an input variable.
    String result = (String) program.eval(Map.of("my_var", "Hello World"));
    System.out.println(result); // 'Hello World!'
  }
}

Overview

Determine the variables and functions you want to provide to CEL. Parse and check an expression to make sure it's valid. Then evaluate the output AST against some input. Checking is optional, but strongly encouraged.

Environment Setup

Configuration for the entire CEL stack can be done all at once via the CelFactory.standardCelBuilder(), or can be composed into compilation and evaluation via the CelCompilerFactory and CelRuntimeFactory.

The simplest form of CEL usage is as follows:

Cel cel = CelFactory.standardCelBuilder().build();

More commonly, your application will want to configure type-checking separately from the runtime. Use CelCompilerFactory to construct a compilation environment and declare the types, macros, variables, and functions to use with your CEL application:

// Example environment for the following expression:
//    resource.name.startsWith('/groups/' + group)
CelCompiler cel = CelCompilerFactory.standardCelCompilerBuilder()
    .setStandardMacros(CelStandardMacro.HAS)
    .setContainer("google.rpc.context.AttributeContext")
    .addMessageTypes(AttributeContext.getDescriptor())
    .addVar("resource",
        StructTypeReference.create("google.rpc.context.AttributeContext.Resource"))
    .addVar("group", SimpleType.STRING)
    .build();

More information about the features which are supported on the builder may be found in the CelCompilerBuilder.

Parsing

Some CEL use cases only require parsing of an expression in order to be useful. For example, one example might want to check whether the expression contains any nested comprehensions, or possibly to pass the parsed expression to a C++ or Go binary for evaluation. Presently, Java does not support parse-only evaluation.

CelValidationResult parseResult =
    cel.parse("resource.name.startsWith('/groups/' + group)");
try {
  return parseResult.getAst();
} catch (CelValidationException e) {
  // Handle exception...
}

Checking

Type-checking is performed on CelAbstractSyntaxTree values to ensure that the expression is well formed and all variable and function references are defined.

Type-checking can be performed immediately after parsing an expression:

try {
  CelValidationResult parseResult =
      cel.parse("resource.name.startsWith('/groups/' + group)");
  CelValidationResult checkResult = cel.check(parseResult.getAst());
  return checkResult.getAst();
} catch (CelValidationException e) {
  // Handle exception...
}

Or, the parse and type-check can be combined into the compile call. This is likely the more common need.

CelValidationResult compileResult =
    cel.compile("resource.name.startsWith('/groups/' + group)");
try {
  return compileResult.getAst();
} catch (CelValidationException e) {
  // Handle exception...
}

Macros

Macros were introduced to support optional CEL features that might not be desired in all use cases without the syntactic burden and complexity such features might desire if they were part of the core CEL syntax. Macros are expanded at parse time and their expansions are type-checked at check time.

For example, when macros are enabled it is possible to support bounded iteration / fold operators. The macros all, exists, exists_one, filter, and map are particularly useful for evaluating a single predicate against list and map values.

// Ensure all tweets are less than 140 chars
tweets.all(t, t.size() <= 140)

The has macro is useful for unifying field presence testing logic across protobuf types and dynamic (JSON-like) types.

// Test whether the field is a non-default value if proto-based, or defined
// in the JSON case.
has(message.field)

Both cases traditionally require special syntax at the language level, but these features are exposed via macros in CEL.

Refer to the CEL Specification for full listings of available macros. To leverage them, simply set the desired macros via setStandardMacros on the builder:

CelCompiler.standardCelBuilder()
  .setStandardMacros(CelStandardMacro.STANDARD_MACROS)

Evaluation

Expressions can be evaluated using once they are type-checked/compiled by creating a CelRuntime.Program from a CelAbstractSyntaxTree:

CelRuntime celRuntime = CelRuntimeFactory.standardCelRuntimeBuilder().build();
try {
  CelRuntime.Program program = celRuntime.createProgram(compileResult.getAst());
  return program.eval(
      ImmutableMap.of(
          "resource", Resource.newBuilder().setName("/groups/").build(),
          "group", "admin"
      ));
} catch (CelEvaluationException e) {
  // Handle evaluation exceptions ...
}

The evaluation is thread-safe and side effect free thus many different inputs can be sent to the same cel.Program.

Partial State

In distributed apps it is not uncommon to have edge caches and central services. If possible, evaluation should happen at the edge, but it isn't always possible to know the full state required for all values and functions present in the CEL expression.

To improve the odds of successful evaluation with partial state, CEL uses commutative logical operators &&, ||. If an error or unknown value (not the same thing) is encountered on the left-hand side, the right-hand side is evaluated also to determine the outcome. While it is possible to implement evaluation with partial state without this feature, this method was chosen because it aligns with the semantics of SQL evaluation and because it's more robust to evaluation against dynamic data types such as JSON inputs.

In the following truth-table, the symbols <x> and <y> represent error or unknown values, with the ? indicating that the branch is not taken due to short-circuiting. When the result is <x, y> this means that the both args are possibly relevant to the result.

Expression Result
false && ? false
true && false false
<x> && false false
true && true true
true && <x> <x>
<x> && true <x>
<x> && <y> <x, y>
true \|\| ? true
false \|\| true true
<x> \|\| true true
false \|\| false false
false \|\| <x> <x>
<x> \|\| false <x>
<x> \|\| <y> <x, y>

Errors

Parse and check errors have friendly error messages with pointers to where the issues occur in source:

ERROR: <input>:1:40: undefined field 'undefined'
    | TestAllTypes{single_int32: 1, undefined: 2}
    | .......................................^`,

Both the parsed and checked expressions contain source position information about each node that appears in the output AST. This information can be used to determine error locations at evaluation time as well.

Extensions

CEL-Java offers a suite of canonical extensions to support commonly needed features that falls outside the CEL specification.

Examples:

// String manipulation
'hello hello'.replace('he', 'we')     // returns 'wello wello'
'hello hello hello'.split(' ')     // returns ['hello', 'hello', 'hello']

// Math extensions
math.greatest(-42.0, -21.5, -100.0)   // -21.5
math.least(-42.0, -21.5, -100.0)   // -100.0

// Proto extensions
proto.getExt(msg, google.expr.proto2.test.int32_ext) // returns int value

// Local bindings
cel.bind(a, 'hello',
    cel.bind(b, 'world', a + b + b + a)) // "helloworldworldhello"

Common Questions

Why not JavaScript, Lua, or WASM?

JavaScript and Lua are rich languages that require sandboxing to execute safely. Sandboxing is costly and factors into the "what will I let users evaluate?" question heavily when the answer is anything more than O(n) complexity.

CEL evaluates linearly with respect to the size of the expression and the input being evaluated when macros are disabled. The only functions beyond the built-ins that may be invoked are provided by the host environment. While extension functions may be more complex, this is a choice by the application embedding CEL.

But, why not WASM? WASM is an excellent choice for certain applications and is far superior to embedded JavaScript and Lua, but it does not have support for garbage collection and non-primitive object types require semi-expensive calls across modules. In most cases CEL will be faster and just as portable for its intended use case, though for node.js and web-based execution CEL too may offer a WASM evaluator with direct to WASM compilation.

Do I need to Parse and Check?

Checking is an optional, but strongly suggested, step in CEL expression validation. It is sufficient in some cases to simply Parse and rely on the runtime bindings and error handling to do the right thing.

Where can I learn more about the language?

How can I contribute?

Dependencies

Java 8 or newer is required.

Library Details
Guava N/A
RE2/J N/A
Protocol Buffers Full or lite runtime
ANTLR4 Java runtime

License

Released under the Apache License.

Disclaimer: This is not an official Google product.