opensearch-project / data-prepper

Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.
https://opensearch.org/docs/latest/clients/data-prepper/index/
Apache License 2.0

[BUG] Regex matching with Data Prepper Expression throws error when using $ #3514

Open travisbenedict opened 10 months ago

travisbenedict commented 10 months ago

Describe the bug

Using a regex that contains a $ in a Data Prepper expression causes this error:

2023-10-17T16:24:04,064 [simple-pipeline-processor-worker-1-thread-1] ERROR org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator - Failed to evaluate route. This route will not be applied to any events.
org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate statement "/app =~ "-service$""
    at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:41) ~[data-prepper-expression-2.5.0.jar:?]
    at org.opensearch.dataprepper.expression.ExpressionEvaluator.evaluateConditional(ExpressionEvaluator.java:28) ~[data-prepper-api-2.5.0.jar:?]
    at org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator.findMatchedRoutes(RouteEventEvaluator.java:64) [data-prepper-core-2.5.0.jar:?]
    at org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator.evaluateEventRoutes(RouteEventEvaluator.java:45) [data-prepper-core-2.5.0.jar:?]
    at org.opensearch.dataprepper.pipeline.router.Router.route(Router.java:39) [data-prepper-core-2.5.0.jar:?]
    at org.opensearch.dataprepper.pipeline.Pipeline.publishToSinks(Pipeline.java:335) [data-prepper-core-2.5.0.jar:?]
    at org.opensearch.dataprepper.pipeline.ProcessWorker.postToSink(ProcessWorker.java:151) [data-prepper-core-2.5.0.jar:?]
    at org.opensearch.dataprepper.pipeline.ProcessWorker.doRun(ProcessWorker.java:133) [data-prepper-core-2.5.0.jar:?]
    at org.opensearch.dataprepper.pipeline.ProcessWorker.run(ProcessWorker.java:60) [data-prepper-core-2.5.0.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: org.opensearch.dataprepper.expression.ParseTreeCompositeException
    at org.opensearch.dataprepper.expression.ParseTreeParser.createParseTree(ParseTreeParser.java:78) ~[data-prepper-expression-2.5.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeParser.parse(ParseTreeParser.java:101) ~[data-prepper-expression-2.5.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeParser.parse(ParseTreeParser.java:27) ~[data-prepper-expression-2.5.0.jar:?]
    at org.opensearch.dataprepper.expression.MultiThreadParser.parse(MultiThreadParser.java:35) ~[data-prepper-expression-2.5.0.jar:?]
    at org.opensearch.dataprepper.expression.MultiThreadParser.parse(MultiThreadParser.java:20) ~[data-prepper-expression-2.5.0.jar:?]
    at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:37) ~[data-prepper-expression-2.5.0.jar:?]
    ... 13 more
Caused by: org.opensearch.dataprepper.expression.ExceptionOverview: Multiple exceptions (2)
|-- org.antlr.v4.runtime.InputMismatchException: null
    at org.antlr.v4.runtime.DefaultErrorStrategy.sync(DefaultErrorStrategy.java:270)
|-- org.antlr.v4.runtime.LexerNoViableAltException: null
    at org.antlr.v4.runtime.atn.LexerATNSimulator.failOrAccept(LexerATNSimulator.java:309)

In this case the condition was:

app-logs: "/app =~ \"-service$\""

This is similar to the example in the documentation: https://github.com/opensearch-project/data-prepper/blob/main/docs/expression_syntax.md#reference-table

To Reproduce

Steps to reproduce the behavior:

Create a pipeline with a configuration similar to the following:

simple-pipeline:
  workers: 2
  delay: "5000"
  source:
    http:
        path: "/ingest"
  route:
    - app-logs: "/app =~ \"-service$\""
  sink:
    - stdout:
        routes:
            - app-logs

Send data to the pipeline

curl -k -XPOST -H "Content-Type: application/json" -d '[{"app": "-service"}]' http://localhost:2021/ingest

Check logs

Expected behavior

The provided expression should parse successfully.

dlvenable commented 10 months ago

You should be able to escape the dollar sign - \$. I think we can help here in two ways.

  1. Update the documentation to clearly state that $ is currently reserved and needs to be escaped.
  2. The reason we reserve $ is for our functions. Perhaps we can look for ${ instead of $ for those functions. Then it may not need to be reserved.

ipsi commented 7 months ago

Running Data Prepper 2.6.1, Docker Compose, receiving logs via source: http: ssl: false (the actual pipeline is a bit more involved and can be provided as needed).

FYI, it does not appear to work when escaping $. No errors, no relevant entries at all in the logs, even at trace level, it just straight up doesn't work.

Given the field "container_name":"/firefly_iii_core" and the configuration

route:
  - firefly-core: '/container_name =~ "^.firefly_iii_core\$"'

Nothing happens - the route isn't executed.

Dropping the $ results in the route working:

route:
  - firefly-core: '/container_name =~ "^.firefly_iii_core"'
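One possible explanation for the silent non-match, sketched in plain Java: inside a single-quoted YAML scalar the backslash is kept as-is, and if it then reaches java.util.regex unchanged (an assumption, not verified against Data Prepper's string handling), `\$` means a literal dollar sign rather than the end-of-input anchor. The class and method names below are made up for illustration.

```java
import java.util.regex.Pattern;

public class DollarEscapeDemo {
    // Returns true if the regex matches the entire input, like String.matches.
    static boolean matchesWhole(String input, String regex) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String value = "/firefly_iii_core";

        // Unescaped $ is the end-of-input anchor, so the value matches.
        System.out.println(matchesWhole(value, "^.firefly_iii_core$"));

        // \$ (backslash kept) demands a literal dollar sign, so the value does not match.
        System.out.println(matchesWhole(value, "^.firefly_iii_core\\$"));

        // The escaped pattern only matches a value that really ends in '$'.
        System.out.println(matchesWhole(value + "$", "^.firefly_iii_core\\$"));
    }
}
```

Under that assumption the route would evaluate to false for every event without logging anything, which is consistent with the behavior described above. The outcome is the same with a partial match (Matcher.find) instead of a whole-string match.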

Tangent

In what might be a related issue (or might need a new issue), I ran into this trying to understand why / doesn't seem to be handled correctly in == and =~ comparisons. e.g.,

route:
  - firefly-core: '/container_name == "/firefly_iii_core"'

Also doesn't work, again with absolutely nothing in the logs.

Escaping it with \/ results in a ParseTreeCompositeException with both == and =~, much the same as the initial error:

route:
  - firefly-core: '/container_name == "\/firefly_iii_core"'
  - firefly-c0re: '/container_name =~ "\/firefly_iii_core"'

data-prepper  | 2024-02-03T17:53:52,856 [docker2-processor-worker-5-thread-1] ERROR org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator - Failed to evaluate route. This route will not be applied to any events.
data-prepper  | org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate statement "/container_name =~ "\/firefly_iii_core""
data-prepper  |         at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:42) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ExpressionEvaluator.evaluateConditional(ExpressionEvaluator.java:28) ~[data-prepper-api-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator.findMatchedRoutes(RouteEventEvaluator.java:64) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator.evaluateEventRoutes(RouteEventEvaluator.java:45) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.router.Router.route(Router.java:39) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.Pipeline.publishToSinks(Pipeline.java:346) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.ProcessWorker.postToSink(ProcessWorker.java:157) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.ProcessWorker.doRun(ProcessWorker.java:139) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.ProcessWorker.run(ProcessWorker.java:61) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
data-prepper  |         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
data-prepper  |         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
data-prepper  |         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
data-prepper  |         at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
data-prepper  | Caused by: org.opensearch.dataprepper.expression.ParseTreeCompositeException
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeParser.createParseTree(ParseTreeParser.java:78) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeParser.parse(ParseTreeParser.java:101) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeParser.parse(ParseTreeParser.java:27) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.MultiThreadParser.parse(MultiThreadParser.java:35) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.MultiThreadParser.parse(MultiThreadParser.java:20) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:38) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         ... 13 more
data-prepper  | Caused by: org.opensearch.dataprepper.expression.ExceptionOverview: Multiple exceptions (3)
data-prepper  | |-- java.lang.NullPointerException: Throwable was null!
data-prepper  |     at org.opensearch.dataprepper.expression.ParseTreeCompositeException.mapNullToNullPointer(ParseTreeCompositeException.java:35)
data-prepper  | |-- java.lang.NullPointerException: Throwable was null!
data-prepper  |     at org.opensearch.dataprepper.expression.ParseTreeCompositeException.mapNullToNullPointer(ParseTreeCompositeException.java:35)
data-prepper  | |-- org.antlr.v4.runtime.LexerNoViableAltException: null
data-prepper  |     at org.antlr.v4.runtime.atn.LexerATNSimulator.failOrAccept(LexerATNSimulator.java:309)

Fun fact:

route:
  - firefly-core: '/container_name =~ "/firefly_iii_core"'

Triggers a completely unexpected exception:

data-prepper  | 2024-02-03T17:52:52,737 [docker2-processor-worker-5-thread-1] ERROR org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator - Failed to evaluate route. This route will not be applied to any events.
data-prepper  | org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate statement "/container_name =~ "/firefly_iii_core""
data-prepper  |         at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:42) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ExpressionEvaluator.evaluateConditional(ExpressionEvaluator.java:28) ~[data-prepper-api-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator.findMatchedRoutes(RouteEventEvaluator.java:64) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.router.RouteEventEvaluator.evaluateEventRoutes(RouteEventEvaluator.java:45) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.router.Router.route(Router.java:39) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.Pipeline.publishToSinks(Pipeline.java:346) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.ProcessWorker.postToSink(ProcessWorker.java:157) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.ProcessWorker.doRun(ProcessWorker.java:139) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.pipeline.ProcessWorker.run(ProcessWorker.java:61) [data-prepper-core-2.6.1.jar:?]
data-prepper  |         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
data-prepper  |         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
data-prepper  |         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
data-prepper  |         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
data-prepper  |         at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
data-prepper  | Caused by: org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate the part of input statement: /container_name =~ "/firefly_iii_core"
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:41) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:17) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:39) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         ... 13 more
data-prepper  | Caused by: org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate the part of input statement: /container_name =~ "/firefly_iii_core"
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.exitEveryRule(ParseTreeEvaluatorListener.java:91) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.exitRule(ParseTreeWalker.java:63) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:38) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:37) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:17) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:39) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         ... 13 more
data-prepper  | Caused by: java.lang.IllegalArgumentException: '=~' requires right operand to be String.
data-prepper  |         at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143) ~[guava-32.1.2-jre.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.GenericRegexMatchOperator.evaluate(GenericRegexMatchOperator.java:41) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.GenericRegexMatchOperator.evaluate(GenericRegexMatchOperator.java:16) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.performSingleOperation(ParseTreeEvaluatorListener.java:104) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.exitEveryRule(ParseTreeEvaluatorListener.java:88) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.exitRule(ParseTreeWalker.java:63) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:38) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:37) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:17) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:39) ~[data-prepper-expression-2.6.1.jar:?]
data-prepper  |         ... 13 more
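
One reading of the "'=~' requires right operand to be String" message, offered as a guess rather than a diagnosis: if the expression grammar treats a double-quoted token that begins with / as an escaped JSON Pointer, then "/firefly_iii_core" would be looked up as an event field instead of being used as a string literal. The toy classifier below only illustrates that assumed rule. It is not Data Prepper's lexer, and every name in it is hypothetical.

```java
public class OperandKindSketch {
    enum Kind { JSON_POINTER, STRING, OTHER }

    // Hypothetical classification rule (an assumption for illustration only):
    // a bare token starting with '/' or a quoted token whose content starts
    // with '/' is a JSON Pointer; any other quoted token is a String.
    static Kind classify(String token) {
        boolean quoted = token.length() >= 2
                && token.startsWith("\"")
                && token.endsWith("\"");
        if (token.startsWith("/") || (quoted && token.startsWith("\"/"))) {
            return Kind.JSON_POINTER;
        }
        return quoted ? Kind.STRING : Kind.OTHER;
    }

    public static void main(String[] args) {
        // Left operand from the routes above.
        System.out.println(classify("/container_name"));
        // Right operand that triggered the IllegalArgumentException.
        System.out.println(classify("\"/firefly_iii_core\""));
        // Right operand from the route that worked.
        System.out.println(classify("\"^.firefly_iii_core\""));
    }
}
```

If that rule holds, the right operand resolves to a field that does not exist in the event, so =~ sees a non-String value, and == would compare against a missing value and quietly return false.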

If these look like different errors, I'm happy to raise a separate issue for that (or several!).