opensearch-project / data-prepper

Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.
https://opensearch.org/docs/latest/clients/data-prepper/index/
Apache License 2.0

Retrieve last item of a list #4354

Open lduriez opened 5 months ago

lduriez commented 5 months ago

Hello

Is your feature request related to a problem?

It may already be possible, but I would like a mutate processor that retrieves the last item of a list. For example:

{
  "my-list": [
    "value1",
    "value2",
    "value3"
  ]
}

Would become:

{
  ...
  "lastkey": "value3"
}

Additional context

I tried add_entries with an index of -1 to get it:

  processor:
    - add_entries:
        entries:
          - key: "my-key"
            value_expression: /my-list/-1

But it didn't work; I got the following error:

2024-03-28T16:30:37,711 [waf-log-pipeline-processor-worker-1-thread-1] ERROR org.opensearch.dataprepper.expression.ParseTreeEvaluator - Unable to evaluate event
org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate the part of input statement: /my-list/-1
    at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.exitEveryRule(ParseTreeEvaluatorListener.java:91) ~[data-prepper-expression-2.7.0.jar:?]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.exitRule(ParseTreeWalker.java:63) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:38) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:37) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:17) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:39) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.plugins.processor.mutateevent.AddEntryProcessor.doExecute(AddEntryProcessor.java:61) ~[mutate-event-processors-2.7.0.jar:?]
    at org.opensearch.dataprepper.model.processor.AbstractProcessor.lambda$execute$0(AbstractProcessor.java:54) ~[data-prepper-api-2.7.0.jar:?]
    at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:69) [micrometer-core-1.11.5.jar:1.11.5]
    at org.opensearch.dataprepper.model.processor.AbstractProcessor.execute(AbstractProcessor.java:54) [data-prepper-api-2.7.0.jar:?]
    at org.opensearch.dataprepper.pipeline.ProcessWorker.doRun(ProcessWorker.java:135) [data-prepper-core-2.7.0.jar:?]
    at org.opensearch.dataprepper.pipeline.ProcessWorker.run(ProcessWorker.java:61) [data-prepper-core-2.7.0.jar:?]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
    at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: java.lang.IllegalArgumentException: DIVIDE requires left operand to be either Float or Integer.
    at org.opensearch.dataprepper.expression.ArithmeticBinaryOperator.evaluate(ArithmeticBinaryOperator.java:46) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ArithmeticBinaryOperator.evaluate(ArithmeticBinaryOperator.java:16) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.performSingleOperation(ParseTreeEvaluatorListener.java:104) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.exitEveryRule(ParseTreeEvaluatorListener.java:88) ~[data-prepper-expression-2.7.0.jar:?]
    ... 18 more
2024-03-28T16:30:37,717 [waf-log-pipeline-processor-worker-1-thread-1] ERROR org.opensearch.dataprepper.plugins.processor.mutateevent.AddEntryProcessor - Error adding entry to record [org.opensearch.dataprepper.model.log.JacksonLog@50405800] with key [my-key], metadataKey [null], value_expression [/my-list/-1] format [null], value [null]
org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate statement "/my-list/-1"
    at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:42) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.plugins.processor.mutateevent.AddEntryProcessor.doExecute(AddEntryProcessor.java:61) ~[mutate-event-processors-2.7.0.jar:?]
    at org.opensearch.dataprepper.model.processor.AbstractProcessor.lambda$execute$0(AbstractProcessor.java:54) ~[data-prepper-api-2.7.0.jar:?]
    at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:69) [micrometer-core-1.11.5.jar:1.11.5]
    at org.opensearch.dataprepper.model.processor.AbstractProcessor.execute(AbstractProcessor.java:54) [data-prepper-api-2.7.0.jar:?]
    at org.opensearch.dataprepper.pipeline.ProcessWorker.doRun(ProcessWorker.java:135) [data-prepper-core-2.7.0.jar:?]
    at org.opensearch.dataprepper.pipeline.ProcessWorker.run(ProcessWorker.java:61) [data-prepper-core-2.7.0.jar:?]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
    at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate the part of input statement: /my-list/-1
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:41) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:17) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:39) ~[data-prepper-expression-2.7.0.jar:?]
    ... 11 more
Caused by: org.opensearch.dataprepper.expression.ExpressionEvaluationException: Unable to evaluate the part of input statement: /my-list/-1
    at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.exitEveryRule(ParseTreeEvaluatorListener.java:91) ~[data-prepper-expression-2.7.0.jar:?]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.exitRule(ParseTreeWalker.java:63) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:38) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:37) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:17) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:39) ~[data-prepper-expression-2.7.0.jar:?]
    ... 11 more
Caused by: java.lang.IllegalArgumentException: DIVIDE requires left operand to be either Float or Integer.
    at org.opensearch.dataprepper.expression.ArithmeticBinaryOperator.evaluate(ArithmeticBinaryOperator.java:46) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ArithmeticBinaryOperator.evaluate(ArithmeticBinaryOperator.java:16) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.performSingleOperation(ParseTreeEvaluatorListener.java:104) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluatorListener.exitEveryRule(ParseTreeEvaluatorListener.java:88) ~[data-prepper-expression-2.7.0.jar:?]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.exitRule(ParseTreeWalker.java:63) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:38) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:36) ~[antlr4-runtime-4.10.1.jar:4.10.1]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:37) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.ParseTreeEvaluator.evaluate(ParseTreeEvaluator.java:17) ~[data-prepper-expression-2.7.0.jar:?]
    at org.opensearch.dataprepper.expression.GenericExpressionEvaluator.evaluate(GenericExpressionEvaluator.java:39) ~[data-prepper-expression-2.7.0.jar:?]
    ... 11 more

dlvenable commented 5 months ago

Our current syntax uses JSON Pointer, so we could support this if JSON Pointer does.
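
For reference, RFC 6901 JSON Pointer defines array tokens as non-negative indices (plus `-`, which refers to the position after the last element), so `/my-list/-1` is not a standard pointer. Below is a minimal Python sketch of pointer resolution, not Data Prepper's implementation. The `allow_negative` flag and both function names are hypothetical, shown only to illustrate what an extension for negative indices might look like:

```python
def resolve_pointer(document, pointer, allow_negative=False):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data.

    allow_negative is a hypothetical extension (not part of RFC 6901)
    that lets array tokens such as "-1" count from the end of the list.
    """
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a JSON Pointer must be empty or start with '/'")

    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape per RFC 6901: "~1" becomes "/", then "~0" becomes "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[_array_index(token, len(current), allow_negative)]
        elif isinstance(current, dict):
            current = current[token]
        else:
            raise KeyError(f"cannot descend into a scalar at token {token!r}")
    return current


def _array_index(token, length, allow_negative):
    """Turn an array token into a list index, or raise KeyError."""
    is_number = token.isascii() and token.isdigit()
    # RFC 6901 array indices are "0" or digits without a leading zero.
    if is_number and (token == "0" or not token.startswith("0")):
        index = int(token)
    elif (allow_negative and token.startswith("-")
          and token[1:].isascii() and token[1:].isdigit()):
        index = length + int(token)  # hypothetical: -1 is the last element
    else:
        raise KeyError(f"invalid array index {token!r}")
    if not 0 <= index < length:
        raise KeyError(f"array index {token!r} is out of range")
    return index


event = {"my-list": ["value1", "value2", "value3"]}
last_strict = resolve_pointer(event, "/my-list/2")
last_extended = resolve_pointer(event, "/my-list/-1", allow_negative=True)
```

With `allow_negative` left at its default, `/my-list/-1` raises `KeyError`, which mirrors the strict reading of the specification.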

lduriez commented 5 months ago

Actually, I don't know whether JSON Pointer allows it, but I could do it in Logstash with the following configuration:

filter{
    ruby { 
      code => ' 
        if event.get("[action]") == "BLOCK" || event.get("[action]") == "CHALLENGE"
          event.set("terminatingRule.ruleId", event.get("[ruleGroupList][-1][terminatingRule][ruleId]"))
          event.set("terminatingRule.action", event.get("[ruleGroupList][-1][terminatingRule][action]"))
          event.set("terminatingRule.ruleMatchDetails", event.get("[ruleGroupList][-1][terminatingRule][ruleMatchDetails]"))
        end
      '
    }
}
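
To make the intent of the filter above explicit, here is a rough Python equivalent operating on a plain dict. It is only a sketch under assumptions: the function name `copy_terminating_rule` is hypothetical, the event is assumed to carry a top-level `action` field, and the flat dotted key names are a simplification of how the target fields are written.

```python
def copy_terminating_rule(event):
    """Copy terminatingRule fields from the last ruleGroupList entry."""
    # Only blocked or challenged requests are handled, as in the Ruby filter.
    if event.get("action") not in ("BLOCK", "CHALLENGE"):
        return event

    groups = event.get("ruleGroupList") or []
    # Python lists accept -1 directly, like the [-1] in the Ruby filter.
    last_group = groups[-1] if groups else {}
    rule = last_group.get("terminatingRule") or {}

    for field in ("ruleId", "action", "ruleMatchDetails"):
        # Missing fields become None, similar to a lookup that finds nothing.
        event["terminatingRule." + field] = rule.get(field)
    return event


example = copy_terminating_rule({
    "action": "BLOCK",
    "ruleGroupList": [
        {"ruleGroupId": "AWS#AWSManagedRulesCommonRuleSet",
         "terminatingRule": None},
        {"ruleGroupId": "AWS#AWSManagedRulesBotControlRuleSet",
         "terminatingRule": {"ruleId": "CategoryHttpLibrary",
                             "action": "BLOCK",
                             "ruleMatchDetails": None}},
    ],
})
```

The key step is the `groups[-1]` lookup, which is exactly what the requested feature would need to express in a pipeline expression.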

But I managed to achieve what I want for my use case by doing this:

  processor:
    - flatten:
        source: "ruleGroupList"
        target: "ruleGroupList_flattened"
        exclude_keys: ["ruleGroupId","nonTerminatingMatchingRules","excludedRules","customerConfig"]
        remove_list_indices: true
        flatten_when: /action != "ALLOW"
    - add_entries:
        entries:
          - key: "/terminatingRule/ruleId"
            format: "${/ruleGroupList_flattened/[].terminatingRule.ruleId}"
    - delete_entries:
        with_keys: ["ruleGroupList_flattened"]

The input looks like:

{
    "ruleGroupList": [
        {
            "ruleGroupId": "arn:aws:wafv2:us-east-1:***:global/rulegroup/secret",
            "terminatingRule": null,
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": null
        },
        {
            "ruleGroupId": "AWS#AWSManagedRulesCommonRuleSet",
            "terminatingRule": null,
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": null
        },
        {
            "ruleGroupId": "AWS#AWSManagedRulesKnownBadInputsRuleSet",
            "terminatingRule": null,
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": null
        },
        {
            "ruleGroupId": "AWS#AWSManagedRulesPHPRuleSet",
            "terminatingRule": null,
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": null
        },
        {
            "ruleGroupId": "AWS#AWSManagedRulesLinuxRuleSet",
            "terminatingRule": null,
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": null
        },
        {
            "ruleGroupId": "AWS#AWSManagedRulesSQLiRuleSet",
            "terminatingRule": null,
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": null
        },
        {
            "ruleGroupId": "AWS#AWSManagedRulesUnixRuleSet",
            "terminatingRule": null,
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": null
        },
        {
            "ruleGroupId": "AWS#AWSManagedRulesBotControlRuleSet",
            "terminatingRule": {
                "ruleId": "CategoryHttpLibrary",
                "action": "BLOCK",
                "ruleMatchDetails": null
            },
            "nonTerminatingMatchingRules": [],
            "excludedRules": null,
            "customerConfig": [
                {
                    "name": "InspectionLevel",
                    "value": "COMMON"
                },
                {
                    "name": "EnableMachineLearning",
                    "value": "null"
                }
            ]
        }
    ]
}

The output looks like:

{
    "terminatingRule": {
        "ruleId": "CategoryHttpLibrary"
    }
}