redpanda-data / connect


Assigning to metadata using the @-syntax in `branch` processor's `result_map` yields `null` #2293

Open torfjor opened 9 months ago

torfjor commented 9 months ago

This works:

logger:
  level: fatal
input:
  generate:
    mapping: root = {}
    count: 1
pipeline:
  processors:
    - branch:
        processors:
          - http:
              url: https://www.google.com/does-not-exist
          - catch: []
        result_map: meta http_status_code = meta("http_status_code")
    - mapping: root.http_status_code = @http_status_code
output:
  stdout:
    codec: lines

$ benthos -c pipeline.yaml
{"http_status_code":"404"}

Whereas this does not:

logger:
  level: fatal
input:
  generate:
    mapping: root = {}
    count: 1
pipeline:
  processors:
    - branch:
        processors:
          - http:
              url: https://www.google.com/does-not-exist
          - catch: []
        result_map: meta http_status_code = @http_status_code
    - mapping: root.http_status_code = @http_status_code
output:
  stdout:
    codec: lines

$ benthos -c pipeline.yaml
{"http_status_code":null}

Am I holding the tool wrong, or is this a bug?

$ benthos -version
Version: 4.24.0
Date: 2023-11-24T12:24:31Z

Jeffail commented 9 months ago

Hey @torfjor, the documentation is outdated and pretty unhelpful, but the behaviour is "correct" in the sense that it's intentional. Within a traditional mapping, the new message being created and the message being fed into the mapping are the same. The @ syntax therefore only differs from the metadata (and meta) function insofar as changes made within the mapping are not reflected by the metadata function, as it references a read-only instance of the source message.
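
To make that concrete, here is a minimal sketch of a plain mapping, assuming a source message that carries no `foo` metadata (the key `foo` and the field names `via_at` and `via_func` are made up for illustration). Going by the description above, the @ read should see the value assigned earlier in the same mapping, while the function should still see the untouched source message:

```yaml
pipeline:
  processors:
    - mapping: |
        # Hypothetical key: set metadata on the message being created.
        meta foo = "set in this mapping"
        # @ reads from the message being created, so it should reflect
        # the assignment on the line above.
        root.via_at = @foo
        # metadata() reads from the read-only source message, so it should
        # not reflect the assignment made in this mapping.
        root.via_func = metadata("foo")
```

This is only an illustration of the distinction, not a tested config, so treat the exact output as something to verify locally.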

The result_map works slightly differently: the message being created is a mutation of the message as it was before the branch processors ran, while the message being fed into the mapping is the new message that resulted from the branch processors. This means that @ and metadata can yield entirely different metadata values, as @ reads from the old message and metadata reads from the new one. In your second config, @http_status_code therefore reads from the original message, which never passed through the http processor, hence the null.
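
As a sketch of what that means for the config in question, the branch below copies the status code across with the `metadata` function rather than @. It reuses the URL and processors from the original report and only changes the right-hand side of the `result_map`; I haven't run this exact file against 4.24.0, so take it as a starting point rather than a verified fix:

```yaml
pipeline:
  processors:
    - branch:
        processors:
          - http:
              url: https://www.google.com/does-not-exist
          - catch: []
        # metadata() reads from the message produced by the branch processors.
        # Using @http_status_code here would read from the original message,
        # which has no such key.
        result_map: meta http_status_code = metadata("http_status_code")
    # Outside the branch this is an ordinary mapping, so @ works as expected.
    - mapping: root.http_status_code = @http_status_code
```

One difference from the `meta("http_status_code")` version to be aware of: `meta` returns the value as a string, whereas `metadata` returns the value as stored, so the printed status code may not be the quoted "404" shown earlier.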

We have a small section on the branch page that briefly explains the behaviour: https://www.benthos.dev/docs/components/processors/branch#metadata. However, the docs reference the older meta function and it could definitely do with being fleshed out a bit.