redpanda-data / connect

Fancy stream processing made operationally mundane
https://docs.redpanda.com/redpanda-connect/about/

Output resources connect even if unused #2523

Open alessandro-gomma opened 7 months ago

alessandro-gomma commented 7 months ago

Hi!

While building our streams, we tried to avoid code duplication by replacing our outputs with a resource. While trying different configurations, we noticed that if we declare multiple output resources, they all connect immediately on startup, even those that aren't referenced in any stream.

Here's our resource.yaml file:

output_resources:
  - label: "first_output"
    nats:
      urls: ["nats://127.0.0.1:1234"]
      subject: ${! @output_subject }
  - label: "second_output"
    nats:
      urls: ["nats://127.0.0.1:5678"]
      subject: ${! @output_subject }

Here's the stream file:

input:
  broker:
    inputs:
      [...]

pipeline:
  processors:
    [...]

output:
  label: output
  resource: "first_output"

And this is the startup script:

benthos -e "config.env" -c "./common/common.yaml" -r "./resources/*.yaml" -t "./templates/*.yaml" streams "./config/*.yaml"

In our solution, we would like to define multiple output resources in a single YAML file. Since we plan to have many different outputs in the same file, we want to be sure that only the referenced resources are actually instantiated. Is there any way to prevent the "second_output" resource from being instantiated when it is not referenced (e.g. lazy loading)?
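
For illustration, one possible workaround is sketched below. It is not lazy loading and has not been confirmed by anyone in this thread; it gives up the single-file layout asked for above. It assumes that only resource files matched by the -r argument are loaded, so each output resource is moved into its own file and only the needed file is passed to -r. The file name first_output.yaml is hypothetical.

```yaml
# ./resources/first_output.yaml (hypothetical file name)
# Assumption: only files matched by -r are loaded, so passing
#   -r "./resources/first_output.yaml"
# instead of -r "./resources/*.yaml" keeps "second_output" from being
# instantiated, because its file is never read.
output_resources:
  - label: "first_output"
    nats:
      urls: ["nats://127.0.0.1:1234"]
      subject: ${! @output_subject }
```

The trade-off is that the startup script has to list (or glob) exactly the resource files the deployed streams reference.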

kmpm commented 4 months ago

I have the same issue, but with inputs and NATS, so lazy loading would be beneficial there as well.