redpanda-data / connect

Fancy stream processing made operationally mundane
https://docs.redpanda.com/redpanda-connect/about/

Enhancement Request - Add option to generate message on start #1062

Open bhoriuchi opened 2 years ago

bhoriuchi commented 2 years ago

When using the generate input, there are use cases where the polling interval should be long, but an initial message at startup is still useful to trigger the stream once when it starts.

To summarize, the proposal is to add a configuration field to the generate input that generates a one-time initial message at startup.

Example

input:
  label: ""
  generate:
    mapping: ""
    interval: 1s
    generate_initial: true
    count: 0

Jeffail commented 2 years ago

Hey @bhoriuchi, what is the interval you're running in your use case? With an interval that's set to a discrete duration string, you should be seeing an event generated immediately. The exception is when the interval is a cron expression. It's possible that there's a bug here.
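
To make the distinction concrete, here is a minimal sketch of the two interval styles described above. The mapping body and the specific interval values are illustrative assumptions, not taken from the reporter's config, and the startup behaviour noted in the comments is what this comment describes rather than something verified here.

```yaml
# Variant 1: duration-string interval.
# As described above, the first message is expected immediately at startup,
# then one more every hour.
input:
  generate:
    mapping: 'root.source = "duration"'  # illustrative mapping (assumption)
    interval: 1h
---
# Variant 2: cron-expression interval.
# As described above, this is the exception: no message is expected until
# the schedule first matches.
input:
  generate:
    mapping: 'root.source = "cron"'      # illustrative mapping (assumption)
    interval: '0,30 */2 * * * *'         # illustrative cron schedule (assumption)
```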

bhoriuchi commented 2 years ago

I will do some additional testing to verify that the initial message isn't being lost somewhere in processing, but I actually do have a use case for sending an initial message while using cron expressions. I would like to de-dupe generated messages based on timestamp so that, if multiple Benthos instances generate messages on a cron schedule, duplicates are dropped. Having talked this out, I think I can actually accomplish this with something like:

input:
  broker:
    inputs:
    # once
    - generate:
        mapping: "bloblang here"
        interval: 1s
        count: 1
    - generate:
        mapping: "bloblang here"
        interval: "cron expression"

I'll test this out, but I think it will meet the requirement.
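
For reference, a filled-in version of the workaround above might look like the following sketch. The mapping (a timestamp field produced with Bloblang's `now()` function), the `reason` field, and the cron schedule are placeholders chosen for illustration; they are assumptions, not values from this thread. The idea is that the first generate input emits a single message and then closes because of `count: 1`, while the second keeps firing on the cron schedule.

```yaml
input:
  broker:
    inputs:
      # One-shot trigger: emits a single message at startup, then this input closes.
      - generate:
          mapping: |
            root.triggered_at = now()
            root.reason = "startup"    # hypothetical field, for illustration only
          interval: 1s
          count: 1
      # Scheduled trigger: emits a message each time the cron schedule matches.
      - generate:
          mapping: |
            root.triggered_at = now()
            root.reason = "schedule"   # hypothetical field, for illustration only
          interval: '0,30 */2 * * * *' # illustrative schedule (assumption)
```

Tagging each message with a field such as `reason` is one way to tell the startup trigger apart from scheduled ones downstream. De-duplicating across multiple instances would still need a shared key and cache, which this sketch does not cover.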