dashbitco / broadway

Concurrent and multi-stage data ingestion and data processing with Elixir
https://elixir-broadway.org
Apache License 2.0

Support creating metadata in Broadway.test_messages/3 #173

Closed fireproofsocks closed 4 years ago

fireproofsocks commented 4 years ago

The implementation of Broadway.test_messages/3 is too simplistic to be useful for many use cases. It would be nice if it were possible to set metadata on the test messages, rather than only mapping each input to the message's data field.

Currently, I have to build my own test messages, something like this:

alias Broadway.Message

# Acknowledger that sends the ack back to this test process, tagged with ref.
ref = make_ref()
ack = {Broadway.CallerAcknowledger, {self(), ref}, :ok}

msg = %Message{
  data: "xyz",
  metadata: %{
    message_attributes: %{
      "a" => %{value: "foo"},
      "b" => %{value: "bar"},
      "c" => %{value: "zoinks"}
    }
  },
  acknowledger: ack,
  batch_mode: :flush
}

# pid is the running Broadway pipeline under test.
:ok = Broadway.push_messages(pid, [msg])
assert_receive {:ack, ^ref, [_] = _successful, _failed}, 1000

I think an easier interface would be to allow passing a list of maps to Broadway.test_messages/3.
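
To make the list-of-maps idea concrete, here is a minimal sketch of a test helper built on the same pieces as the snippet above. The module `MyApp.BroadwayTestHelper` and its functions `build_messages/2` and `push_test_messages/2` are hypothetical names, not part of Broadway. The sketch assumes the `Broadway.Message` struct has the `data`, `metadata`, `acknowledger`, and `batch_mode` fields used above.

```elixir
defmodule MyApp.BroadwayTestHelper do
  # Hypothetical helper, not part of Broadway: builds one message per map
  # so that each message can carry its own metadata.
  alias Broadway.Message

  # specs is a list of maps such as %{data: "xyz", metadata: %{...}}.
  # :metadata is optional and defaults to an empty map.
  def build_messages(specs, ack) when is_list(specs) do
    Enum.map(specs, fn %{data: data} = spec ->
      %Message{
        data: data,
        metadata: Map.get(spec, :metadata, %{}),
        acknowledger: ack,
        batch_mode: :flush
      }
    end)
  end

  # Pushes the messages into the pipeline and returns the reference that
  # tags the acks sent back to the calling (test) process.
  def push_test_messages(broadway, specs) do
    ref = make_ref()
    ack = {Broadway.CallerAcknowledger, {self(), ref}, :ok}
    :ok = Broadway.push_messages(broadway, build_messages(specs, ack))
    ref
  end
end
```

In a test, the returned reference can be pinned in `assert_receive {:ack, ^ref, successful, failed}`, as in the hand-rolled snippet above.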

whatyouhide commented 4 years ago

@fireproofsocks are you looking for a way to add specific metadata to each message, different for different messages? Because if you want to add the same metadata to all messages, test_messages/3 supports a :metadata option that takes a map. Thoughts?
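
For reference, a short sketch of the shared-metadata case described here. It assumes `pid` is a Broadway pipeline already started by the test, and that the snippet runs inside an ExUnit test so `assert_receive` is available. The metadata contents are illustrative only.

```elixir
# Illustrative metadata; every message in this call receives the same map.
shared_metadata = %{
  message_attributes: %{"a" => %{value: "foo"}}
}

# Assumes pid is the running Broadway pipeline under test.
ref = Broadway.test_messages(pid, ["xyz", "abc"], metadata: shared_metadata)

# Only the reference is matched here; how the two messages are grouped
# into :ack messages depends on the pipeline.
assert_receive {:ack, ^ref, _successful, _failed}, 1000
```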

josevalim commented 4 years ago

Yes, please use the :metadata option. Thanks.

fireproofsocks commented 4 years ago

The :metadata option is helpful, but yes: each message may require its own metadata, so unfortunately it is not a solution that works for our use cases. We have a lot of pipelines (and this holds true for ETLs I've worked on in Python), and I've never seen messages that could all share the exact same metadata.

josevalim commented 4 years ago

You can publish the messages individually though, can't you? Is there any benefit in publishing multiple of them in your tests?
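
One way to read this suggestion is sketched below: call test_messages/3 once per message, passing that message's metadata through the :metadata option. It assumes `pid` is the running pipeline, that the code sits inside an ExUnit test, and that the pipeline acknowledges each message as successful. The `fixtures` list is a hypothetical shape chosen for illustration.

```elixir
# Hypothetical per-message fixtures: {data, metadata} pairs.
fixtures = [
  {"xyz", %{message_attributes: %{"a" => %{value: "foo"}}}},
  {"abc", %{message_attributes: %{"b" => %{value: "bar"}}}}
]

# One test_messages/3 call per message, each with its own :metadata.
# Assumes pid is the running Broadway pipeline under test.
for {data, metadata} <- fixtures do
  ref = Broadway.test_messages(pid, [data], metadata: metadata)

  # Each call returns its own reference, so each ack is awaited separately.
  assert_receive {:ack, ^ref, [_] = _successful, _failed}, 1000
end
```

This stays on the public API, at the cost of one call per message. Since each message is pushed on its own, it likely does not exercise how a batcher groups several messages together.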