dashbitco / broadway

Concurrent and multi-stage data ingestion and data processing with Elixir
https://elixir-broadway.org
Apache License 2.0

Update Documentation for Testing Pipelines #174

Closed. fireproofsocks closed this issue 4 years ago.

fireproofsocks commented 4 years ago

I had a hard time testing a pipeline in an app whose supervisor had already started it. The docs for https://hexdocs.pm/ex_unit/master/ExUnit.Callbacks.html?#start_supervised/2 did not provide a complete example (I'll probably submit a PR for that separately).

Since I imagine that there are many apps using Broadway that will start their Broadway pipelines in the app's supervisor, I think it would be more than a little helpful to include an example of how to start up the pipeline for tests.

Per this forum post: https://elixirforum.com/t/restarting-a-process/30686/6

the best I could come up with was to override the :name option passed to Broadway's start_link/1 function:

defmodule My.Pipeline do
  use Broadway

  def start_link(opts) do
    Broadway.start_link(__MODULE__,
      name: Keyword.get(opts, :name, __MODULE__),
      # ... etc ...
    )
  end

  # ...
end

And then in my test, I could do something like this:

{:ok, pid} =
  My.Pipeline.start_link(
    producer: Broadway.DummyProducer,
    queue_url: "fake-url",
    name: :something
  )
# ... create test messages ...
:ok = Broadway.push_messages(pid, messages)
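
For anyone landing here, the name-override approach above can be sketched as one self-contained script. This is only a sketch under assumptions, not something taken from the Broadway docs: the `~> 1.0` version, the `:test_pipeline` name, the `:producer` option handling and the upcasing `handle_message/3` are all hypothetical choices for illustration. It uses `start_supervised!/1` with an explicit child spec so ExUnit stops the test pipeline when the test ends.

```elixir
# Assumption: Broadway 1.x is available; inside a Mix project you would
# list it in mix.exs instead of calling Mix.install/1.
Mix.install([{:broadway, "~> 1.0"}])

defmodule My.Pipeline do
  use Broadway

  def start_link(opts) do
    Broadway.start_link(__MODULE__,
      # Tests can pass :name to start a second, independent instance.
      name: Keyword.get(opts, :name, __MODULE__),
      producer: [
        # Hypothetical: default to the dummy producer so the sketch runs
        # without a real queue; a real app would default to its own producer.
        module: {Keyword.get(opts, :producer, Broadway.DummyProducer), []},
        concurrency: 1
      ],
      processors: [default: [concurrency: 1]]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    # Placeholder business logic for the sketch.
    Broadway.Message.update_data(message, &String.upcase/1)
  end
end

ExUnit.start()

defmodule My.PipelineTest do
  use ExUnit.Case, async: true

  test "processes a message in a separately named pipeline" do
    opts = [name: :test_pipeline, producer: Broadway.DummyProducer]

    # Explicit child spec: ExUnit shuts this pipeline down after the test.
    start_supervised!(%{id: :test_pipeline, start: {My.Pipeline, :start_link, [opts]}})

    # test_message/2 sends the data through the pipeline and arranges for
    # an {:ack, ref, successful, failed} message to be sent to this process.
    ref = Broadway.test_message(:test_pipeline, "hello")
    assert_receive {:ack, ^ref, [%Broadway.Message{data: "HELLO"}], []}, 1_000
  end
end
```

Because the test instance has its own name, it can run alongside the instance started by the application's supervisor, with the caveat about shared queues mentioned later in this thread.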

Because Broadway.push_messages/3 does not appear to do anything with its opts, I had to adjust my module to rely on Application.get_env/3 to read options, and then use Application.put_env/3 in my tests to override them. That seems to have worked, and I think I now have a way to test the pipeline in its entirety, but I don't know whether this approach is good. Either way, it would be helpful to include more examples and explanations in the Broadway docs.
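
To make the Application.get_env/3 and Application.put_env/3 workaround concrete, here is a minimal sketch using only the standard library. The `:my_app` application name, the `:queue_url` key and the `My.Pipeline.Settings` module are hypothetical names chosen for illustration. The test restores the previous value on exit, since the application environment is global state shared by every test.

```elixir
defmodule My.Pipeline.Settings do
  # Hypothetical helper: the pipeline would call this at runtime instead of
  # capturing the value when it starts.
  def queue_url do
    Application.get_env(:my_app, :queue_url, "default-url")
  end
end

ExUnit.start()

defmodule My.Pipeline.SettingsTest do
  # Not async: the application environment is shared between tests.
  use ExUnit.Case, async: false

  setup do
    original = Application.fetch_env(:my_app, :queue_url)

    # Put the environment back the way we found it.
    on_exit(fn ->
      case original do
        {:ok, value} -> Application.put_env(:my_app, :queue_url, value)
        :error -> Application.delete_env(:my_app, :queue_url)
      end
    end)

    :ok
  end

  test "reads the value overridden by the test" do
    Application.put_env(:my_app, :queue_url, "fake-url")
    assert My.Pipeline.Settings.queue_url() == "fake-url"
  end
end
```

The trade-off is that such tests cannot safely run concurrently, which is one reason passing options through start_link/1 may be preferable where it is possible.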

whatyouhide commented 4 years ago

@fireproofsocks is there a reason you're not testing the pipeline that is started in the application's supervision tree? That's what I tend to do most of the time. If there's a reason, then I think the name option is the best way to go and it would be nice to document that. It comes with caveats too, though: for example, if you use RabbitMQ and consume from a queue, then your application's pipeline and the test pipeline will both consume messages if you do integration testing. Just something to note :)
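
Testing the pipeline that is already running in the supervision tree might look roughly like the sketch below. It rests on several assumptions: Broadway 1.x, a hypothetical `:producer_module` config key that the test environment points at Broadway.DummyProducer, and a plain Supervisor standing in for the application's real supervision tree so the script is self-contained. The string-reversing handler is a placeholder.

```elixir
# Assumption: Broadway 1.x; in a Mix project this lives in mix.exs.
Mix.install([{:broadway, "~> 1.0"}])

defmodule My.Pipeline do
  use Broadway

  def start_link(_opts) do
    # Hypothetical config key; config/test.exs would set it to the dummy
    # producer, which is also the fallback used here.
    producer = Application.get_env(:my_app, :producer_module, {Broadway.DummyProducer, []})

    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [module: producer, concurrency: 1],
      processors: [default: [concurrency: 1]]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    # Placeholder business logic for the sketch.
    Broadway.Message.update_data(message, &String.reverse/1)
  end
end

# Stand-in for the application's supervision tree.
{:ok, _sup} = Supervisor.start_link([{My.Pipeline, []}], strategy: :one_for_one)

ExUnit.start()

defmodule My.AppPipelineTest do
  use ExUnit.Case

  test "the already-running pipeline processes a test message" do
    # Address the pipeline by the name it was registered under at startup.
    ref = Broadway.test_message(My.Pipeline, "abc")
    assert_receive {:ack, ^ref, [%Broadway.Message{data: "cba"}], []}, 1_000
  end
end
```

This avoids starting a second consumer, but every test shares whatever configuration the pipeline was started with.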

fireproofsocks commented 4 years ago

Yes @whatyouhide: I want to test things. Pulling the lever on whatever configurations are pegged to the test environment is imprecise at best. Since that process has already started, the cat is out of the bag: I can't override settings to test specific scenarios.

I'm kinda baffled here... am I going about this the wrong way? Am I pushing the river? I honestly do not understand the justification for NOT making this easier to do.