narrowtux closed this 2 years ago
Between the mapper and the enumerable there is another GenStage connection (a producer-consumer), and I am not sure there is a way to customize its demand right now. Maybe we should add a Flow.to_stream so you can customize that step, but I also think you should move everything inside Flow instead, even the repo operations, and call Flow.run. If you want to send a PR for to_stream, it is welcome.
When I write this code:
I expect it to print
and so on.
But when I run it, it prints this:
I think this means that the first call to Flow.map makes the consumer immediately output everything it has to offer, and doesn't properly back-pressure the demand that's slowed down by Enum.each.
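For comparison, the interleaving I would expect is what a plain lazy Stream gives you. This is only a minimal sketch using the standard library, not the snippet from this report; LazyDemo and trace are hypothetical names, and an Agent is used just to record the order of events.

```elixir
defmodule LazyDemo do
  # Records the order in which the map step and the consuming step run.
  def trace(range) do
    {:ok, log} = Agent.start_link(fn -> [] end)
    record = fn event -> Agent.update(log, &[event | &1]) end

    range
    |> Stream.map(fn i ->
      # Lazy step: runs only when the consumer pulls the next element.
      record.({:map, i})
      i
    end)
    |> Enum.each(fn i ->
      # Consumer: it drives the pipeline one element at a time.
      record.({:each, i})
    end)

    events = log |> Agent.get(& &1) |> Enum.reverse()
    Agent.stop(log)
    events
  end
end

IO.inspect(LazyDemo.trace(1..3))
```

With a lazy Stream, each element goes through the map step and then the consumer before the next element is pulled, so the two kinds of events alternate.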
In production, we are running code similar to this, except that 1..20 is a database query as a GenStage producer, and Enum.each is |> Stream.chunk_every(250) |> Enum.each(&Repo.insert_all(Schema, &1)). What we're seeing is that the query stage immediately loads all entries for the query as fast as it can, and then the node crashes because memory is exhausted.
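The suggestion to move everything inside Flow and call Flow.run could look roughly like the sketch below. It makes several assumptions: the source is an enumerable rather than a GenStage producer, the flow package is installed (the version constraint is a guess), and BoundedImport and insert_batch are hypothetical stand-ins for the real Repo.insert_all(Schema, batch) call. Setting max_demand: 1 on a single stage is meant to keep roughly one batch in flight at a time; whether that is enough for the production query stage is untested.

```elixir
# Assumption: the :flow package is available; the version constraint is a guess.
Mix.install([{:flow, "~> 1.0"}])

defmodule BoundedImport do
  # Hypothetical stand-in for Repo.insert_all(Schema, batch).
  def insert_batch(batch) do
    Process.sleep(10)
    length(batch)
  end

  # Chunks the source lazily, then lets a single Flow stage pull one
  # batch at a time, so the slow insert limits how fast the source is read.
  def run(source, batch_size \\ 250) do
    source
    |> Stream.chunk_every(batch_size)
    |> Flow.from_enumerable(stages: 1, max_demand: 1)
    |> Flow.map(&insert_batch/1)
    |> Flow.run()
  end
end

BoundedImport.run(1..1_000)
```

Here the inserts run inside the Flow stage itself, so there is no extra hop back into an Enum.each outside the flow, which is the step whose demand could not be customized.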