dashbitco / broadway

Concurrent and multi-stage data ingestion and data processing with Elixir
https://elixir-broadway.org
Apache License 2.0
2.44k stars 161 forks source link

GenStage 1.2.0 breaks tests #340

Closed whatyouhide closed 5 months ago

whatyouhide commented 5 months ago

I'm opening this to dump a bit of the investigation I've made so far. I bisected GenStage versions and it looks like 1.2.0 break tests. The diff is here and I’m betting this has something to do with the changes in when to emit events based on demand. I'll keep looking into this.

whatyouhide commented 5 months ago

Yeah git bisect told me culprit is https://github.com/elixir-lang/gen_stage/commit/d9cfe7ce1437fd9943ba5e29c17c07986f8ee4f4, so it's somewhere in there 🙃

whatyouhide commented 5 months ago

@josevalim interesting, this seems to be a not-so-easy-to-fix change in GenStage. With 1.2.0, when you set the demand to :accumulate, now also events are accumulated—not just demand. But as far as I understand, Broadway's draining wants to put the producer in :accumulate demand before calling prepare_for_draining so that the producer can cancel stuff in prepare_for_draining and not re-enter any demand in case it comes from the consumers.

However, this seems to change the fundamental behavior of prepare_for_draining. The ideal scenario is that prepare_for_draining is going to "flush" all events but not flush the demand, but I don't think we have a way to do that now right?

whatyouhide commented 5 months ago

This was closed by @josevalim in https://github.com/dashbitco/broadway/commit/83a564bc7d61968cc8e7f4c1546da770bbb6891e 🙃