apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Bug]: Performance degradation in Dataflow job when using `grpcio==1.45.0` #22159

Status: Open. ryantam626 opened 2 years ago

ryantam626 commented 2 years ago

What happened?

Runner: Dataflow runner

SDK: Python

Version: 2.38.0

I recently switched to poetry for Python dependency management (so implicit dependencies were inadvertently upgraded) and noticed a significant performance degradation with the new setup. After a lot of binary chopping, I have concluded that upgrading from grpcio==1.44.0 to grpcio==1.45.0 probably caused the degradation.

I don't have the capacity to provide a reproducible example or to debug further, apologies. Hopefully this is enough.

Here are some interesting screenshots:

Dataflow job CPU util pattern with grpcio==1.44.0 (screenshot: Selection_502)

Dataflow job CPU util pattern with grpcio==1.45.0 (screenshot: Selection_500)

Notice how the CPU utilisation is never capped at 100% in the second screenshot. Both jobs process the exact same input data with the exact same code; the only difference is that grpcio and grpcio-status were upgraded.

Issue Priority

Priority: 2

Issue Component

Component: runner-dataflow

Abacn commented 2 years ago

grpcio 1.45 is marked as "yanked" and should not be used. Could you please check whether the issue persists with grpcio 1.46.0? If so, it might be related to #22283.

ryantam626 commented 2 years ago

I had also tried 1.46.3 before (this was the version poetry pulled when I had not pinned grpcio), and I observed the same degraded performance.

Abacn commented 2 years ago

@aaltay Regarding the question raised in #22283: the performance regression is tracked here.

kennknowles commented 1 year ago

Is this resolved now, or is it still an issue? Have we determined that this is caused by the gRPC version, and have we discussed it with upstream?

ryantam626 commented 1 year ago

I am still using grpcio==1.44.0 without any performance degradation. I still don't have the bandwidth to experiment with a newer grpcio package, but I will report back once I have time to do so.
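For anyone relying on the same pin as a workaround, here is a minimal sketch of a startup check that flags a grpcio version newer than the one reported as unaffected in this thread. The helper names (`parse_version`, `is_newer_than_known_good`, `check_installed_grpcio`) are hypothetical, and treating 1.44.0 as the last known-good version is an assumption based only on the reports above, not a confirmed root cause.

```python
import re
from importlib import metadata

# Assumption taken from this thread: 1.44.0 is the last version reported as unaffected.
LAST_KNOWN_GOOD = (1, 44, 0)


def parse_version(text):
    """Return the leading numeric release components, e.g. '1.46.3' -> (1, 46, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_newer_than_known_good(version_text, known_good=LAST_KNOWN_GOOD):
    """True if the given version is newer than the known-good release."""
    return parse_version(version_text) > known_good


def check_installed_grpcio():
    """Return a warning string, an empty string if fine, or None if grpcio is absent."""
    try:
        installed = metadata.version("grpcio")
    except metadata.PackageNotFoundError:
        return None
    if is_newer_than_known_good(installed):
        return (
            f"grpcio {installed} is newer than 1.44.0; "
            "a slowdown was reported with 1.45.0 and 1.46.3"
        )
    return ""


if __name__ == "__main__":
    message = check_installed_grpcio()
    if message:
        print("WARNING:", message)
```

Running this in the environment that launches the job only reports what is installed there; the worker environment may resolve dependencies differently, so it is a hint rather than a guarantee.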