Open ryantam626 opened 2 years ago
grpcio 1.45 is marked as "yanked" and should not be used. Could you please check whether the issue persists with grpcio 1.46.0? If so, it might be related to #22283.
I have also tried 1.46.3 before (this was the version poetry pulled without me pinning the package version of grpcio), and the same degraded performance was observed.
@aaltay Regarding the question raised in #22283: the performance regression is tracked here.
Is this resolved now or still an issue? Have we determined that this is caused by the gRPC version, and have we discussed it with upstream?
I am still using grpcio==1.44.0 at the moment, without any performance degradation.
I currently don't have the bandwidth to experiment with a newer grpcio package.
I will report back once I have time to experiment with this.
What happened?
Runner: Dataflow runner
SDK: Python
Version: 2.38.0
I recently swapped to using poetry for Python dependency management (and thus implicit deps have been inadvertently upgraded), and noticed a significant performance degradation with this new setup. After a lot of binary chopping, I have come to the conclusion that upgrading from grpcio==1.44.0 to grpcio==1.45.0 probably caused the degradation.
I don't have the capacity to provide a reproducible example or to debug further, apologies; hopefully this is enough.
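For anyone trying to confirm which versions a worker environment actually resolved, here is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `is_newer_than_known_good`, `installed_version`) are hypothetical and made up for illustration, and the "known good" value is simply the 1.44.0 version mentioned in this report, not an official recommendation.

```python
import itertools
from importlib import metadata

# Assumption: 1.44.0 is the last version observed to be fine in this report.
KNOWN_GOOD = (1, 44, 0)


def parse_version(text):
    """Turn a string like '1.46.3' into a tuple such as (1, 46, 3).

    Only the leading digits of the first three dot-separated parts are used,
    so suffixes like 'rc1' are ignored. This is a simplification, not a full
    PEP 440 parser.
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_newer_than_known_good(version_text, known_good=KNOWN_GOOD):
    """Return True if the given version is newer than the known-good one."""
    return parse_version(version_text) > known_good


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("grpcio", "grpcio-status"):
        found = installed_version(name)
        if found is None:
            print(f"{name}: not installed")
        elif is_newer_than_known_good(found):
            print(f"{name}: {found} (newer than the version reported as fine)")
        else:
            print(f"{name}: {found}")
```

Running this inside the same environment that the job uses should show whether the dependency resolver pulled in something newer than the pinned version.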
Here are some interesting screenshots:
Dataflow job CPU util pattern with grpcio==1.44.0
Dataflow job CPU util pattern with grpcio==1.45.0
Notice how the CPU utilisation is never capped at 100% in the second screenshot. Both jobs are working on the exact same set of input data and running the exact same code, except that the grpcio and grpcio-status versions were upgraded.
Issue Priority
Priority: 2
Issue Component
Component: runner-dataflow