cgarciae / pypeln

Concurrent data pipelines in Python >>>
https://cgarciae.github.io/pypeln
MIT License
1.55k stars 98 forks

Feature Request: Performance Tuning Output #51

Open gtadamson opened 4 years ago

gtadamson commented 4 years ago

I've been wondering whether you have a way to help evaluate which stages are the bottlenecks. For example, a performance-tuning mode could report when the queue of a stage fills up. That might mean more workers need to be allocated to that stage (or fewer if there is too much context switching), or that the container needs more CPU, more RAM, etc. Currently it takes manual intervention to work out where the bottlenecks are.
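To make the queue-based idea concrete, here is a minimal sketch that uses only the standard library, not pypeln's internals. It runs one thread per stage, connects the stages with bounded queues, and samples how full each stage's input queue is. The names `run_with_queue_stats`, `find_bottleneck`, and `slow_resize` are hypothetical and chosen for illustration. Because a full queue pushes back on every stage upstream of it, the sketch assumes the bottleneck is the most downstream stage whose input queue is mostly full. There is no error handling, so a stage function that raises would stall the pipeline.

```python
import queue
import threading
import time

_DONE = object()  # sentinel marking the end of the stream


def run_with_queue_stats(source, stages, maxsize=8):
    """Run `stages`, a list of (name, fn) pairs, with one thread per stage.

    Returns (results, stats), where stats maps each stage name to the mean
    fill ratio (0.0 to 1.0) of its input queue.
    """
    queues = [queue.Queue(maxsize=maxsize) for _ in range(len(stages) + 1)]
    fill = {name: [] for name, _ in stages}

    def worker(name, fn, in_q, out_q):
        while True:
            # Sample the input queue just before taking the next item.
            fill[name].append(in_q.qsize() / maxsize)
            item = in_q.get()
            if item is _DONE:
                out_q.put(_DONE)  # pass the sentinel downstream
                return
            out_q.put(fn(item))  # blocks while the next stage is saturated

    def feed():
        for item in source:
            queues[0].put(item)
        queues[0].put(_DONE)

    threads = [threading.Thread(target=feed)]
    for i, (name, fn) in enumerate(stages):
        threads.append(
            threading.Thread(target=worker, args=(name, fn, queues[i], queues[i + 1]))
        )
    for t in threads:
        t.start()

    # Drain the last queue in the calling thread.
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()

    stats = {name: sum(samples) / len(samples) for name, samples in fill.items()}
    return results, stats


def find_bottleneck(stats, threshold=0.5):
    """Return the most downstream stage whose input queue is mostly full."""
    for name, mean_fill in reversed(list(stats.items())):
        if mean_fill >= threshold:
            return name
    return None


def slow_resize(x):
    time.sleep(0.005)  # stand-in for an expensive step
    return x


if __name__ == "__main__":
    demo_stages = [
        ("parse", lambda x: x + 1),
        ("resize", slow_resize),
        ("save", lambda x: x * 2),
    ]
    _, demo_stats = run_with_queue_stats(range(60), demo_stages)
    print(demo_stats, "bottleneck:", find_bottleneck(demo_stats))
```

In this sketch the stages upstream of the slow one also show full input queues because of backpressure, which is why a single "queue is full" signal is not enough by itself to identify the stage that needs more workers.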

cgarciae commented 4 years ago

Hey @gtadamson! This sounds like a great idea. Since version 4.x there is an internal Supervisor class whose sole purpose is to monitor workers, which makes this feature more plausible.