Open quasiben opened 3 years ago
It's worth noting that `transition_released_forgotten` has dropped out of the call graph. That transition is still around; it just now takes less than 0.5% of the time, so it has dropped out of the picture. As all the other transitions take 1% or more, it should be safe to ignore `transition_released_forgotten`.
To add to that observation, there are only 3 transitions that still take 1% or more of the time:
- `transition_processing_memory`
- `transition_waiting_processing`
- `transition_memory_released`
The last 2 spend a good chunk of time performing communication. So once communication is moved out of those transitions ( https://github.com/dask/distributed/pull/4343 ), I would expect the amount of time we spend in them to go down.
Thank you @jakirkham for including these updates. It makes digesting these plots significantly easier!
+1 on the value of adding context
In case anyone is curious about how moving communication out of transitions affects the call graph, here's a call graph generated last night ( https://github.com/dask/distributed/pull/4343#issuecomment-742966192 ) with that change. Note that `transition_memory_released` drops out with this change (it spends the bulk of its time communicating). So we are left with `transition_processing_memory` and `transition_waiting_processing`.
When you have a chance, I would encourage you to look at the benchmark in issue ( https://github.com/quasiben/dask-scheduler-performance/issues/25 ), specifically the second profile with a call graph (the first one had performance reports on), and compare it to the most recent one ( https://github.com/quasiben/dask-scheduler-performance/issues/52 ), with particular focus on transitions. Some high-level observations:
- `transition` was taking 15% of the time, now down to 12.5%
- `transition_processing_memory` took 4%, now about 3.36% (closer to 3% locally)
- `transition_waiting_processing` took 3.63%, now under 3%
- `transition_memory_forgotten` has dropped from the profile
- `report` has dropped from the profile
- `_propagate_forgotten` has dropped from the profile
- `get_comm_cost` is no longer a significant blocker for transitions (still shows up in `reevaluate_occupancy`)
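To put the before/after shares above on a common footing, here is a minimal sketch that computes the relative reduction in profile share for the two entries that have exact numbers. The helper name `relative_reduction` is hypothetical, and the figures are copied from the observations above. Note these are shares of total profiled time, so a smaller share does not by itself say how much wall-clock time changed.

```python
def relative_reduction(before_pct, after_pct):
    """Fractional drop in profile share (hypothetical helper, not part of dask)."""
    return (before_pct - after_pct) / before_pct


# (before, after) shares of total profile time, taken from the list above.
shares = {
    "transition": (15.0, 12.5),
    "transition_processing_memory": (4.0, 3.36),
}

for name, (before, after) in shares.items():
    drop = relative_reduction(before, after)
    print(f"{name}: {before}% -> {after}% ({drop:.1%} smaller share)")
```

Both entries shrink by roughly one sixth of their previous share.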
Historical Throughput
<img width="641" alt="Benchmark Image" src="https://raw.githubusercontent.com/quasiben/dask-scheduler-performance/benchmark-images/assets/dgx-20201210-benchmark-history.png">
Raw Data
<Client: 'tcp://127.0.0.1:44589' processes=10 threads=10, memory=540.94 GB>
Distributed Version: 2.31.0.dev0+93.g997157aa
simple 5.568e-01 +/- 3.972e-02
shuffle 2.192e+01 +/- 8.184e-01
rand_access 9.330e-03 +/- 2.706e-03
anom_mean 9.977e+01 +/- 1.126e+00
Raw Values
simple [0.58457732 0.53798223 0.53934216 0.56860733 0.53080273 0.62986851 0.54856706 0.58331561 0.4721806 0.57300425]
shuffle [20.88524246 21.48666 21.38820171 20.97553229 21.89661932 21.72500539 22.17030573 23.77210212 22.1609633 22.74121404]
rand_access [0.00762415 0.01075149 0.01181793 0.01014018 0.01083684 0.01109505 0.0042882 0.00463843 0.01221418 0.00989771]
anom_mean [ 99.24389291 98.97223043 97.85002375 99.50107884 98.36659908 100.993783 100.84478521 99.75757003 101.18812919 100.94548392]
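For anyone wanting to check how the summary line relates to the raw values, the sketch below recomputes the `simple` entry from its ten timings. It assumes the `+/-` figure is a population standard deviation (the default of NumPy's `std`); that is an inference from the numbers matching, not something the benchmark output states. Only the Python standard library is used.

```python
from statistics import fmean, pstdev

# Raw timings for the "simple" benchmark, copied from Raw Values above.
simple = [
    0.58457732, 0.53798223, 0.53934216, 0.56860733, 0.53080273,
    0.62986851, 0.54856706, 0.58331561, 0.4721806, 0.57300425,
]

mean = fmean(simple)
# Assumption: the reported spread is the population standard deviation.
spread = pstdev(simple)

print(f"simple {mean:.3e} +/- {spread:.3e}")
```

The same two calls can be applied to the `shuffle`, `rand_access`, and `anom_mean` rows.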
Dask Profiles
Scheduler Execution Graph
<img width="641" alt="Sched Graph Image" src="https://raw.githubusercontent.com/quasiben/dask-scheduler-performance/benchmark-images/assets/20201210-sched-graph.png">