quasiben / dask-scheduler-performance

BSD 3-Clause "New" or "Revised" License

DGX Nightly Benchmark run 20201210 #51

Open quasiben opened 3 years ago

quasiben commented 3 years ago

Historical Throughput

<img width="641" alt="Benchmark Image" src="https://raw.githubusercontent.com/quasiben/dask-scheduler-performance/benchmark-images/assets/dgx-20201210-benchmark-history.png">

Raw Data

<Client: 'tcp://127.0.0.1:44589' processes=10 threads=10, memory=540.94 GB>
Distributed Version: 2.31.0.dev0+93.g997157aa
simple 5.568e-01 +/- 3.972e-02
shuffle 2.192e+01 +/- 8.184e-01
rand_access 9.330e-03 +/- 2.706e-03
anom_mean 9.977e+01 +/- 1.126e+00

Raw Values

simple [0.58457732 0.53798223 0.53934216 0.56860733 0.53080273 0.62986851 0.54856706 0.58331561 0.4721806 0.57300425]
shuffle [20.88524246 21.48666 21.38820171 20.97553229 21.89661932 21.72500539 22.17030573 23.77210212 22.1609633 22.74121404]
rand_access [0.00762415 0.01075149 0.01181793 0.01014018 0.01083684 0.01109505 0.0042882 0.00463843 0.01221418 0.00989771]
anom_mean [ 99.24389291 98.97223043 97.85002375 99.50107884 98.36659908 100.993783 100.84478521 99.75757003 101.18812919 100.94548392]
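For readers who want to check the summary against the raw values: the "+/-" figures appear to be the mean and the population standard deviation (the NumPy default, `ddof=0`) of the ten runs. That is an inference from the numbers, not something the benchmark script is quoted as doing here. The sketch below reproduces the `simple` and `rand_access` rows using only the standard library; the helper name `summarize` is made up for illustration.

```python
from statistics import fmean, pstdev

# Raw values copied from the "Raw Values" section of this report.
simple = [0.58457732, 0.53798223, 0.53934216, 0.56860733, 0.53080273,
          0.62986851, 0.54856706, 0.58331561, 0.4721806, 0.57300425]
rand_access = [0.00762415, 0.01075149, 0.01181793, 0.01014018, 0.01083684,
               0.01109505, 0.0042882, 0.00463843, 0.01221418, 0.00989771]


def summarize(values):
    """Return (mean, population standard deviation) of a list of timings.

    Assumption: the report uses the population form (divide by N),
    which is what numpy.std does by default.
    """
    return fmean(values), pstdev(values)


for name, values in [("simple", simple), ("rand_access", rand_access)]:
    mean, std = summarize(values)
    # Same layout as the "Raw Data" block: name, mean, +/-, spread.
    print(f"{name} {mean:.3e} +/- {std:.3e}")
```

Using the sample standard deviation (`statistics.stdev`) instead would give a slightly larger spread for `simple` than the value in the report, which is why the population form is assumed.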

Dask Profiles

Scheduler Execution Graph

<img width="641" alt="Sched Graph Image" src="https://raw.githubusercontent.com/quasiben/dask-scheduler-performance/benchmark-images/assets/20201210-sched-graph.png">

jakirkham commented 3 years ago

It's worth noting that transition_released_forgotten has dropped out of the call graph. That transition is still around. It just now takes less than 0.5% of the time so has dropped out of the picture. As all the other transitions take 1% or more, it should be safe to ignore transition_released_forgotten.

jakirkham commented 3 years ago

To add to that observation, there are only 3 transitions that still take 1% or more of the time:

The last 2 spend a good chunk of time performing communication. So once communication is moved out of those transitions ( https://github.com/dask/distributed/pull/4343 ), would expect that cuts down the amount of time we spend in them.

quasiben commented 3 years ago

Thank you @jakirkham for including these updates. It makes digesting these plots significantly easier!

mrocklin commented 3 years ago

+1 on the value of adding context

On Fri, Dec 11, 2020 at 7:37 AM Benjamin Zaitlen notifications@github.com wrote:

Thank you @jakirkham for including these updates. It makes digesting these plots significantly easier!


jakirkham commented 3 years ago

The last 2 spend a good chunk of time performing communication. So once communication is moved out of those transitions ( https://github.com/dask/distributed/pull/4343 ), would expect that cuts down the amount of time we spend in them.

In case anyone is curious about how moving communication out of transitions affects the call graph, here's a call graph generated last night ( https://github.com/dask/distributed/pull/4343#issuecomment-742966192 ) with that change. Note that transition_memory_released drops out with this change (it spends the bulk of its time communicating). So we are left with transition_processing_memory and transition_waiting_processing.

jakirkham commented 3 years ago

When you have a chance, I would encourage you to look at the benchmark in issue ( https://github.com/quasiben/dask-scheduler-performance/issues/25 ), specifically the second profile with a call graph (the first one had performance reports on), and compare it to the most recent one ( https://github.com/quasiben/dask-scheduler-performance/issues/52 ), with particular focus on transitions. Some high-level observations: