dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Speed up ``Client.map`` by computing ``token`` only once for ``func`` and ``kwargs`` #8855

Closed: fjetter closed this 2 months ago

fjetter commented 2 months ago

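Tokenization is unfortunately expensive at times, and the cost adds up when we iterate over a large collection. In ``Client.map``, ``func`` and ``kwargs`` are the same for every element, so their token only needs to be computed once instead of once per element.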
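A minimal sketch of the idea, not the code in this PR: `toy_token`, `keys_naive`, and `keys_hoisted` are hypothetical names, and `toy_token` is a hashlib-based stand-in rather than Dask's `tokenize`. It only illustrates hoisting the loop-invariant hashing of the function and keyword arguments out of the per-element loop.

```python
import hashlib


def toy_token(*parts):
    """Hypothetical stand-in for a tokenizer: hash the repr of the inputs."""
    return hashlib.sha1(repr(parts).encode()).hexdigest()


def keys_naive(func_name, kwargs, items):
    # Hashes func_name and kwargs again for every element of the collection.
    return [
        f"{func_name}-{toy_token(func_name, sorted(kwargs.items()), item)}"
        for item in items
    ]


def keys_hoisted(func_name, kwargs, items):
    # Hash the loop-invariant parts (function and kwargs) exactly once...
    base = toy_token(func_name, sorted(kwargs.items()))
    # ...then combine that short token with each element.
    return [f"{func_name}-{toy_token(base, item)}" for item in items]


keys = keys_hoisted("inc", {"offset": 1}, range(3))
```

Both variants produce one distinct, deterministic key per element. The hoisted variant does less work per element whenever hashing the function or the keyword arguments is the costly part.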

Note: the remaining per-element ``tokenize(tok, args)`` call is also expensive, mostly because of the context manager handling inside ``tokenize``, so calling it many times adds up. We should think about a fast path for this.
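One possible shape for such a fast path, sketched with toy code only: `toy_context`, `tokenize_one`, and `tokenize_many` are hypothetical names and do not reflect how Dask's `tokenize` is implemented. The assumption is that the per-call setup can be entered once for a whole batch instead of once per element.

```python
import hashlib
from contextlib import contextmanager

# Counts how often the toy context manager is entered.
stats = {"enters": 0}


@contextmanager
def toy_context():
    """Hypothetical stand-in for per-call setup and teardown in a tokenizer."""
    stats["enters"] += 1
    yield


def _hash_parts(*parts):
    return hashlib.sha1(repr(parts).encode()).hexdigest()


def tokenize_one(*parts):
    # Slow path: pays the context manager cost on every call.
    with toy_context():
        return _hash_parts(*parts)


def tokenize_many(base, items):
    # Fast path: enter the context once, then hash every element inside it.
    with toy_context():
        return [_hash_parts(base, item) for item in items]


tokens = tokenize_many("base-token", range(1000))
```

Here `tokenize_many` returns the same tokens as calling `tokenize_one` in a loop, but enters the context once rather than 1000 times.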

github-actions[bot] commented 2 months ago

Unit Test Results

_See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests._

    25 files ±0, 25 suites ±0, duration 10h 14m 39s (+5m 12s)
    4 122 tests ±0: 4 007 passed (+1), 111 skipped (±0), 4 failed (-1)
    47 615 runs ±0: 45 503 passed (+2), 2 108 skipped (±0), 4 failed (-2)

For more details on these failures, see this check.

Results for commit a86d7397. ± Comparison against base commit 2e61816b.