Closed (gsakkis closed this issue 1 week ago)
Thanks for the issue @gsakkis. From the scatter API docs, it looks like you need to disable the Active Memory Manager (AMM) (https://distributed.dask.org/en/latest/active_memory_manager.html#enabling-the-active-memory-manager), specifically the reduce replicas policy, to use broadcast=True. Here's a toy example:
In [1]: from distributed import Client
In [2]: c = Client(processes=True)
In [3]: c.amm.stop()
In [4]: f = c.scatter(123, broadcast=True)
In [5]: f
Out[5]: <Future: finished, type: int, key: int-4951b9977632a52fcd6f0cc65c57bb33>
In [6]: c.who_has()
Out[6]:
{'int-4951b9977632a52fcd6f0cc65c57bb33': ('tcp://127.0.0.1:55535',
  'tcp://127.0.0.1:55544',
  'tcp://127.0.0.1:55539',
  'tcp://127.0.0.1:55538')}
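If it helps, the who_has output above can also be checked programmatically instead of by eye. The snippet below is only a rough sketch: missing_replicas is a hypothetical helper (not part of distributed), and it assumes you pass in the mapping returned by c.who_has() together with the list of worker addresses you expect to hold a copy.

```python
def missing_replicas(who_has, workers):
    """Return {key: sorted workers lacking a copy} for keys not on every worker.

    Hypothetical helper, not part of distributed.
    who_has: mapping of key -> iterable of worker addresses, e.g. c.who_has().
    workers: iterable of the worker addresses expected to hold each key.
    """
    expected = set(workers)
    missing = {}
    for key, holders in who_has.items():
        absent = expected - set(holders)
        if absent:
            missing[key] = sorted(absent)
    return missing


# Example with made-up addresses: "x" was sent to only one of two workers.
example = {
    "int-abc": ("tcp://127.0.0.1:1", "tcp://127.0.0.1:2"),
    "x": ("tcp://127.0.0.1:1",),
}
print(missing_replicas(example, ["tcp://127.0.0.1:1", "tcp://127.0.0.1:2"]))
```

An empty result means every key is present on every listed worker. With a live client, the worker addresses could probably be taken from something like c.scheduler_info()["workers"], but that is an assumption on my part and may not list every worker on all versions.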
Hi @jrbourbeau, I missed the note about the AMM. Looks like it works as documented, thanks!
Describe the issue:
Running client.scatter with broadcast=True does not broadcast the data to all workers; it sends the data to just one worker.
Minimal Complete Verifiable Example:
Environment: