dask / dask-cloudprovider

Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure and more...
https://cloudprovider.dask.org
BSD 3-Clause "New" or "Revised" License

Add ability to specify maxSwap for ECS clusters #59

Open gvelchuru opened 4 years ago

gvelchuru commented 4 years ago

It would be nice to be able to specify swap memory for ECS tasks, especially when working with very large DataFrames.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definitions (search for maxSwap)
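For context, the linked page describes maxSwap (in MiB) and swappiness (0 to 100) as fields under linuxParameters in an ECS container definition. Here is a minimal sketch of what such a definition could look like as a plain Python dict. The helper name, container name, image, and sizes are illustrative assumptions, not part of dask-cloudprovider. AWS also documents that maxSwap is not supported for tasks running on Fargate, so this would only apply to EC2-backed clusters.

```python
def container_definition_with_swap(name, image, memory_mib, max_swap_mib, swappiness=60):
    """Build an ECS container definition dict that includes swap settings.

    Hypothetical helper for illustration only; dask-cloudprovider does not
    expose this function.
    """
    if max_swap_mib < 0:
        raise ValueError("maxSwap must be 0 or a positive number of MiB")
    if not 0 <= swappiness <= 100:
        raise ValueError("swappiness must be between 0 and 100")
    return {
        "name": name,
        "image": image,
        "memory": memory_mib,  # hard memory limit in MiB
        "linuxParameters": {
            "maxSwap": max_swap_mib,   # total swap the container may use, in MiB
            "swappiness": swappiness,  # only honoured when maxSwap is set
        },
    }


# Illustrative values: a 16 GiB worker container allowed 8 GiB of swap.
definition = container_definition_with_swap(
    name="dask-worker",
    image="daskdev/dask:latest",
    memory_mib=16384,
    max_swap_mib=8192,
)
```

A dict like this would go in the containerDefinitions list when registering a task definition.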

jacobtomlinson commented 4 years ago

I would be happy to explore this, but it is worth noting that Dask manages its own memory swapping and doesn't use the built-in system swap. Data held by futures is moved between memory and the worker's temporary space on disk as needed.
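To make the spilling behaviour more concrete, here is a rough sketch of the worker memory thresholds that control it. The configuration keys are the ones used by dask.distributed, and the fractions shown are what I understand the defaults to be, so check them against your installed version. The helper function and the 16 GiB limit are assumptions for illustration.

```python
# Fractions of the worker memory limit at which each action is taken.
WORKER_MEMORY_CONFIG = {
    "distributed.worker.memory.target": 0.60,     # start spilling managed data to disk
    "distributed.worker.memory.spill": 0.70,      # spill based on process memory
    "distributed.worker.memory.pause": 0.80,      # stop accepting new tasks
    "distributed.worker.memory.terminate": 0.95,  # nanny restarts the worker
}


def thresholds_in_bytes(memory_limit_bytes, config=WORKER_MEMORY_CONFIG):
    """Convert fractional thresholds to byte counts (hypothetical helper)."""
    return {
        key.rsplit(".", 1)[-1]: int(memory_limit_bytes * fraction)
        for key, fraction in config.items()
    }


# Assumed example: a worker with a 16 GiB memory limit.
limits = thresholds_in_bytes(16 * 2**30)
for action, limit in limits.items():
    print(f"{action:>9}: {limit / 2**30:.1f} GiB")
```

To apply different thresholds, the same dict can be passed to dask.config.set(WORKER_MEMORY_CONFIG) before the cluster is created, or set through the equivalent environment variables on the workers.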

If you are having trouble with workers running out of memory, this may be related to your DataFrame partitioning.
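As a rough illustration of the partitioning point, here is a small sketch that estimates a partition count from the total size of the data. The 100 MiB target is only a common rule of thumb, not a hard requirement, and the helper name and the 50 GiB example size are assumptions.

```python
import math


def suggested_npartitions(total_bytes, target_partition_bytes=100 * 2**20):
    """Estimate how many partitions keep each one near the target size.

    Hypothetical helper; the 100 MiB default is an assumed rule of thumb.
    """
    if total_bytes < 0 or target_partition_bytes <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(total_bytes / target_partition_bytes))


# Assumed example: a DataFrame that occupies about 50 GiB in memory.
n = suggested_npartitions(50 * 2**30)
print(n)
```

With a Dask DataFrame called ddf, the result could then be applied with ddf = ddf.repartition(npartitions=n). Alternatively, ddf.repartition(partition_size="100MB") lets Dask work out the count itself, at the cost of computing the partition sizes first.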