Open rpanai opened 3 years ago
I am experiencing a very similar issue while using a Fargate ECS cluster in a context manager.
if __name__ == '__main__':
    with ECSCluster(fargate_scheduler=True, fargate_workers=True, n_workers=20, image=sys.argv[1],
                    task_role_policies=['arn:aws:iam::aws:policy/AmazonS3FullAccess']) as cluster:
        print(cluster.dashboard_link)
        with Client(cluster) as client:
            main(client)
Traceback (most recent call last):
  File "/Users/colt/.pyenv/versions/3.9.6/lib/python3.9/weakref.py", line 656, in _exitfunc
    f()
  File "/Users/colt/.pyenv/versions/3.9.6/lib/python3.9/weakref.py", line 580, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/Users/colt/Library/Caches/pypoetry/virtualenvs/dummy-sOqOco3g-py3.9/lib/python3.9/site-packages/distributed/deploy/cluster.py", line 214, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/Users/colt/Library/Caches/pypoetry/virtualenvs/dummy-sOqOco3g-py3.9/lib/python3.9/site-packages/distributed/utils.py", line 286, in sync
    raise RuntimeError("IOLoop is closed")
RuntimeError: IOLoop is closed
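For readers following the traceback: it starts in weakref's exit hook (`_exitfunc`) and ends in a `sync` call that finds the loop already closed. The ordering problem can be reproduced in miniature with only the standard library. This is a rough sketch, not distributed's implementation: `ToyCluster`, `_shutdown`, and this simplified `sync` are hypothetical stand-ins, and a plain asyncio loop is assumed in place of the real IOLoop.

```python
import asyncio
import weakref


async def _shutdown():
    """Placeholder for the asynchronous cleanup a real cluster would perform."""
    return "shut down"


def sync(loop, coro_func):
    """Run coro_func to completion on loop; refuse if the loop is already closed."""
    if loop.is_closed():
        raise RuntimeError("IOLoop is closed")
    return loop.run_until_complete(coro_func())


class ToyCluster:
    """Hypothetical stand-in; not the real ECSCluster or distributed Cluster."""

    def __init__(self, loop):
        self.loop = loop
        # The finalizer references the loop, not self, so it does not keep the
        # object alive. It fires on garbage collection or at interpreter exit.
        self._finalizer = weakref.finalize(self, sync, loop, _shutdown)


def demo_late_finalizer():
    """Close the loop first, then fire the finalizer, mimicking the exit order."""
    loop = asyncio.new_event_loop()
    cluster = ToyCluster(loop)
    loop.close()
    try:
        cluster._finalizer()  # the same call weakref's exit hook would make
    except RuntimeError as exc:
        return str(exc)
    return None


def demo_early_finalizer():
    """Fire the finalizer while the loop is still open, then close the loop."""
    loop = asyncio.new_event_loop()
    cluster = ToyCluster(loop)
    result = cluster._finalizer()
    loop.close()
    return result
```

In this toy, the error depends only on whether the cleanup callback runs before or after the loop is closed. Whether the same ordering explains the report above is an assumption.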
Does this happen after your work completes?
Yes, the work completes as expected. I'm also able to use the cluster and client in a notebook without issue until I call client.close() and cluster.close() (at which point I get the same error as the original issue outlines).
Ok thanks. Given that all these tracebacks seem to point to distributed, I'm going to move this issue over there.
Also see the related issue: https://github.com/dask/distributed/issues/4950
I am also having this issue. I have tested with 2021.12.0 and 2022.01.1 on Python 3.7, 3.8, and 3.9.
Worth noting: I got the same error through a wrapper (Prefect) using dask-cloudprovider (FargateCluster).
My code runs and the results complete; then, when the process exits, the "IOLoop is closed" RuntimeError is raised in the scheduler from utils.py.
https://github.com/PrefectHQ/prefect/issues/5330 has details of an issue that also leads to the IOLoop error, using Prefect and a Fargate cluster.
Sorry to tag, just wondering whether this potential issue in distributed is getting any attention? @jacobtomlinson
Any updates here? This problem appears only when running from scripts; it works fine from Jupyter/IPython.
When I close a Fargate cluster (similar to #220) using
I receive the following error