Closed m-fila closed 3 weeks ago
Could you try on master? I believe this was fixed in #532.
Thank you. I tried master: the error is gone, but the warnings are still there.
Yeah, I think the warnings will have to stay, unless we bring back Dagger.cleanup() for users to explicitly clean things up. They can be safely ignored though, so I'll close this.
If those warnings are happening during a clean Julia shutdown, then we need to improve our fault tolerance logic to properly detect a clean shutdown and thus not emit these warnings, since they're quite scary to see. @m-fila can you confirm that these occur during a Julia exit?
Yes, I confirm
Ok, then re-opening this issue since we need to properly silence these warnings.
@m-fila can you please validate that https://github.com/JuliaParallel/Dagger.jl/pull/537 makes the warnings go away for you? It works for me locally.
Yes, they are gone with #537. Thanks!
The warnings still appear, though, if the workers are removed with workers() |> rmprocs
Yeah, that's a separate issue, because in this case Dagger has no idea that it was intentional for the workers to exit (Distributed.jl doesn't communicate this distinction to Dagger). You would need to call Dagger.rmprocs!(Dagger.Sch.eager_context(), workers()) before calling rmprocs to allow Dagger time to properly clean up the workers.
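The ordering described above can be sketched as follows. This is a minimal illustration, not a verified fix: the worker count, the trivial task, and the variable names ws and result are assumptions made for the example, and whether the warnings are fully silenced may depend on the Dagger version.

```julia
using Distributed
addprocs(2)                 # assumed worker count, for illustration only
@everywhere using Dagger

# Run a trivial task through the eager API so the eager scheduler is active.
result = fetch(Dagger.@spawn 1 + 2)

# Capture the worker IDs once, so both calls below see the same list.
ws = workers()

# First tell Dagger's eager scheduler that these workers are going away...
Dagger.rmprocs!(Dagger.Sch.eager_context(), ws)

# ...and only then remove them from the Distributed.jl cluster.
rmprocs(ws)
```

Capturing workers() in a variable before either call avoids the list changing between the two steps.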
Adding extra processes and scheduling with the eager API seems to produce an error and warnings about rescheduling due to workers dying. For example, this snippet taken from the README:
Gives the following error:
The error is sometimes omitted, but the warnings about workers dying are still present. If the lazy API is used, there are no warnings or errors. The warnings seem to be harmless, since they appear only while the job is finishing.
versioninfo:
Dagger: 0.18.11
I couldn't find any duplicates.