Closed by jrbourbeau 1 year ago
The workaround https://github.com/dask/distributed/commit/d62693de90dc2161a7b4cb47e4b63f6815f8d852 that @quasiben added for 2023.3.2.1 has not been replicated in main, so for the next release I believe we need to either add the hack to main too or properly resolve https://github.com/dask/distributed/issues/7726.
Thanks for bringing this up, @crusaderky. @wence- has been diving quite deep into Python object handling at exit. My understanding is that he is finding the path to be extremely fragile (as many of us know). I'll let him chime in on https://github.com/dask/distributed/issues/7726 with current thoughts, but it is looking like this is a much longer-term solution.
For the short-/mid-term, we would really appreciate it if that fix could be added in Distributed; unfortunately, we do not have another way to mitigate this issue. We will continue to explore potential solutions, but for the time being we are blocked without it.
> I'll let him chime in on https://github.com/dask/distributed/issues/7726 with current thoughts, but it is looking like this is a much longer-term solution.
I do not think this is solvable before the next release in a way that would be robust. I cannot yet say whether I think it is possible to solve without making (possibly undesired) changes to the way resource management is done. In other words, I can't give a timeline yet.
I think Coiled would appreciate getting in https://github.com/dask/dask/pull/10159.
@galipremsagar mentioned there was a failure in dask-cudf that is due to a recent (unreleased) change in dask/dask. https://github.com/dask/dask/pull/10182 fixes it, and we should include it in this release.
> The workaround https://github.com/dask/distributed/commit/d62693de90dc2161a7b4cb47e4b63f6815f8d852 that @quasiben added for 2023.3.2.1 has not been replicated in main
Just following up here: my understanding is that RAPIDS pins tightly to dask and distributed. The distributed==2023.3.2.1 patch release helped their latest release that just went out (so things are good for the moment). We'll need a long-term solution, but that won't be necessary until the next RAPIDS release, which is ~2 months away. @quasiben does that sound right?
That's correct for the release. If, early next week, it looks like the long-term solution is going to take quite a while, then I'll submit a PR similar to the patch in 2023.3.2.1.
Okay, looks like all blockers and best-effort issues/PRs are in. I'll start pushing the release out in ~30 minutes.
@jrbourbeau Can you merge this PR after release and maybe put out a tweet to highlight it?
https://github.com/dask/dask-blog/pull/163
@fjetter, @rjzamora, and I have iterated on it a couple of times already, but if you have any nits or other feedback, please don't hesitate to push to the PR before merging.
Yeah, no problem 👍
Closing as 2023.4.0 is out. Thanks all
It looks like the changelog at https://docs.dask.org/en/stable/changelog.html didn't update for some reason?
Thanks @bnaul -- things are fixed now
Best effort
Try to close before the release but will not block the release
Blocker
Issues that would cause us to block and postpone the release if not fixed
Comments
cc @fjetter @rjzamora @crusaderky @quasiben @jakirkham for visibility
Note: I'm proposing we push back the release by a week, as most Dask folks at Coiled and several at NVIDIA will be out next Friday and the following Monday for holiday / PTO. Based on my current understanding, I don't think this should be a big deal.