dask / community

For general discussion and community planning. Discussion issues welcome.

Patch Release for 2023.3.2 #316

Closed quasiben closed 1 year ago

quasiben commented 1 year ago

I'd like to ask for help creating a patch release for 2023.3.2: 2023.3.2.1. RAPIDS recently discovered a critical bug relating to cleanup handling, xref: https://github.com/dask/distributed/issues/7726 . If possible, we'd like to get a release out patching 2023.3.2 which only includes a revert of https://github.com/dask/distributed/pull/7644 . We'd keep the code in main as-is and use https://github.com/dask/distributed/issues/7726 to discuss a better fix.

cc @jrbourbeau @charlesbluca @jakirkham

quasiben commented 1 year ago

I'm realizing again that Dask has not done patch releases, so maybe this becomes 2023.3.3 or 2023.4.1?

wence- commented 1 year ago

To be clear, we don't actually need to revert dask/distributed#7644. We're still trying to figure out exactly what is going on, but we just need a no-op weakref finalizer to be set up before the atexit handler for close_cluster. For example, this:

diff --git a/distributed/__init__.py b/distributed/__init__.py
index d8e20f36..eb8ada7d 100644
--- a/distributed/__init__.py
+++ b/distributed/__init__.py
@@ -7,7 +7,9 @@ from distributed import widgets  # load distributed widgets second
 # isort: on

 import atexit
+import weakref

+weakref.finalize(lambda: None, lambda: None)
 import dask
 from dask.config import config  # type: ignore
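For context on why this helps: the first call to weakref.finalize is what installs weakref's own atexit hook, and atexit callbacks run in LIFO order, so installing the finalizer machinery early means finalizers run after handlers registered later (such as the close_cluster handler). A minimal sketch of that mechanism, not the actual dask code:

```python
# Sketch demonstrating atexit ordering vs. weakref.finalize.
# Not the dask codebase; just the underlying CPython behavior.
import subprocess
import sys
import textwrap

script = textwrap.dedent("""
    import atexit
    import weakref

    class Anchor:
        pass

    anchor = Anchor()
    # The *first* weakref.finalize call registers weakref's internal
    # atexit hook; keeping `anchor` alive defers the callback to exit.
    weakref.finalize(anchor, print, "finalizer ran")

    # atexit callbacks run LIFO, so this later registration runs
    # *before* the finalizer machinery installed above.
    atexit.register(print, "atexit handler ran")
""")

result = subprocess.run(
    [sys.executable, "-c", script], capture_output=True, text=True
)
print(result.stdout)
```

Swapping the two registrations flips the shutdown order, which is why a no-op finalizer set up before distributed registers its atexit handler changes the teardown sequence.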
charlesbluca commented 1 year ago

I'm realizing again that Dask has not done patch releases so maybe this becomes 2023.3.3 or 2023.4.1?

From discussions around the versioning of the nightlies, I seem to recall our dask/distributed conda pinnings being set such that a patch release could be done without causing issues; for example, dask's pinnings look like:

dask-core >=2023.3.2,<2023.3.3.0a0
distributed >=2023.3.2,<2023.3.3.0a0

Which should allow for 2023.3.2.1 if need be. That being said, since we haven't yet done a subsequent release, there really isn't anything stopping us from just pushing this patch into another standard Dask release and releasing dask-core/dask as well (though the changes there would be minimal/none). Ultimately, I'd leave this decision to the core maintainers on what seems like the easiest/most sensible release method here.

EDIT:

May have spoken too soon - I notice that Dask's optional distributed dependency is pinned explicitly:

https://github.com/dask/dask/blob/f607cd61bf5c91abbca9d2d821cb1532ca67843e/pyproject.toml#L54

Which I think means that doing pip install dask[distributed]==2023.3.2 wouldn't pick up the latest patch version; if we wanted things to be consistent between conda and pip here, we would probably need to do another standard Dask release.

jrbourbeau commented 1 year ago

Apologies, I've been in meetings all morning. I'm also fairly busy with other things today, so don't have a ton of extra bandwidth. What input is needed from folks like me and @fjetter? Based on @fjetter's comment here https://github.com/dask/distributed/issues/7726#issuecomment-1491650153 it sounds like he's okay with temporary patches. FWIW I'm also okay with a patch release that reverts https://github.com/dask/distributed/pull/7644 or adds a no-op finalizer if that helps.

I'd appreciate an outline of what steps are needed, but overall I trust folks like @quasiben @jakirkham to handle issuing a patch release. Are there specific issues around publishing a patch release? Or is the question "Is a patch release okay?" -- if so, I think the answer is "yes, a patch release is okay"

charlesbluca commented 1 year ago

Are there specific issues around publishing a patch release?

Think my only question around that is if we would expect/want pip install dask[distributed]==2023.3.2 to implicitly pick up the Distributed patch version? If so, then that might not be possible currently and we might want to lean towards just doing a standard Dask release.

If this isn't a big deal (think this can be worked around with pip install dask==2023.3.2 distributed==2023.3.2.1), then the patch release would probably make more sense here

jrbourbeau commented 1 year ago

Think my only question around that is if we would expect/want pip install dask[distributed]==2023.3.2 to implicitly pick up the Distributed patch version?

Good question. I'm not sure how dask's pinning of distributed == 2023.3.2 works with pip and patches. I know something like pip install distributed==2023.3 will pick up the latest minor number (currently 2023.3.2). I'm not sure about going one level deeper for patch numbers though. Maybe @jakirkham has insight here.

Also, what's the timeline for this for the RAPIDS release? I (understandably) sense some time pressure, but I'm not actually sure about any specifics

martindurant commented 1 year ago

(you would want to make sure that the pinning does the right thing with conda also)

charlesbluca commented 1 year ago

I know something like pip install distributed==2023.3 will pick up the latest minor number (currently 2023.3.2)

Trying this out, I end up picking up distributed 2023.3.0; not sure if this varies by pip version:

$ pip --version
pip 23.0.1 from /datasets/charlesb/mambaforge/envs/pip-test/lib/python3.10/site-packages/pip (python 3.10)

$ pip install distributed==2023.3 --dry-run
...
Would install ... dask-2023.3.0 distributed-2023.3.0 ...

Following this behavior, I would assume pip install distributed==2023.3.2 would install distributed 2023.3.2[.0]

(you would want to make sure that the pinning does the right thing with conda also)

Think we should be good on the conda end of things, the dask-core/distributed pinnings I listed above allow for and prioritize patch releases of a given Dask release

jakirkham commented 1 year ago

From PEP 440, see the following example; it is not identical to our case, but close:

For example, given the version 1.1.post1, the following clauses would match or not as shown:

== 1.1        # Not equal, so 1.1.post1 does not match clause
== 1.1.post1  # Equal, so 1.1.post1 matches clause
== 1.1.*      # Same prefix, so 1.1.post1 matches clause

Thinking this means distributed == 2023.3.2 is not a match for 2023.3.2.1, but distributed == 2023.3.2.* would be.

Though maybe we can try a test to confirm (like distributed == 2023.3 and distributed == 2023.3.*)
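A quick way to check these matching rules locally (a sketch, assuming the third-party packaging library, which pip itself vendors, is available):

```python
from packaging.specifiers import SpecifierSet

# Exact equality zero-pads the release segment, so ==2023.3.2 is
# compared as 2023.3.2.0 and does NOT match a 2023.3.2.1 patch.
print(SpecifierSet("==2023.3.2").contains("2023.3.2.1"))    # False
print(SpecifierSet("==2023.3.2").contains("2023.3.2.0"))    # True

# Prefix matching with .* accepts any release under 2023.3.2.
print(SpecifierSet("==2023.3.2.*").contains("2023.3.2.1"))  # True
```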

charlesbluca commented 1 year ago

Though maybe we can try a test to confirm (like distributed == 2023.3 and distributed == 2023.3.*)

$ pip install distributed==2023.3.* --dry-run
...
Would install ... dask-2023.3.2 distributed-2023.3.2 ...
vyasr commented 1 year ago

With PEP 440 versioning, == is strict equality aside from zero padding. So ==1.1 will not match 1.1.post1, but it will match 1.1.0.0.... Specifically, see this clause:

The only substitution performed is the zero padding of the release segment to ensure the release segments are compared with the same length.

Adding the .* suffix will allow the matches that we want. If we need to ensure that we don't get a lower match, we could also use ~=2023.3.2, which is equivalent to >=2023.3.2,==2023.3.*. However, I'm not sure we want to float forward even on patch releases.
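The ~= equivalence can be checked the same way (again a sketch, assuming the packaging library):

```python
from packaging.specifiers import SpecifierSet

# ~=2023.3.2 is the "compatible release" operator, equivalent to
# the pair >=2023.3.2,==2023.3.*
compat = SpecifierSet("~=2023.3.2")

print(compat.contains("2023.3.2.1"))  # True: patch releases float forward
print(compat.contains("2023.3.1"))    # False: below the >= lower bound
print(compat.contains("2023.4.0"))    # False: outside the 2023.3.* prefix
```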

jakirkham commented 1 year ago

Thanks Charles! 🙏

What do we get if there is no .* at the end?

charlesbluca commented 1 year ago

What do we get if there is no .* at the end?

From this previous comment, looks like we would end up pulling dask/distributed 2023.3.0, which is in-line with @vyasr's comment

jrbourbeau commented 1 year ago

Trying this out I end up picking up distributed 2023.3.0

Ah, my mistake. Thanks for clarifying. I just checked and conda behaves the same as pip here.

What constraints does RAPIDS have for the dask and distributed version? It sounds like there's a hard pin to dask=2023.3.2 -- is that correct? Does patching dask to update its distributed pinning (e.g. adding .*) help?

charlesbluca commented 1 year ago

What constraints does RAPIDS have for the dask and distributed version?

Typically, Dask-dependent RAPIDS projects do a hard pin of both dask and distributed, though for this release we'll either want to pin distributed to the patched version or relax our pinning to 2023.3.2.* to allow the latest patch version to be picked up.

Does patching dask to update its distributed pinning (e.g. adding .*) help?

I don't think this should be necessary, as IIUC this pinning only controls what version of distributed gets installed when running pip install dask[distributed], and doesn't actually constrain what version of distributed can be installed.

quasiben commented 1 year ago

Thank you @jrbourbeau for engaging with us. I want to note that we are not so pressed that we need to get this done today/this weekend; we will work towards a solution early next week.

charlesbluca commented 1 year ago

To bump this thread, we’ve generally come to an agreement that a patch release on Distributed is the best approach here - ideally, we would like this to happen either tomorrow or Wednesday (April 4th or 5th, respectively) in time for RAPIDS burn down. In particular, the patch we have in mind is contained here; this is essentially @wence-'s diff from above committed on top of the 2023.3.2 tag. Once this is done, RAPIDS would do the same Dask/Distributed pinning process as usual, using distributed 2023.3.2.1 in the place of 2023.3.2.

I’m happy to handle the conda-forge / Docker (if any) aspects of this release, though I’m not too familiar with Dask’s process of releasing to PyPI (particularly as it concerns patch releases). @quasiben can probably do this work, but @jrbourbeau, if you have time to do this that would be extremely helpful.

One quirk to point out is that since Dask’s optional distributed dependency hasn’t been loosened to include follow-up patch versions, doing something like:

pip install dask[distributed]==2023.3.2

won’t pick up the patch release. However, given that none of the downstream RAPIDS projects specify their Dask/Distributed dependencies this way and this is a relatively niche fix, I’m willing to believe that this shouldn’t be much of an issue, and we can loosen these pinnings in follow-up PRs. If this is an issue for any maintainers, though, we would also need to update the Dask/Distributed pinnings on one another (and likely do a Dask patch release).

mrocklin commented 1 year ago

I'm curious: we're planning on releasing on Friday anyway, I would guess, right? Are you all okay waiting a couple of days? This seems strange to me.

quasiben commented 1 year ago

Unfortunately we can't wait until the next release -- we still need this patch as we haven't thoroughly tested the recent serialization PRs which will be in the next release

mrocklin commented 1 year ago

This still seems strange to me. If other downstream projects asked for something like this we'd almost certainly say no.

The RAPIDS team has a special status, certainly. If the RAPIDS team plans to use the Dask release process in this way then I'd like to make sure that the RAPIDS team also handles any fallout from this action. Presumably this means that you would be tracking the relevant issue trackers for a decent time in the future, identifying any issues that might occur from this, and fixing those issues as your top priority.

Even then, if I were release manager here I'd probably say "no". Custom patches like this are typically handled downstream. This seems peculiar. Lucky for you, though, I'm not the release manager, so this is really up to @jrbourbeau .

quasiben commented 1 year ago

You already communicated your concern around fallout to me privately and I agreed then and still agree now.

In the past you communicated that releases were cheap and that if there was a significant issue we could ask for a release. This is a significant issue for us. We understood there was social capital being spent in asking, and we tried (in vain) to find a solution outside of Dask, but sadly came up short.

If you believe this to be untenable we (RAPIDS) can probably take an alternate route to releasing a version in the RAPIDS channel

mrocklin commented 1 year ago

You already communicated your concern around fallout to me privately and I agreed then and still agree now.

👍 The reason I'm doubling down here is that this no longer seems like getting a release out a week early. I'm now realizing that you want something that's actually outside of the mainline history. This feels more like the patches that Anaconda applies to their distribution than like the development of a library. I haven't seen this pattern done before.

This is a significant issue for us

I acknowledge that you're in pain. Maybe it would be good to learn things like why you couldn't pin away from recent releases (this is what the RAPIDS group has done previously). Probably you've already thought of this though.

If you believe this to be untenable we (RAPIDS) can probably take an alternate route to releasing a version in the RAPIDS channel

This seems more normal to me than asking the upstream project to include the patch.

However, @jrbourbeau is in charge here, not me. If he's ok with this then that's all you need.

jrbourbeau commented 1 year ago

If other downstream projects asked for something like this we'd almost certainly say no. The RAPIDS team has a special status, certainly

I agree with both these statements. Given that, I'm okay with doing a patch release this time to help out. That said, I wouldn't want to see this become a frequent occurrence (we're far from this being frequent; I do want to be explicit though).

If the RAPIDS team plans to use the Dask release process in this way then I'd like to make sure that the RAPIDS team also handles any fallout from this action

You...communicated your concern around fallout to me..and I agreed

Also +1 for this. Thanks Matt for bringing it up, and Ben for making sure this tracking/fixing happens.

In particular, the patch we have in mind is contained here

Is this the only change? Are any changes needed in the dask conda feedstock?

EDIT: I would also ask that before doing any patch release, folks confirm the dask and distributed test suites pass with the suggested patch.

charlesbluca commented 1 year ago

Is this the only change? Are any changes needed in the dask conda feedstock?

Yeah, this is the only change that should be necessary - the dask conda packages are pinned such that the patch release should get picked up automatically with a conda install. As noted above, this does contrast with the behavior of pip install dask[distributed], but getting those install commands consistent would require altering Dask's optional distributed pinning. From my perspective, I assume this inconsistency will be low impact and would be better handled in an official release moving forward (happy to take ownership of this task). That being said, I'll leave the decision of how we want to handle those pinnings up to you and the other core maintainers.

folks confirm the dask and distributed test suites pass with the suggested patch.

Happy to do this - assuming we want to ensure that tests are passing both for 2023.3.2 and main? I'll try to get things running in a PR checked out against the release tags so that we can keep things on GitHub, failing that I'll just run the tests locally and report back here.

charlesbluca commented 1 year ago

Was able to open PRs targeting 2023.3.2 (along with the patch discussed above):

So far doesn't seem like any failures stand out to me stemming from the patch, just resolving some pandas 2.0 failures in the Distributed PR to get confirmation

charlesbluca commented 1 year ago

Is this the only change? Are any changes needed in the dask conda feedstock?

Yeah, this is the only change that should be necessary

Want to walk back on this statement - after doing some local testing around the conda recipe for Distributed, I realize that we'll need to modify how we pin dask-core so that it allows for a release older than the patch version we're currently building for, to avoid build errors of the form:

nothing provides requested dask-core 2023.3.2.1.*

Essentially, we want the pinning to look something like this:

{% set version = "2023.3.2.1" %}

package:
  name: distributed
  version: {{ version }}

source:
  url: https://pypi.io/packages/source/d/distributed/distributed-{{ version }}.tar.gz
  ...

requirements:
  host:
    - dask-core 2023.3.2
    ...
  run:
    - {{ pin_compatible('dask-core', max_pin='x.x.x') }}
    ...

I imagine this shouldn't be too difficult to do with Jinja, but apologize for giving an incorrect answer earlier.
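The version-truncation step could be sketched in Python like this (a hypothetical helper illustrating what the Jinja in the recipe would compute; not part of the actual feedstock):

```python
def dask_core_host_pin(distributed_version: str) -> str:
    """Drop a trailing fourth (patch) component so the host pin targets
    the three-part dask-core release the patch was cut from.

    Hypothetical helper for illustration, not feedstock code.
    """
    parts = distributed_version.split(".")
    return ".".join(parts[:3])

print(dask_core_host_pin("2023.3.2.1"))  # 2023.3.2
print(dask_core_host_pin("2023.3.2"))    # 2023.3.2
```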

jrbourbeau commented 1 year ago

FYI I just synced with @quasiben and @charlesbluca offline. They're confident in the proposed patch and are going to start pushing out distributed=2023.3.2.1

quasiben commented 1 year ago

Thanks @jrbourbeau !

Our plan is as follows:

jrbourbeau commented 1 year ago

Looks like 2023.3.2.1 is out. Anything left to do, or is this okay to close?

charlesbluca commented 1 year ago

Think this should be good to close, some follow up work to consider (either as it's needed or proactively):

jrbourbeau commented 1 year ago

Sounds good -- I'll go ahead and close then. Noted about the follow-up work

Thanks @quasiben @charlesbluca for handling this

quasiben commented 1 year ago

Thank you @jrbourbeau !