dask / community

For general discussion and community planning. Discussion issues welcome.

Release 2023.5.0 #322

Closed: jrbourbeau closed this issue 1 year ago

jrbourbeau commented 1 year ago
Release version 2023.5.0
Planned release date 2023-05-12
Status On-track
Release manager @jacobtomlinson

Best effort

Try to close before the release but will not block the release

Blocker

Issues that would cause us to block and postpone the release if not fixed

Comments

Note that @jacobtomlinson and @charlesbluca will be handling the release this week as I'll be OOO (thanks again for taking care of this)

cc @quasiben @rjzamora @fjetter

ncclementi commented 1 year ago

@jrbourbeau in coiled-benchmarks some regressions were spotted; it's not clear to me whether they were expected or whether they've since been resolved, see:

@j-bennet you were looking at these cases right, do you have any more context here that you can add?

j-bennet commented 1 year ago

@ncclementi @jrbourbeau

https://github.com/coiled/benchmarks/issues/839 didn't look legitimate. There seems to have been a hiccup writing to benchmarks.db: several records were duplicated on insert. I closed the issue.
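For what it's worth, duplicate inserts like that are easy to surface with a `GROUP BY` query. A minimal sketch against a toy SQLite table (the table and column names here are assumptions for illustration, not the real benchmarks.db schema):

```python
import sqlite3

# Toy reproduction of a duplicate-insert hiccup.
# "runs", "name", and "duration" are hypothetical names.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE runs (name TEXT, duration REAL)")
con.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [("test_q8", 23.25), ("test_q8", 23.25), ("test_q1", 11.0)],  # one duplicate
)

# Any (name, duration) pair appearing more than once is suspect.
dupes = con.execute(
    "SELECT name, duration, COUNT(*) FROM runs "
    "GROUP BY name, duration HAVING COUNT(*) > 1"
).fetchall()
print(dupes)  # [('test_q8', 23.25, 2)]
```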

However, CI opened a new one today:

https://github.com/coiled/benchmarks/issues/840

and that one may be legitimate, still investigating.

j-bennet commented 1 year ago

Ok, so in the new CI issue, the runtime = 'coiled-upstream-py3.9' regressions look legitimate:

```
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (csv)-p2p]', category = 'benchmarks', last_three_duration [s] = (21.08663511276245, 23.024844884872437, 22.656970739364624), duration_threshold [s] = 20.351763563082486
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (csv)-tasks]', category = 'benchmarks', last_three_duration [s] = (20.7704176902771, 23.282063007354736, 21.78318214416504), duration_threshold [s] = 19.681029691350872
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (parquet)-p2p]', category = 'benchmarks', last_three_duration [s] = (36.947147607803345, 38.348350524902344, 37.531134366989136), duration_threshold [s] = 27.49540470443585
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (parquet)-tasks]', category = 'benchmarks', last_three_duration [s] = (36.0034384727478, 37.55315279960632, 37.223448038101196), duration_threshold [s] = 26.97645565716312
runtime = 'coiled-upstream-py3.9', name = 'test_q8[5 GB (parquet)-p2p]', category = 'benchmarks', last_three_duration [s] = (184.07536125183105, 178.22539234161377, 177.27111172676086), duration_threshold [s] = 144.0348346523647
```
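Reading these reports, a test appears to be flagged when its recent durations exceed the stored threshold. A minimal sketch of such a check (hypothetical rule for illustration; the actual detection logic in coiled/benchmarks may use a different statistic):

```python
def looks_like_regression(last_three_durations, duration_threshold):
    """Flag a benchmark when all of its recent runs exceed the threshold.

    Hypothetical criterion, named for illustration only; the real
    coiled/benchmarks check may differ.
    """
    return all(d > duration_threshold for d in last_three_durations)

# Values from the first flagged test above, test_q8[0.5 GB (csv)-p2p]:
print(looks_like_regression((21.087, 23.025, 22.657), 20.352))  # True
```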

The charts don't look very alarming to me. Zoomed in:

(two zoomed-in benchmark chart screenshots attached)

These spikes are similar to fluctuations we've had in the past, and those resolved on their own.

The runtime = 'coiled-latest-py3.9' reports are still the same duplicate-record issue, so not legitimate, at least not yet:

```
runtime = 'coiled-latest-py3.9', name = 'test_q8[0.5 GB (csv)-p2p]', category = 'benchmarks', last_three_duration [s] = (23.0051052570343, 23.257978677749634, 23.257978677749634), duration_threshold [s] = 22.74642871273649
runtime = 'coiled-latest-py3.9', name = 'test_q8[0.5 GB (csv)-tasks]', category = 'benchmarks', last_three_duration [s] = (22.663613319396973, 23.347721576690674, 23.347721576690674), duration_threshold [s] = 22.475621609149425
runtime = 'coiled-latest-py3.9', name = 'test_q8[5 GB (parquet)-p2p]', category = 'benchmarks', last_three_duration [s] = (203.9737629890442, 190.76829409599304, 190.76829409599304), duration_threshold [s] = 187.39081849451378
```
j-bennet commented 1 year ago

@fjetter @hendrikmakait should these block the release, can you advise?

hendrikmakait commented 1 year ago

I'm investigating.

jacobtomlinson commented 1 year ago

Thanks @hendrikmakait

hendrikmakait commented 1 year ago

The regression we see in the benchmarks is caused by a switch from pandas=1.5.3 to pandas=2.0.1 in the benchmarking environment, not a change since dask=2023.4.1. I've run an A/B test (https://github.com/coiled/benchmarks/actions/runs/4946428740) on 2023.4.1 confirming that this issue is already present in the previous release.

(screenshot of the A/B test results, 2023-05-11, attached)

I suggest moving forward with the release as planned.

jacobtomlinson commented 1 year ago

Thanks for confirming @hendrikmakait. I'm happy to move forward with the release in this case.

phofl commented 1 year ago

The default value of group_keys in pandas' groupby changed from False to True in pandas 2.0, which caused the regression.
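Concretely, the change means that groupby(...).apply now prepends the group labels to the result index by default. A small illustration with toy data (not taken from the benchmarks):

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})

# Old default (group_keys=False): the result keeps the original index.
no_keys = df.groupby("key", group_keys=False).apply(lambda g: g["val"] * 2)

# pandas >= 2.0 default (group_keys=True): the group labels become an
# extra index level, so downstream index-sensitive code behaves differently.
with_keys = df.groupby("key", group_keys=True).apply(lambda g: g["val"] * 2)

print(no_keys.index.nlevels)    # 1
print(with_keys.index.nlevels)  # 2
```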

jacobtomlinson commented 1 year ago

Starting the release now

jacobtomlinson commented 1 year ago

dask and distributed 2023.5.0 are now on PyPI. @charlesbluca is going to handle the bot-triggered actions on conda-forge and dask-docker.

charlesbluca commented 1 year ago

Release on conda-forge is complete:

Currently handling the docker release:

jacobtomlinson commented 1 year ago

Closing as complete

mrocklin commented 1 year ago

Thanks Jacob, Charles, and others for handling the release this week.


jacobtomlinson commented 1 year ago

@quasiben has noticed that https://docs.dask.org/en/stable/changelog.html has not been updated with the latest release.

I'll reopen this while we look into it.

jacobtomlinson commented 1 year ago

It looks like the docs build failed as it couldn't find the 2023.5.0 release on PyPI. This was likely a race between me pushing the tag to GitHub and pushing the release to PyPI.

I don't see any obvious button in RTD to re-run the build. @jrbourbeau have you run into this before?

martindurant commented 1 year ago

You should be able to run an RTD build any time by going to the project page and opening "Builds".

jacobtomlinson commented 1 year ago

Ah yeah, thanks @martindurant. I was looking for a "rerun" button on the failed builds. I've triggered a new build for stable and latest.

jacobtomlinson commented 1 year ago

Ah found another problem. I missed a user link in the changelog. @jrbourbeau did warn me about this. I'll get it resolved now.

jacobtomlinson commented 1 year ago

This is now resolved. Apologies for the noise.

jrbourbeau commented 1 year ago

Thanks @jacobtomlinson @charlesbluca for handling this release!