@jrbourbeau some regressions were spotted in the coiled-benchmarks; it's not clear to me whether they were expected or whether they've since been resolved, see:
@j-bennet you were looking at these cases right, do you have any more context here that you can add?
@ncclementi @jrbourbeau
https://github.com/coiled/benchmarks/issues/839 didn't look legit. There seemed to have been a hiccup writing to benchmarks.db: several records were duplicated on insert. I closed the issue.
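For context, a minimal sketch of how one might check a SQLite database like benchmarks.db for duplicated rows. The table and column names below (test_run, name, start, duration) are assumptions for illustration, not the real schema:

```python
import sqlite3

# Hypothetical sketch: "test_run" and its columns are assumed names,
# not the actual benchmarks.db schema.
conn = sqlite3.connect("benchmarks.db")
duplicates = conn.execute(
    """
    SELECT name, start, duration, COUNT(*) AS n
    FROM test_run
    GROUP BY name, start, duration
    HAVING COUNT(*) > 1
    """
).fetchall()
for name, start, duration, n in duplicates:
    print(f"{name!r} at {start}: inserted {n} times (duration={duration}s)")
conn.close()
```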
However, CI opened a new one today:
https://github.com/coiled/benchmarks/issues/840
and that one may be legitimate, still investigating.
Ok, so in the new CI issue, the runtime = 'coiled-upstream-py3.9' regressions look legitimate:
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (csv)-p2p]', category = 'benchmarks', last_three_duration [s] = (21.08663511276245, 23.024844884872437, 22.656970739364624), duration_threshold [s] = 20.351763563082486
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (csv)-tasks]', category = 'benchmarks', last_three_duration [s] = (20.7704176902771, 23.282063007354736, 21.78318214416504), duration_threshold [s] = 19.681029691350872
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (parquet)-p2p]', category = 'benchmarks', last_three_duration [s] = (36.947147607803345, 38.348350524902344, 37.531134366989136), duration_threshold [s] = 27.49540470443585
runtime = 'coiled-upstream-py3.9', name = 'test_q8[0.5 GB (parquet)-tasks]', category = 'benchmarks', last_three_duration [s] = (36.0034384727478, 37.55315279960632, 37.223448038101196), duration_threshold [s] = 26.97645565716312
runtime = 'coiled-upstream-py3.9', name = 'test_q8[5 GB (parquet)-p2p]', category = 'benchmarks', last_three_duration [s] = (184.07536125183105, 178.22539234161377, 177.27111172676086), duration_threshold [s] = 144.0348346523647
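To make the report format concrete, here is a minimal sketch of the kind of check these lines imply, assuming a test is flagged when each of its last three durations exceeds the threshold (the actual detection logic in coiled/benchmarks may differ):

```python
def is_regression(last_three_durations, duration_threshold):
    """Flag a benchmark when every one of its last three runs exceeded the threshold."""
    return all(d > duration_threshold for d in last_three_durations)

# First test_q8 row above: all three durations exceed ~20.35s, so it is flagged.
print(is_regression((21.09, 23.02, 22.66), 20.35))  # True
```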
The charts don't look very alarming to me. Zoomed in:
These spikes are similar to fluctuations we had in the past, and those resolved.
The runtime = 'coiled-latest-py3.9' entries are still the same duplicate-record issue, not legitimate, at least not yet:
runtime = 'coiled-latest-py3.9', name = 'test_q8[0.5 GB (csv)-p2p]', category = 'benchmarks', last_three_duration [s] = (23.0051052570343, 23.257978677749634, 23.257978677749634), duration_threshold [s] = 22.74642871273649
runtime = 'coiled-latest-py3.9', name = 'test_q8[0.5 GB (csv)-tasks]', category = 'benchmarks', last_three_duration [s] = (22.663613319396973, 23.347721576690674, 23.347721576690674), duration_threshold [s] = 22.475621609149425
runtime = 'coiled-latest-py3.9', name = 'test_q8[5 GB (parquet)-p2p]', category = 'benchmarks', last_three_duration [s] = (203.9737629890442, 190.76829409599304, 190.76829409599304), duration_threshold [s] = 187.39081849451378
@fjetter @hendrikmakait should these block the release? Can you advise?
I'm investigating.
Thanks @hendrikmakait
The regression we see in the benchmarks is caused by a switch from pandas=1.5.3 to pandas=2.0.1 in the benchmarking environment, not a change since dask=2023.4.1. I've run an A/B test (https://github.com/coiled/benchmarks/actions/runs/4946428740) on 2023.4.1 confirming that this issue is already present in the previous release.
I suggest moving forward with the release as planned.
Thanks for confirming @hendrikmakait. I'm happy to move forward with the release in this case.
The default value of group_keys changed from False to True, which caused the regression.
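For reference, a small illustration of the pandas behaviour change (a sketch added here, not from the original thread): with group_keys=True, groupby(...).apply(...) prepends the group labels as an extra index level, whereas group_keys=False reproduces the old default:

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})

# pandas 2.x default: group labels are prepended as an extra index level
with_keys = df.groupby("key", group_keys=True).apply(lambda g: g["val"] * 2)

# old default: the result keeps only the original row index
without_keys = df.groupby("key", group_keys=False).apply(lambda g: g["val"] * 2)

print(with_keys.index)     # MultiIndex [('a', 0), ('a', 1), ('b', 2)]
print(without_keys.index)  # original index [0, 1, 2]
```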
Starting the release now
dask and distributed 2023.5.0 are now on PyPI. @charlesbluca is going to handle the bot-triggered actions on conda-forge and dask-docker.
Release on conda-forge is complete:
Currently handling the docker release:
Closing as complete
Thanks Jacob, Charles, and others for handling the release this week.
@quasiben has noticed that https://docs.dask.org/en/stable/changelog.html has not been updated with the latest release.
I'll reopen this while we look into it.
It looks like the docs build failed as it couldn't find the 2023.5.0 release on PyPI. This was likely a race between me pushing the tag to GitHub and pushing the release to PyPI.
I don't see any obvious button in RTD to re-run the build. @jrbourbeau have you run into this before?
You should be able to run an RTD build any time by going to the project page and then to "builds".
Ah yeah, thanks @martindurant. I was looking for a "rerun" button on the failed builds. I've triggered a new build for stable and latest.
Ah found another problem. I missed a user link in the changelog. @jrbourbeau did warn me about this. I'll get it resolved now.
This is now resolved. Apologies for the noise.
Thanks @jacobtomlinson @charlesbluca for handling this release!
Best effort: try to close before the release, but will not block the release.
Blocker: issues that would cause us to block and postpone the release if not fixed.
Comments:
Note that @jacobtomlinson and @charlesbluca will be handling the release this week as I'll be OOO (thanks again for taking care of this)
cc @quasiben @rjzamora @fjetter