catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

PUDL Release v2022.11.30 #2077

Closed: zaneselvans closed this issue 1 year ago

zaneselvans commented 1 year ago

As soon as the nightly builds succeed on dev, we'll be ready to merge into main and tag a new PUDL release that includes all 2021 data for all of our covered datasets.

Release Checklist / Notes

Since we want to automate the release process, I'm trying to catalog everything I do here...

jdangerx commented 1 year ago

@zaneselvans - seems like v2022.11.30 is released, at least on Zenodo. Are we good to merge dev into main, close this issue, etc? Do you still need to do 2i2c stuff?

Also, is this the process you want help streamlining?

zaneselvans commented 1 year ago

@jdangerx yes, this is a big part of the semi-manual release process that needs streamlining. The other big piece, which @zschira has been looking at, is on the data acquisition end, in the pudl-archiver repository and #1418.

I'm torn on the JupyterHub. If we're not going to update it, then we should remove it from the documentation. I do think a resource like this is (or would be) useful, so I should just go ahead and update it; it should only take 10 minutes. I just ran out of steam.

jdangerx commented 1 year ago

Tada!

zaneselvans commented 1 year ago

However, I'm pretty sure we still need to upload the new data to the JupyterHub. If you attempt to run the example notebooks in the new Docker container, I believe they'll fail, since the data on the hub is from the prior release.

But this should hopefully be much faster & easier now that I can pull it down directly from the S3 bucket without needing to do any authentication.
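For reference, a minimal sketch of what that unauthenticated pull could look like from Python with boto3. The bucket and key names below are placeholders, not the actual build output locations:

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous S3 client: no AWS credentials needed for a public bucket.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

# Placeholder bucket/key; substitute the real nightly build output path.
s3.download_file(
    Bucket="example-pudl-builds",
    Key="v2022.11.30/pudl.sqlite",
    Filename="pudl.sqlite",
)
```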

jdangerx commented 1 year ago

@zaneselvans how do we do that upload?

zaneselvans commented 1 year ago

I usually log in to the JupyterHub, open a terminal within JupyterLab, and download the files from wherever they are on the internet. Historically this has been from Zenodo, which has been flaky and slow. But now that we've got the build outputs in a publicly accessible bucket with no authentication required, it should be much faster and easier. We still need to install the AWS CLI on the hub to do recursive downloads, and should probably add that to the Docker container rather than installing it manually.
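Until the AWS CLI is baked into the image, a rough Python-only equivalent of a recursive download could be sketched with boto3 alone (again, the bucket name and prefix are placeholders, not the real ones):

```python
from pathlib import Path

import boto3
from botocore import UNSIGNED
from botocore.config import Config


def download_prefix(bucket: str, prefix: str, dest: Path) -> None:
    """Download every object under a prefix from a public S3 bucket."""
    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            target = dest / obj["Key"]
            target.parent.mkdir(parents=True, exist_ok=True)
            s3.download_file(bucket, obj["Key"], str(target))


# Placeholder bucket and prefix; substitute the real release outputs.
download_prefix("example-pudl-builds", "v2022.11.30/", Path("pudl_data"))
```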

I've got it downloaded to the hub now and am mopping up the old versions and putting the files in the right places.

zaneselvans commented 1 year ago

Okay, it's all updated now.