catalyst-cooperative / pudl-archiver

A tool for capturing snapshots of public data sources and archiving them on Zenodo for programmatic use.
MIT License

Publish July 1st 2024 archives #369

Closed github-actions[bot] closed 3 months ago

github-actions[bot] commented 3 months ago

Summary of results:

See the job run logs and results here.

Review and publish archives

For each of the following archives, find the run status in the GitHub archiver run. If no changes are detected, delete the draft. If changes are detected and the validation tests pass, manually review the archive following the guidelines in step 3 of README.md, then publish the new version. Then check the box here to confirm publication status, adding a note on the status (e.g., "v1 published", "no changes detected, draft deleted"):

- [x] eia176 - No changes, draft deleted.
- [x] eia191 - V8.0.0 published
- [x] eia757a - No changes, draft deleted
- [x] eia860 - No changes, draft deleted
- [x] eia861 - No changes, draft deleted
- [x] eia923 - Note no 2023 ER data out yet, v 18.0.0 published
- [x] eia930 - v 6.0.0 published
- [x] eiawater - No changes, draft deleted
- [x] eia_bulk_elec - v 10.0.0 published
- [x] epacamd_eia - No changes, draft deleted
- [x] mshamines - v 7.0.0 published
- [x] phmsagas - Note new file date for 2004-9 data, also observed in raw downloaded data. v 7.0.0 published
- [x] epacems - v 11.0.0 published

Validation failures

For each run that failed because of validation test failures (seen in the GHA logs), add it to the tasklist. Download the run summary JSON by going into the "Upload run summaries" tab of the GHA run for each dataset, and follow the link. Investigate the validation failure.

If the validation failure is deemed ok after manual review (e.g., Q2 of 2024 data doubles the size of a file that only had Q1 data previously, but the new data looks as expected), go ahead and approve the archive and leave a note explaining your decision in the task list.

If the validation failure is blocking (e.g., file format incorrect, whole dataset changes size by 200%), make an issue to resolve it.

- [x] eia860m - "Individual file size test. The following files have absolute changes in file size >|25%|: {'eia860m-2024.zip': 0.25715735405929413}". Verified that expected data is included, published v22.0.0
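To make the individual file size test quoted above easier to reason about, here is a minimal sketch of such a check. It is not the archiver's actual implementation: the function names, the dictionary-based inputs, and the byte counts are all hypothetical, and only the 25% threshold comes from the failure message.

```python
def relative_size_changes(previous, current):
    """Return the fractional size change for each file present in both snapshots.

    `previous` and `current` map file names to sizes in bytes (assumed inputs).
    """
    changes = {}
    for name, old_size in previous.items():
        new_size = current.get(name)
        # Skip files missing from the new snapshot and avoid dividing by zero.
        if new_size is None or old_size == 0:
            continue
        changes[name] = (new_size - old_size) / old_size
    return changes


def files_exceeding_threshold(previous, current, threshold=0.25):
    """Return files whose absolute fractional size change exceeds `threshold`."""
    return {
        name: change
        for name, change in relative_size_changes(previous, current).items()
        if abs(change) > threshold
    }


# Hypothetical byte counts, chosen only to illustrate a change just over 25%.
previous = {"eia860m-2024.zip": 1_000_000, "eia860m-2023.zip": 4_000_000}
current = {"eia860m-2024.zip": 1_257_157, "eia860m-2023.zip": 4_000_000}

flagged = files_exceeding_threshold(previous, current)
print(flagged)
```

A file that grows because a new month or quarter of data was appended can trip a check like this without anything being wrong, which is why a flagged file still calls for manual review rather than automatic rejection.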

Other failures

For each run that failed because of another reason (e.g., underlying data changes, code failures), create an issue describing the failure and take necessary steps to resolve it.

- [ ] eiaaeo - Concept DOI marked deleted in Zenodo API. When running successfully, no changes in raw data. Draft archive deleted. See #370
- [x] ferc1 - https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9748950183/job/26905078338 rerun after #362, v 14.0.0 published
- [x] ferc2 - https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9748950183/job/26905078670 rerun after #362, v 9.0.0 published
- [x] ferc6 - https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9748950183/job/26905079006 rerun after #362, v 6.0.0 published
- [x] ferc60 - https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9748950183/job/26905079371 rerun after #362, v 7.0.0 published
- [x] ferc714 - https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9748950183/job/26905079714 rerun after #362, v 10.0.0 published
- [x] nrelatb - Concept DOI marked deleted in Zenodo API. Temporarily pointed to last DOI, fixed issue and published v2.0.0

zaneselvans commented 3 months ago

Wow nice turnaround! Glad it's all more or less working.