PyPSA / pypsa-usa

PyPSA-USA: An Open-Source Energy System Optimization Model for the United States
https://pypsa-usa.readthedocs.io
MIT License

Add `retrieve_pudl` snakemake module #311

Closed jpvelez closed 1 month ago

jpvelez commented 2 months ago

Feature Request

We need to add a new snakemake task that downloads the PUDL database.

Suggested Solution

ktehranchi commented 2 months ago

Identify which build of the PUDL database to use, and grab its URL.

I think the stable-builds from either Zenodo or the AWS buckets would work well for our purposes.

https://catalystcoop-pudl.readthedocs.io/en/latest/data_access.html#stable-builds

stephendeyoung commented 1 month ago

@jpvelez I've started this here: https://github.com/stephendeyoung/pypsa-usa/commit/5978f5d9b7db3009fd3892b4791b6d284f4680c0

I didn't have time to check the contribution guidelines so didn't create a PR yet.

I wasn't clear on how to add the retrieve_pudl rule to the Snakefile. The other retrieve_*** scripts are not included in the Snakefile directly, but I can see that retrieve.smk is. Can you clarify?

ktehranchi commented 1 month ago

@stephendeyoung This looks great, thanks! Our contribution guide is out of date (#319) ... but you can submit the PR to the develop branch.

RE: how to add retrieve_pudl to the snakemake workflow: you have done it correctly. The new rule becomes part of the workflow because retrieve.smk is included in the Snakefile.

stephendeyoung commented 1 month ago

Thanks @ktehranchi. I've created the PR now. The PUDL db is still gzipped after the download. LMK if it needs to be uncompressed.

ktehranchi commented 1 month ago

Yep, it should be uncompressed. Thank you.

stephendeyoung commented 1 month ago

OK, that's done. It was a little more complex than anticipated because I had to decompress the file chunk by chunk. (requests does this automatically when the response carries the correct headers, such as Content-Encoding: gzip, but that wasn't happening in this case.)
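
For readers following along, chunk-by-chunk decompression of a gzipped download can be sketched roughly as below. This is a minimal illustration and not the code from the PR: the function names `decompress_gzip_chunks` and `download_and_decompress`, the chunk size, and the timeout are all assumptions made for the example, and it assumes the server sends a single-member gzip file without a Content-Encoding header.

```python
import zlib
from pathlib import Path
from typing import Iterable


def decompress_gzip_chunks(chunks: Iterable[bytes], dest: Path) -> int:
    """Decompress gzip-compressed chunks into dest; return the bytes written.

    Hypothetical helper for illustration only.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    written = 0
    with open(dest, "wb") as out:
        for chunk in chunks:
            if not chunk:
                continue  # skip empty chunks
            data = decompressor.decompress(chunk)
            out.write(data)
            written += len(data)
        # Write out anything still buffered inside the decompressor.
        tail = decompressor.flush()
        out.write(tail)
        written += len(tail)
    return written


def download_and_decompress(url: str, dest: Path, chunk_size: int = 1 << 20) -> int:
    """Stream url to dest, decompressing on the fly (hypothetical wrapper)."""
    import requests  # third-party; imported here so the helper above stays stdlib-only

    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        # Without a Content-Encoding header, iter_content yields the gzip
        # bytes unchanged, so they have to be decompressed manually.
        return decompress_gzip_chunks(
            response.iter_content(chunk_size=chunk_size), dest
        )
```

Streaming this way keeps memory use bounded by the chunk size, so neither the compressed nor the decompressed database has to be held in memory.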

ktehranchi commented 1 month ago

Awesome, looks good! I will merge the PR, thank you!

We should be ready for #312 now.