catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License
456 stars, 105 forks

Generalize `pudl.extract.raw_df_factory` to extract any partition (not just `year`) #3366

Closed (cmgosnell closed this issue 2 months ago)

cmgosnell commented 4 months ago

Right now `pudl.extract.raw_df_factory` expects the extractor's partition to be `year`, which is hard-coded in a few places.

See comment which spurred this issue.

- [ ] generalize `year_extractor_factory` -> `partition_extractor_factory`
- [ ] generalize `extract_single_year` -> `extract_single_partition`
- [ ] generalize `years_from_settings_factory` -> `partitions_from_settings_factory`
- [ ] integrate partition changes into `raw_df_factory` & `raw_df_factory.raw_dfs`
- [ ] test the new setup with the existing year-based uses of the Excel extractor (eia860, eia923, phmsagas... I think that's all?)
- [ ] convert `pudl.extract.eia860m.raw_eia860m__all_dfs` to use `raw_df_factory`
- [ ] ...(i'm sure i missed something)?
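
To make the checklist concrete, here is a rough, self-contained sketch of the general idea: treat a partition as an arbitrary set of key/value pairs rather than a hard-coded `year`. Everything in it (`ToyExtractor`, `partitions_from_settings`, `extract_single_partition`, `merge_pages`, `extract_all`) is a hypothetical stand-in written for illustration. It does not use Dagster and is not the actual PUDL code or the change that was eventually merged.

```python
from collections import defaultdict
from itertools import product
from typing import Any


class ToyExtractor:
    """Hypothetical stand-in for an extractor (not PUDL's real class)."""

    def extract(self, **partition: Any) -> dict[str, list[dict[str, Any]]]:
        # Return one fake "page" of rows, each tagged with its partition.
        return {"page_a": [{"value": 1, **partition}]}


def partitions_from_settings(settings: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Expand e.g. {"year": [2020, 2021]} into [{"year": 2020}, {"year": 2021}].

    The partition key is whatever the settings provide, so "year_month"
    or any other key works the same way as "year".
    """
    keys = list(settings)
    combos = product(*(settings[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


def extract_single_partition(
    extractor: ToyExtractor, partition: dict[str, Any]
) -> dict[str, list[dict[str, Any]]]:
    """Extract every page for one partition, whatever its key is."""
    return extractor.extract(**partition)


def merge_pages(
    per_partition: list[dict[str, list[dict[str, Any]]]],
) -> dict[str, list[dict[str, Any]]]:
    """Concatenate rows page by page across all partitions."""
    merged: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for pages in per_partition:
        for page, rows in pages.items():
            merged[page].extend(rows)
    return dict(merged)


def extract_all(
    extractor: ToyExtractor, settings: dict[str, list[Any]]
) -> dict[str, list[dict[str, Any]]]:
    """Run the extractor over every partition implied by the settings."""
    partitions = partitions_from_settings(settings)
    return merge_pages([extract_single_partition(extractor, p) for p in partitions])


yearly = extract_all(ToyExtractor(), {"year": [2020, 2021]})
monthly = extract_all(ToyExtractor(), {"year_month": ["2024-01", "2024-02"]})
```

In this sketch, the same `extract_all` call handles both the year-based datasets and a `year_month`-style partition like the one eia860m would presumably need, because nothing downstream of the settings refers to `year` by name.
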

NateWasTaken commented 3 months ago

Hello, can I take this assignment?

NateWasTaken commented 3 months ago

Sorry, forgot to tag. @catalyst-cooperative/com-dev, can I take this assignment?

e-belfer commented 2 months ago

Hi @NateWasTaken! Thanks for the message. This issue has actually been addressed in #3402 and should be closed, my mistake! I'll take a look through our existing issues and see what other candidates for good first issues I can find.

NateWasTaken commented 2 months ago

> Hi @NateWasTaken! Thanks for the message. This issue has actually been addressed in #3402 and should be closed, my mistake! I'll take a look through our existing issues and see what other candidates for good first issues I can find.

That would be great, thank you e-belfer. My partner and I are working on a class project and hope to contribute in any way possible!