catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License
482 stars 111 forks source link

Purge obsolete functions and/or modules #322

Closed zaneselvans closed 5 years ago

zaneselvans commented 5 years ago

We have accumulated a fair amount of cruft (esp. in pudl.analysis) that is no longer useable, useful, or relevant to the project, and isn't going to be revived. We should delete this stuff before releasing. Anything that we do decide to keep around we should probably get into a test of some kind so the bitrot doesn't destroy it.

Nothing in pudl.analysis.analysis is currently being called at any point in the tests, though some of it was certainly getting used in notebooks.

zaneselvans commented 5 years ago

@cmgosnell and I reviewed pudl.helpers and pudl.analysis.analysis and deleted a few things from helpers. We decided not to do a full review of pudl.analysis.analysis because none of these functions are being used in the codebase -- only in notebooks. So we can safely remove the module from the packaging for v0.1.0 entirely, and deal with this for the next release.

zaneselvans commented 5 years ago

analysis functions to definitely keep and test

analysis functions to maybe keep / look at?

analysis functions to delete?

cmgosnell commented 5 years ago

Sorry this took me a while to get to...

Most of this looks right to me. I am a little hesitant to get rid of the ferc expense functions but logically you are probably correct.

iirc pretty much all of the maybe keep/look at functions were generated in service of a more granular MCOE calc and could be useful in the future. Some of these aggregations may even be useful in helping with the RMI plant mappings. I expect many of them need to be reworked or updated but we should keep them in mind before jumping in and building all new infrastructure. It might be worth revisiting them at the end of that project if it pans out.

zaneselvans commented 5 years ago

Yeah, I know, it would be nice if the FERC expense / correlation stuff was useful but... it just doesn't seem like it is or will be. I think if/when we dive into that regression/data extension stuff we probably need to start from scratch. :-/

If you're okay with it, I will chuck everything in the "delete" bin, re-locate the 2 ferc1 infrastructure / id_mapping functions, and retain the "keep / look at" functions for now so that we can maybe use them as idea-templates for the RMI work, but mark them as #nocov / #noqa / untested.

cmgosnell commented 5 years ago

This sounds like a good plan to me.

zaneselvans commented 5 years ago

Done!