The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
[ ] Use relative imports to bring general purpose classes into pudl.transform.* namespace:
  [ ] AbstractTableTransformer
  [ ] TransformParams
  [ ] MultiColumnTransformParams
  [ ] TableTransformParams (ambiguity here... one transform or all transforms?)
  [ ] RenameColumns
  [ ] StringNormalization
  [ ] StringCategories
  [ ] UnitConversion
  [ ] ValidRange
  [ ] UnitCorrections
  [ ] InvalidRows
[ ] from pudl.transform.params.ferc1 import TRANSFORM_PARAMS as TRANSFORM_PARAMS_FERC1
[ ] Import or relocate general purpose parameter definitions into pudl.transform.params.* namespace
  [ ] unit conversions
  [ ] valid ranges
Note: Do NOT import the individual transform functions, since they're meant to be used as methods.
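To make the checklist concrete, here is a minimal sketch of re-exporting classes through a package's __init__.py with a relative import. It builds a throwaway package on disk so the relative import actually runs. The package name demo_transform, the module name classes.py, and the empty class bodies are assumptions for illustration, not the real PUDL layout.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in package; only the class names come from the checklist.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_transform"
pkg.mkdir()

# The module where the classes are actually defined (assumed name).
(pkg / "classes.py").write_text(textwrap.dedent('''
    class AbstractTableTransformer:
        """Placeholder for the real class."""

    class TransformParams:
        """Placeholder for the real class."""
'''))

# __init__.py re-exports the classes with a relative import.
(pkg / "__init__.py").write_text(textwrap.dedent('''
    from .classes import AbstractTableTransformer, TransformParams

    __all__ = ["AbstractTableTransformer", "TransformParams"]
'''))

sys.path.insert(0, str(root))
demo_transform = importlib.import_module("demo_transform")

# Callers reach the class through the package namespace,
# while the definition lives in the submodule.
defined_in = demo_transform.AbstractTableTransformer.__module__
print(defined_in)  # demo_transform.classes
```

If classes.py were later split into several modules, only the import line in __init__.py would need to change; code that uses demo_transform.AbstractTableTransformer would keep working.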
Questions:
Should the transform functions and their parameter classes be split out into a separate module from the AbstractTableTransformer, since both of those are likely to grow? Seems like no... If we have people access them via the pudl.transform.* namespace, then we can move them around later as needed without breaking a bunch of code -- we'll just need to change where they get imported from in __init__.py
Should the individual transform functions and multicolumn transform functions which are wrapped in methods be made _private(), to indicate that they are (primarily) meant to be accessed as methods in the TableTransformer classes?
I created hierarchical imports in all the __init__.py files, but am putting off this bigger interface organization question until after a conversation with @cmgosnell