Addresses parts of #672
Preparing to add a airflow as a preprocessor target, I noticed that the current architecture is very rigid and difficult to expand upon. This refactoring/reorganization hopefully addresses some of these. Notable changes (and absence of changes) are
No AST traversal or transformation logic has been rewritten
Files outside of shell_ast have not been changed*
preprocess/preprocess.py has been changed, but simply to import new files (some new modules were made, as I discuss below).
Differing behavior depending on target (vanilla, spec, airflow) is no longer handled by an if statement in ast_to_ast.
Previously, replace_df_regions was a function that ran a if statement on a property of TransformationState to check which specific TransformationState implementation we were dealing with (this would tell us the output target). Now, replace_df_regions is an instance method of AbstractTransformationState. This means that any variation in AST transformation between different output targets can be addressed in a single class that implements of AbstractTransformationState.
The location of the AST Transformation logic has been moved from replace_df_regions() to a method on TransformationState.
For example, this pull request created three implementations of AbstractTransformationState: TransformationState for the vanilla pash behavior, SpeculativeTransformationState for pash spec, and AirflowTransformationState for airflow (this one is empty). The ast_to_ast package contains no checks in its logic for checking how to handle AST transformations - they only need to call replace_df_regions of whatever trans_options argument they have at hand.
The TransformationState code has been moved to a separate file transformation_options.py. I do suggest renaming "State" with some more appropriate name, but this is a minor nitpick
preprocessnode functions have been
moved to its own module
no longer reliant on a large dictionary to dispatch the appropriate function
Addresses parts of #672 Preparing to add a airflow as a preprocessor target, I noticed that the current architecture is very rigid and difficult to expand upon. This refactoring/reorganization hopefully addresses some of these. Notable changes (and absence of changes) are
No AST traversal or transformation logic has been rewritten
Files outside of
shell_ast
have not been changed*preprocess/preprocess.py
has been changed, but simply to import new files (some new modules were made, as I discuss below).Differing behavior depending on target (vanilla, spec, airflow) is no longer handled by an if statement in
ast_to_ast
.Previously,
replace_df_regions
was a function that ran a if statement on a property ofTransformationState
to check which specificTransformationState
implementation we were dealing with (this would tell us the output target). Now,replace_df_regions
is an instance method ofAbstractTransformationState
. This means that any variation in AST transformation between different output targets can be addressed in a single class that implements ofAbstractTransformationState
.The location of the AST Transformation logic has been moved from
replace_df_regions()
to a method onTransformationState
.For example, this pull request created three implementations of
AbstractTransformationState
:TransformationState
for the vanilla pash behavior,SpeculativeTransformationState
for pash spec, andAirflowTransformationState
for airflow (this one is empty). The ast_to_ast package contains no checks in its logic for checking how to handle AST transformations - they only need to callreplace_df_regions
of whatever trans_options argument they have at hand.The TransformationState code has been moved to a separate file
transformation_options.py
. I do suggest renaming "State" with some more appropriate name, but this is a minor nitpickpreprocessnode functions have been