I went back to create a DataFramePhase and was surprised by something we did just a few weeks ago: we have to subclass DataFramePhase to override 'df_transform'.
We should have a way to init a DataFramePhase with a method name passed in at instantiation that gets run in df_transform.
... I'm also bending somewhat on the idea that only one step is allowed. Somebody will often have a list of steps that they want to run on DataFrames (especially if they are migrating from a pandas-oriented pipeline, or trying to automate a Jupyter notebook's worth of work), so if we only allow one step, I'm pretty sure people will just put all of their existing logical steps into that one step.
On discussion, we agree we should go back in this direction, not only because DataFrame work might have multiple steps that can be passed into the constructor, but also to allow more declarative coding rather than subclassing.
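
As a rough sketch of what this could look like (not the actual implementation): the `DataFramePhase` class below is a hypothetical stand-in, and the `steps` parameter, the `name` argument, and the two step functions are assumptions made up for illustration. It also assumes steps are passed as callables rather than as method names.

```python
import pandas as pd


class DataFramePhase:
    """Hypothetical stand-in for the real DataFramePhase, for illustration only."""

    def __init__(self, name, steps=None):
        self.name = name
        # Assumed parameter: accept either a single callable or a list of callables.
        if steps is None:
            steps = []
        elif callable(steps):
            steps = [steps]
        self.steps = list(steps)

    def df_transform(self, df):
        # Default behavior: run each step in order, feeding each result to the next.
        # A subclass could still override this method, as is required today.
        for step in self.steps:
            df = step(df)
        return df


def drop_missing_prices(df):
    # Example step: discard rows that have no price.
    return df.dropna(subset=["price"])


def add_total(df):
    # Example step: derive a new column from existing ones.
    return df.assign(total=df["price"] * df["quantity"])


# Declarative use: no subclass needed, the steps are listed at instantiation.
phase = DataFramePhase("totals", steps=[drop_missing_prices, add_total])
orders = pd.DataFrame({"price": [2.0, None, 5.0], "quantity": [3, 1, 2]})
result = phase.df_transform(orders)
```

With something like this, each logical step stays a small, separately testable function, and a notebook's worth of pandas work can be listed in order instead of being folded into one big overridden method.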