DAGWorks-Inc / hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
https://hamilton.dagworks.io/en/latest/
BSD 3-Clause Clear License

Bugfix pipe_output after config.when resolution #1221

Closed: jernejfrank closed 1 week ago

jernejfrank commented 2 weeks ago

Addressing #1218

For `pipe_output`:

@pipe_output(
    step(_foo).when(key="foo"),
    step(_bar).when(key="bar"),
)
def filtered_data(raw_data: pd.DataFrame) -> pd.DataFrame:
    return ...

in case no conditions are met, e.g. `config={"key": "skip"}`, it returns `filtered_data` as if `pipe_output` isn't there.

I think this is better than raising an error since it leaves the ability to choose transforms at runtime (which is related to using config.when on any other function in the DAG).
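
To make the intended fall-through concrete, here is a minimal pure-Python sketch of the semantics described above. It is a toy model, not Hamilton's implementation: `toy_pipe_output` and `toy_step` are hypothetical stand-ins for `pipe_output` and `step(...).when(...)`, the config is passed at call time for simplicity (Hamilton resolves it when the DAG is built), and plain integers stand in for the DataFrame.

```python
from typing import Any, Callable, Dict, Tuple

# A toy "step": a transform plus the config key/values that must match.
ToyStep = Tuple[Callable[[Any], Any], Dict[str, Any]]


def toy_step(fn: Callable[[Any], Any], **when: Any) -> ToyStep:
    """Hypothetical stand-in for step(fn).when(**when)."""
    return (fn, when)


def toy_pipe_output(*steps: ToyStep):
    """Hypothetical stand-in for pipe_output: apply only the matching steps."""

    def decorator(fn: Callable[..., Any]):
        def wrapper(config: Dict[str, Any], *args: Any, **kwargs: Any) -> Any:
            result = fn(*args, **kwargs)
            for transform, when in steps:
                # A step is kept only if every condition matches the config.
                if all(config.get(k) == v for k, v in when.items()):
                    result = transform(result)
            # If no step matched, the undecorated output is returned as-is.
            return result

        return wrapper

    return decorator


def _foo(x: int) -> int:
    return x + 1


def _bar(x: int) -> int:
    return x * 10


@toy_pipe_output(
    toy_step(_foo, key="foo"),
    toy_step(_bar, key="bar"),
)
def filtered_data(raw_data: int) -> int:
    return raw_data


foo_result = filtered_data({"key": "foo"}, 5)    # only _foo applies
bar_result = filtered_data({"key": "bar"}, 5)    # only _bar applies
skip_result = filtered_data({"key": "skip"}, 5)  # nothing applies
```

With `{"key": "skip"}` neither step matches, so `skip_result` is simply the output of `filtered_data`, which is the behavior this PR proposes instead of raising an error.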

I also amended tests to capture this.

sweep-ai[bot] commented 2 weeks ago

Hey @jernejfrank, here is an example of how you can ask me to improve this pull request:

@sweep Add a unit test specifically for the edge case where no conditions are met in the `pipe_output` decorator, testing the behavior with different config values that do not match any conditions.
