Add @asset_hook - which would be very similar to @asset_check

What's the use case?

I was trying to solve a problem and realized I could abuse @asset_check to solve it for me.

The problem I was trying to solve was:

Whenever asset foo materializes (as a parquet file in blob storage), I also want a copy of it written as a lance file in blob storage.

I thought of solving this via an IO manager, but I didn't want to create a custom IO manager for just one asset.

I thought of doing this as a software-defined asset that returns None, but I've tried that before and it feels wrong.

I thought I could wire this up as an op that takes the software-defined asset as a dependency, but ops are quite a different mental model than assets.

Ok, so... I could just write an asset check? That's not their purpose, but it'll work:


import polars as pl
from dagster import AssetCheckResult, AssetKey, asset_check


@asset_check(asset=AssetKey("foo"))
def copy_as_indexed_lance_file(foo: pl.DataFrame) -> AssetCheckResult:
    """Copy foo to blob storage as an indexed lance file."""
    # write_lance is my own method I monkeypatched into polars - don't worry about it.
    # The check passes only when write_lance returns None (i.e. no error came back).
    result_or_error = foo.sort("entityId", "datetime").write_lance(
        "az://MY-BUCKET/foo_as_lance", index_on=("entityId", "datetime")
    )
    return AssetCheckResult(passed=result_or_error is None, metadata={})

Ideas of implementation

In a way, I don't think anything NEEDS to be implemented. I'm just sharing that @asset_check can be used as a generic callback/hook that extends what happens when an asset is materialized.



dagster-io / dagster