dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

represent software-defined assets that are views #6474

Open sryza opened 2 years ago

sryza commented 2 years ago

Unlike non-view software-defined assets, software-defined assets that are views don't need to become stale when their upstream assets are rematerialized – it's assumed that they automatically incorporate upstream data changes immediately, without needing to be materialized themselves.

Product implications of first-class support for view assets

Possible API:

from dagster import asset

# Proposed: `is_view` is the hypothetical new argument.
@asset(is_view=True)
def asset1():
    ...
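A minimal sketch of the staleness semantics described above, written in plain Python rather than against Dagster's API. `AssetNode`, `data_time`, and `is_stale` are hypothetical names invented for illustration, and the integer timestamps are an assumption. The idea is that a view is never stale because of upstream changes, but it passes the upstream's data time through to whatever sits downstream of it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AssetNode:
    """Hypothetical stand-in for an asset; not a Dagster class."""
    name: str
    is_view: bool = False
    upstream: List["AssetNode"] = field(default_factory=list)
    # Logical timestamp of the last materialization (None = never).
    materialized_at: Optional[int] = None


def data_time(node: AssetNode) -> Optional[int]:
    """Return when the data visible through this asset last changed."""
    if node.materialized_at is None:
        return None
    if not node.is_view:
        return node.materialized_at
    # A view reflects upstream changes immediately, so look through it.
    known = [t for t in (data_time(u) for u in node.upstream) if t is not None]
    return max(known + [node.materialized_at])


def is_stale(node: AssetNode) -> bool:
    """Return True if the asset needs to be (re)materialized."""
    if node.materialized_at is None:
        return True  # never created
    if node.is_view:
        return False  # upstream changes alone never make a view stale
    for parent in node.upstream:
        t = data_time(parent)
        if t is not None and t > node.materialized_at:
            return True
    return False


raw = AssetNode("raw", materialized_at=1)
view = AssetNode("view", is_view=True, upstream=[raw], materialized_at=2)
report = AssetNode("report", upstream=[view], materialized_at=3)

raw.materialized_at = 4  # rematerialize the upstream table
print(is_stale(view), is_stale(report))
```

In this sketch the view stays fresh after `raw` is rematerialized, while the table downstream of the view becomes stale, because the new upstream data is visible through the view.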

What we've heard:

ThomasRolfsnes-EP commented 11 months ago

+1 to this! This would greatly simplify dbt runs in Dagster (assuming dbt views automatically get is_view=True).

stufan commented 11 months ago

+1

ssillaots-boku commented 11 months ago

Hey @sryza! Any updates if this will be a focus point in near future?

geoHeil commented 11 months ago

I would also love this feature, especially for SQL/dbt use cases, which are important for enterprises.

sryza commented 11 months ago

@ssillaots-boku alas we don't have an ETA on this one yet

slopp commented 9 months ago

We should also consider changing how freshness policies are calculated for views. The "overdue" alert should only be fired in cases where the view's DDL has changed and the asset has not been materialized, OR the upstream has not been materialized. If the upstream is up to date and the code version is also up to date, the view is "fresh" regardless of the last time it was materialized.

Plus one from another Dagster Cloud enterprise customer for this issue
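
The rule above can be sketched as a small predicate. This is only an illustration under stated assumptions: `ViewState` and `view_is_overdue` are hypothetical names, not part of Dagster's freshness policy API, and "DDL has changed" is modeled as a mismatch between two code version strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViewState:
    """Hypothetical snapshot of a view asset; not a Dagster class."""
    current_code_version: str                  # version of the DDL in the code
    materialized_code_version: Optional[str]   # version last applied (None = never)
    upstream_fresh: bool                       # are all upstream assets up to date?


def view_is_overdue(state: ViewState) -> bool:
    """Overdue only if the DDL changed without a rematerialization,
    or if the upstream itself is not up to date."""
    ddl_changed = state.materialized_code_version != state.current_code_version
    return ddl_changed or not state.upstream_fresh
```

Note that the time of the view's last materialization does not appear anywhere in the predicate, which is the point of the proposal.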

geoHeil commented 9 months ago

@slopp https://github.com/dagster-io/hooli-data-eng-pipelines/blob/master/hooli_data_eng/assets/dbt_assets.py#L114-L118 I like your approach here.

But https://github.com/dagster-io/hooli-data-eng-pipelines/blob/master/hooli_data_eng/assets/dbt_assets.py#L158 is tedious. Is there any chance that targeting views instead of tables could be more native than managing a ton of tags? Shouldn't this be part of the manifest.json somewhere?
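
On the manifest.json question: dbt records each model's materialization under `nodes.<unique_id>.config.materialized`, so view models can be picked out without tags. A minimal sketch follows; `view_model_ids` is a hypothetical helper, and `sample_manifest` is a made-up, heavily trimmed stand-in for a real manifest.

```python
def view_model_ids(manifest: dict) -> list:
    """Return the unique_ids of dbt models materialized as views."""
    return sorted(
        unique_id
        for unique_id, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model"
        and node.get("config", {}).get("materialized") == "view"
    )


# Made-up example data; a real manifest.json has many more fields per node.
sample_manifest = {
    "nodes": {
        "model.demo.stg_orders": {
            "resource_type": "model",
            "config": {"materialized": "view"},
        },
        "model.demo.orders": {
            "resource_type": "model",
            "config": {"materialized": "table"},
        },
        "seed.demo.countries": {
            "resource_type": "seed",
            "config": {"materialized": "seed"},
        },
    }
}

print(view_model_ids(sample_manifest))
```

In practice the dict would come from `json.load` on the project's `target/manifest.json`.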

mjclarke94 commented 9 months ago

There are some interesting questions on how staleness is propagated in a view which will be accessed in a partitioned manner. In some cases, it would be possible to simply propagate staleness from upstream assets, but I can also think of situations where there would be a need to separately calculate a DataVersion for each partition.

Could some of the machinery from ObservableSourceAssets be leveraged here?
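
To make the per-partition case concrete, here is a rough sketch of computing a separate version for each partition of a view. Everything in it is an assumption for illustration: `view_partition_version` and `stale_partitions` are hypothetical helpers, versions are plain strings, and each view partition is assumed to read only the same-named upstream partition.

```python
import hashlib
from typing import Dict, List


def view_partition_version(code_version: str, upstream_versions: Dict[str, str]) -> str:
    """Hash the view's code version with the upstream data versions
    that one partition of the view reads from."""
    h = hashlib.sha256()
    h.update(code_version.encode())
    for name in sorted(upstream_versions):
        # NUL separators keep adjacent fields from running together.
        h.update(b"\0" + name.encode() + b"\0" + upstream_versions[name].encode())
    return h.hexdigest()


def stale_partitions(
    code_version: str,
    upstream_by_partition: Dict[str, Dict[str, str]],
    last_seen: Dict[str, str],
) -> List[str]:
    """Return partitions whose computed version differs from the recorded one."""
    return sorted(
        partition
        for partition, upstream_versions in upstream_by_partition.items()
        if view_partition_version(code_version, upstream_versions)
        != last_seen.get(partition)
    )
```

With this shape, a change to one upstream partition marks only the matching view partition as changed, while a change to the view's code version marks every partition.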

ssillaots-boku commented 7 months ago

Hey, @sryza! Bugging you again. Is this still not in the plans?

the4thamigo-uk commented 5 months ago

I would dearly love a solution to this one also