dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Dbt assets downstream of Sling assets are skipping due to non-existent skipped dependency #23859

Status: Closed (jpentland89 closed this issue 1 week ago)

jpentland89 commented 3 weeks ago

Dagster version

dagster, version 1.8.3

What's the issue?

I have a Sling replication defined for two assets upstream of dbt assets constructed using @dbt_assets. The Sling and dbt assets are grouped into a job. When I run the job, the Sling replication successfully materializes its assets, but the downstream dbt assets are skipped because of a skipped dependency that doesn't seem to exist in the graph. This started after updating to dagster 1.8.1 and still occurs in 1.8.3.

I have the following replication configuration:

replication_config = {
    "source": "JRWPOSTGRES",
    "target": "DATAWAREHOUSE",
    "defaults": {"mode": "truncate"},
    "streams": {
        "quality.final_audit_inspections": {
            "object": "raw.jrw_postgres__final_audit_inspections",
        },
        "quality.final_audit_inspection_approvals": {
            "object": "raw.jrw_postgres__final_audit_inspection_approvals",
        },
    },
}

A simple translator that sets the asset group and auto-materialize policy:

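# Imports assumed for dagster 1.8.x, where the Sling integration lives in dagster_embedded_elt
from typing import Any, Mapping

from dagster import AutoMaterializePolicy
from dagster_embedded_elt.sling import DagsterSlingTranslator

class JRWPostgresSlingTranslator(DagsterSlingTranslator):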
    @classmethod
    def get_auto_materialize_policy(cls, stream_definition: Mapping[str, Any]) -> AutoMaterializePolicy | None:
        return AutoMaterializePolicy.eager()

    @classmethod
    def get_group_name(cls, stream_definition: Mapping[str, Any]) -> str | None:
        return "jrw_postgres_raw"

and Sling assets defined using @sling_assets:

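# Imports assumed for dagster 1.8.x, where the Sling integration lives in dagster_embedded_elt
from dagster_embedded_elt.sling import SlingResource, sling_assets

@sling_assets(replication_config=replication_config, dagster_sling_translator=JRWPostgresSlingTranslator())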
def jrw_postgres_assets(context, jrw_postgres_sling: SlingResource):
    yield from jrw_postgres_sling.replicate(
        context=context,
        replication_config=replication_config,
        dagster_sling_translator=JRWPostgresSlingTranslator(),
    )
    for row in jrw_postgres_sling.stream_raw_logs():
        context.log.info(row)

I've configured my dbt sources as follows:

sources:
  - name: jrw_postgres
    schema: raw
    tables:
      - name: final_audit_inspections
        identifier: jrw_postgres__final_audit_inspections
        meta:
          dagster:
            asset_key: [target, raw, jrw_postgres__final_audit_inspections]
      - name: final_audit_inspection_approvals
        identifier: jrw_postgres__final_audit_inspection_approvals
        meta:
          dagster:
            asset_key: [target, raw, jrw_postgres__final_audit_inspection_approvals]

When running the job, this is the output I get:

[Image: job run output]

Dagster is skipping the dbt run because jrw_postgres_assets.target__raw__jrw_postgres__final_audit_inspection_approvals was skipped, but this asset doesn't exist as far as I'm aware. It seems to be some type of issue with translating the Sling assets into dependencies that dagster can map onto the dbt assets, but I don't know enough about either integration to know what's actually going on here.
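
For readers following the naming, here is a minimal plain-Python sketch of how the asset key and the step output name in the message above appear to relate. It is an approximation inferred from the values quoted in this issue, not the library's code: the helper names `sling_asset_key_path` and `step_output_name` are hypothetical, and the `target` prefix and the `__` separator are assumptions taken from the dbt `asset_key` entries and the skipped output name shown here.

```python
def sling_asset_key_path(object_name: str, target_prefix: str = "target") -> list[str]:
    """Hypothetical helper: build a key path from a Sling target object name.

    Assumes the key is the prefix followed by the dot-separated parts of the
    object name, which matches the dbt `asset_key` values in this issue.
    """
    return [target_prefix, *object_name.split(".")]


def step_output_name(key_path: list[str]) -> str:
    """Hypothetical helper: join the key path the way the skipped output is named."""
    return "__".join(key_path)


# Values copied from the replication config and dbt sources above.
object_name = "raw.jrw_postgres__final_audit_inspection_approvals"
dbt_meta_key = ["target", "raw", "jrw_postgres__final_audit_inspection_approvals"]

key_path = sling_asset_key_path(object_name)
output_name = step_output_name(key_path)

print(key_path == dbt_meta_key)
print(output_name)
```

Under these assumptions the derived key matches the `asset_key` declared in the dbt source, and the joined name matches the output reported as skipped, which suggests the configuration lines up and the skip comes from how the step outputs are emitted rather than from a key mismatch.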

What did you expect to happen?

Expected dbt assets to materialize after the upstream replication succeeded.

How to reproduce?

No response

Deployment type

Local

Deployment details

No response

Additional information

No response

somiandras commented 2 weeks ago

Experiencing the same issue.

garethbrickman commented 2 weeks ago

@cmpadden Is this possibly root caused by https://github.com/dagster-io/dagster/pull/23837 ?

cmpadden commented 2 weeks ago

> @cmpadden Is this possibly root caused by #23837 ?

Hi @garethbrickman & @jpentland89, yes I believe those are related, and that it should be fixed in 1.8.4. For the time being, I would recommend you downgrade to 1.8.0. Apologies for the inconvenience this has caused.

somiandras commented 1 week ago

Sling integration is still broken for me after upgrading to 1.8.4...

remcinerney commented 1 week ago

@garethbrickman @cmpadden This issue is still present after upgrading to 1.8.4 for me also

jpentland89 commented 1 week ago

> @garethbrickman @cmpadden This issue is still present after upgrading to 1.8.4 for me also

+1, still broken for me as well

cmpadden commented 1 week ago

It looks like this change didn't make it into the 1.8.4 release, as only a few commits were cherry-picked in.

This change will be included in 1.8.5.

somiandras commented 1 week ago

@cmpadden Did this actually make it into 1.8.5? I don't see any mentions in the changelog.

cmpadden commented 1 week ago

> @cmpadden Did this actually make it into 1.8.5? I don't see any mentions in the changelog.

Hi @somiandras, yes it should be included. I am not sure why the message was not in the release notes, but I can see the change in the 1.8.5 tag.
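
For anyone who wants to confirm their environment is on a release at or after 1.8.5, a minimal sketch using only the standard library follows. The helpers `parse_release` and `includes_fix` are hypothetical names, and the check only looks at the core `dagster` package; the Sling integration ships as a separate package with its own version number, so treat this as a rough sanity check rather than proof that the fix is installed.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string: str) -> tuple[int, ...]:
    """Hypothetical helper: keep the leading numeric parts, e.g. '1.8.5rc1' -> (1, 8, 5)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def includes_fix(installed: str, fixed_in: str = "1.8.5") -> bool:
    """True when the installed release is at or after the release named in this thread."""
    return parse_release(installed) >= parse_release(fixed_in)


try:
    installed = version("dagster")
    print(installed, includes_fix(installed))
except PackageNotFoundError:
    print("dagster is not installed in this environment")
```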

somiandras commented 1 week ago

Awesome, thanks for confirming @cmpadden!