dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

AMP waiting on already updated parent #20177

Closed mjkanji closed 4 months ago

mjkanji commented 4 months ago

Dagster version

version 1.5.14

What's the issue?

I am orchestrating my DBT project using the following AMP:

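from typing import Any, Mapping

from dagster import AutoMaterializePolicy, AutoMaterializeRule

# CustomDagsterDbtTranslator is defined elsewhere in my project (not shown).
class DailyDagsterDbtTranslator(CustomDagsterDbtTranslator):
    def get_auto_materialize_policy(
        self, dbt_resource_props: Mapping[str, Any]
    ) -> AutoMaterializePolicy:
        return AutoMaterializePolicy.lazy().with_rules(
            AutoMaterializeRule.skip_on_not_all_parents_updated(),
            AutoMaterializeRule.materialize_on_cron(cron_schedule="0 1 * * *"),
        )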

Most of my DBT DAG is behaving correctly, but I noticed that one branch has not orchestrated correctly since February 28. See the following:

[Screenshot: DAG view]

chargify_customer_summary is an external asset made using external_assets_from_specs and is materialized daily using a job that does context.log_event(AssetMaterialization(asset_key=...)).

This job is correctly logging materialization events for chargify_customer_summary. However, its children are not materializing and the Automation page shows they are waiting for chargify_customer_summary to be updated.

See the Automation page for one of its children (customer_sub_history):

[Screenshot: customer_sub_history_automation]

Note the times. chargify_customer_summary was materialized at 5 AM (see the DAG screenshot above) but at 6 AM, customer_sub_history says it's still waiting for the parent to be updated.
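
To make the expectation concrete, here is a toy model of the check described above. It is not Dagster's implementation: the function name `parents_updated_since` and the timestamps are hypothetical, and it assumes only that a child is unblocked once every parent has a materialization newer than the child's own last materialization.

```python
from datetime import datetime
from typing import Mapping, Optional


def parents_updated_since(
    child_last_materialized: Optional[datetime],
    parent_last_materialized: Mapping[str, Optional[datetime]],
) -> bool:
    """Return True when every parent has materialized after the child last did.

    Toy model only: a parent with no recorded materialization counts as not
    updated, and a child that has never materialized only needs each parent
    to have materialized at least once.
    """
    for parent_time in parent_last_materialized.values():
        if parent_time is None:
            return False
        if (
            child_last_materialized is not None
            and parent_time <= child_last_materialized
        ):
            return False
    return True


# Arbitrary, hypothetical timestamps mirroring the report: the child last
# materialized on day 1, and the parent was materialized at 5 AM on day 2.
child_last = datetime(2024, 3, 1, 6, 0)
parents = {"chargify_customer_summary": datetime(2024, 3, 2, 5, 0)}

# Under this model, an evaluation at 6 AM on day 2 should not be skipped.
ready = parents_updated_since(child_last, parents)
```

Under this simplified model, the child should be eligible at the 6 AM evaluation because the parent's 5 AM materialization is newer than the child's last run, which is why the "waiting on parent" status looks wrong.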

Looking at the events for customer_sub_history shows successful materializations every day until February 28th:

[Screenshot: customer_sub_history_events]

Based on those dates, I'm wondering if this is somehow a leap-year bug?

What did you expect to happen?

That customer_sub_history would successfully materialize daily.

How to reproduce?

No response

Deployment type

Dagster Cloud

Deployment details

Dagster Cloud Serverless.

Additional information

No response

cbini commented 4 months ago

Just noticed this myself. Looks like the last time it worked was around 5 PM ET on 2/28.

OwenKephart commented 4 months ago

Hi @mjkanji thanks for the report. We've identified the issue here and a fix has been rolled out to Dagster Cloud.

For OSS users, the same fix will be available in next week's 1.6.9 release; reverting to 1.6.6 also avoids the issue.

mjkanji commented 4 months ago

Hi @OwenKephart, could you please clarify how the fix takes effect? I'm on Dagster Cloud Serverless and even after running my DBT DAG manually earlier in the day (to refresh the 'state' of all the assets), the assets downstream of my external assets are still not triggering at their AMP cron tick. The Automation page continues to say they're waiting for their (external asset) parents, which have already been updated.

Is the fix applied automatically for Cloud users or do I need to upgrade to the 1.6.9 release? I am on version 1.5.14 for your reference.

OwenKephart commented 4 months ago

Hi @mjkanji apologies for the confusion here -- there was a secondary fix that needed to happen on the cloud end to fully solve this issue, which has now been rolled out (as of this morning). No action is required on your end if you are a cloud user.

mjkanji commented 4 months ago

Awesome, thanks! I was also able to confirm that my DAGs are running as expected.