dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.64k stars, 1.47k forks

'fetch_last_updated_timestamps' could return a result, even if some tables were not found #24245

QuintenBruynseraede closed 1 week ago

QuintenBruynseraede commented 2 months ago

What's the use case?

We are using dagster_snowflake.fetch_last_updated_timestamps to generate observations, which in turn start jobs that depend on the Snowflake tables. If a table is missing, the function simply raises a ValueError. It would be useful if it could instead return a "partial" result containing every table for which a last-updated timestamp was found. In our case, that means observations would still be generated for the tables that do exist.

This would be useful for test environments where only a subset of tables exists.

Ideas of implementation

Add an argument return_partial_result: Optional[bool] = False, which causes the function to return a Mapping[str, datetime] containing only the tables found in Snowflake. Since the function already loops over the table names, it's simply a matter of skipping missing tables instead of raising an exception.
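To make the idea concrete, here is a minimal sketch of the skip-instead-of-raise loop. It is not the actual dagster_snowflake code: collect_last_updated is a hypothetical helper, and the found mapping is an assumption standing in for whatever rows the Snowflake query returns. Only the return_partial_result flag and the ValueError come from the proposal above.

```python
from datetime import datetime, timezone
from typing import Dict, Mapping, Sequence


def collect_last_updated(
    requested_tables: Sequence[str],
    found: Mapping[str, datetime],
    return_partial_result: bool = False,
) -> Mapping[str, datetime]:
    """Hypothetical helper; `found` stands in for the rows Snowflake returned."""
    result: Dict[str, datetime] = {}
    for table in requested_tables:
        if table not in found:
            if return_partial_result:
                # Proposed behavior: skip the missing table and keep going.
                continue
            # Behavior described in this issue: fail on the first missing table.
            raise ValueError(f"Table {table} could not be found.")
        result[table] = found[table]
    return result


# Illustrative data: only ORDERS exists in this (made-up) test environment.
found = {"ORDERS": datetime(2024, 9, 1, tzinfo=timezone.utc)}
partial = collect_last_updated(
    ["ORDERS", "CUSTOMERS"], found, return_partial_result=True
)
```

With the flag left at its default of False, the same call raises a ValueError for CUSTOMERS, so existing callers would see no change in behavior.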

Additional information

If you agree on the solution I would be happy to submit a PR for it.

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

garethbrickman commented 2 months ago

Hey @QuintenBruynseraede, please feel free to get your idea started in a PR, and we can have our developers chime in. Check out the Contributing guide!

QuintenBruynseraede commented 1 week ago

Hey @garethbrickman, I opened https://github.com/dagster-io/dagster/pull/25488 in an attempt to address this issue. Let me know what you think.

garethbrickman commented 1 week ago

Thanks @QuintenBruynseraede, it's in our queue for review.