Open johannkm opened 1 year ago
See https://github.com/dagster-io/dagster/discussions/17194 for an example of how to write a check on a partitioned asset currently.
Is there a plan on the roadmap to support this, or is it likely not to be supported in the near future?
This is planned for the next few months
Hey @johannkm do you have an update on partitioned asset checks? We are trying to work around this with https://github.com/dagster-io/dagster/discussions/17194#discussioncomment-7275029 but unfortunately it triggers checks for every partition when a single partition is materialized.
@abhischekt we've made some progress with UI designs etc. but we aren't actively working on it. I think it's likely to ship in a few months but not sooner. A hack you could experiment with: you can access context.partition_key from inside an @asset_check if it's a partitioned run. No guarantees with this but you could use it to only check the partition you're materializing in the same run.
+1 to the voices requesting this. All of our assets are partitioned. Some guidance on how to build out the checks would be a good idea too. The ideal model at the moment seems to be generating check metadata within the asset, so that it's partitioned, and then having the checks do something very simple over that metadata. That is not how we planned to use them: the plan was for a QA person to have their own place for check code, outside the asset logic.
+1. Most of our assets are partitioned, and the current solutions are far from ideal.
Either the check is triggered inefficiently when backfilling multiple partitions, or the check results get overwritten by the last partition executed.
Bump. Most of our assets are partitioned as well, in particular ML models scored at a regular frequency. Until this is fixed, we will be defining custom downstream assets that are expected to crash if quality checks are not met for a given partition, blocking further downstream flow. But it would be great if such checks could be defined as part of the assets being checked, as post-conditions for successful materialization ;).
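For readers considering the same interim approach, here is a minimal sketch of the gating logic such a downstream asset might run, written in plain Python without any Dagster decorators. The names `gate_partition` and `PartitionQualityError` are hypothetical and not part of Dagster; the assumption is only that the upstream data is available as a mapping from partition key to value.

```python
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


class PartitionQualityError(Exception):
    """Raised when a partition fails its quality check (hypothetical name)."""


def gate_partition(
    partition_key: str,
    partition_data: Mapping[str, T],
    check: Callable[[T], bool],
) -> T:
    """Return the partition's data if it passes `check`, otherwise raise.

    Raising inside the downstream asset fails that step, which is what
    blocks anything further downstream for this partition.
    """
    data = partition_data[partition_key]
    if not check(data):
        raise PartitionQualityError(
            f"quality check failed for partition {partition_key!r}"
        )
    return data
```

A downstream asset body could call something like `gate_partition(key, upstream, lambda rows: len(rows) > 0)` and pass the returned data along unchanged.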
Digging in the AssetCheckExecutionContext I've found that the partition key is available. So one can execute the checks only for the relevant partition:
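from dagster import AssetCheckExecutionContext, AssetCheckResult, asset_check

# my_asset, MyDataType and some_function are placeholders for your own definitions.
@asset_check(asset=my_asset)
def my_check(context: AssetCheckExecutionContext, partition_data: dict[str, MyDataType]) -> AssetCheckResult:
    # Partition key of the current run, read from the run tags
    partition_key = context.run.tags["dagster/partition"]
    # Only look at the partition materialized in this run
    my_data = partition_data[partition_key]
    return AssetCheckResult(
        passed=some_function(my_data)
    )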
Obviously this is a temporary solution and Dagster should absolutely have support for checks on partitioned assets!
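One caveat with the snippet above: indexing `context.run.tags["dagster/partition"]` raises `KeyError` when the run carries no such tag, for example in a non-partitioned run. A minimal sketch of a more defensive lookup follows, in plain Python so it can be tried without Dagster installed. `partition_key_from_tags` and `select_partition` are hypothetical helper names; the only thing taken from the thread is the `dagster/partition` tag.

```python
from typing import Mapping, Optional, TypeVar

T = TypeVar("T")

PARTITION_TAG = "dagster/partition"  # tag name used in the snippet above


def partition_key_from_tags(tags: Mapping[str, str]) -> Optional[str]:
    """Return the run's partition key, or None if the tag is absent."""
    return tags.get(PARTITION_TAG)


def select_partition(
    tags: Mapping[str, str], partition_data: Mapping[str, T]
) -> Optional[T]:
    """Return the data for this run's partition, or None if there is nothing to check."""
    key = partition_key_from_tags(tags)
    if key is None:
        return None
    # .get() also covers a key that is tagged on the run but missing from the data
    return partition_data.get(key)
```

Inside the check this would be called as `select_partition(context.run.tags, partition_data)`; whether a `None` result should pass, fail, or skip the check is a decision left to the caller.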
Bump! This feature would be a game changer for us.
Really looking forward to this feature!
Enable checks per-partition, instead of just per-asset