Closed the4thamigo-uk closed 5 months ago
Hi @the4thamigo-uk, AssetGraph (which as you note is a private API) is under very heavy surgery right now, so it's not surprising you're seeing problems. There is a fresh batch of changes that will roll out next week that includes performing "normalization" on the assets defs passed in to the AssetGraph, which will create stub defs for any referenced asset keys otherwise lacking a def. I'm guessing that is going to fix your problem here, so my advice is to try upgrading to next week's release.
Let's leave the issue open until then and revisit if that doesn't work.
there doesn't seem to be a publicly-exposed way to request the stale status of a given asset/partition.
That's a good point, the closest thing we have is the GQL API. I'm going to put adding a Python API for this in our internal queue.
Thanks @smackesey ... sounds good...
Looks like the 1.6.9 release might have fixed this, was that expected?
Looks like the 1.6.9 release might have fixed this, was that expected?
It wouldn't be surprising, since AssetGraph is under heavy surgery. Since it's a private API I can't recall exactly which changes are falling into which release.
@smackesey I have also noticed that if I have a partitioned asset that depends on an external partitioned asset, the staleness of the asset doesn't seem to reflect materialization events on the external asset. I don't get the new data indication in the UI, and CachingStaleStatusResolver seems to return the asset as FRESH. How is this meant to work?
Similar problem if I raise asset observation events on the external asset.
@smackesey btw, it seems we can close this. On another note, do you have any idea when the stale status API will become available? I also created an issue about this a while ago: https://github.com/dagster-io/dagster/issues/19368
Dagster version
1.6.7
What's the issue?
I'm facing an issue with CachingStaleStatusResolver in v1.6.7. This seems to occur when the asset graph contains a @dbt_asset that is based on a dbt project that has one or more dbt 'source's defined. I wasn't seeing this issue in v1.6.6, and I notice there are quite a few changes in this area (https://github.com/dagster-io/dagster/pull/19901).
I am using CachingStaleStatusResolver to work out which partitions are stale for a job, in order to generate runs in a schedule, and this was working fine up to and including v1.6.6. I am calling CachingStaleStatusResolver.get_status() on the asset keys of the job's assets to determine their status. To use CachingStaleStatusResolver, I am creating an AssetGraph from my project's AssetDefinitions. The same behaviour occurs if I use the asset graph accessed through the repository object of the Definitions instance, so it seems not to be related to the way that I construct the asset graph.
However, in v1.6.7, when I include a @dbt_asset in the asset graph and request the status of the dagster asset that represents a 'model' that is dependent on the dbt 'source', _get_stale_causes_materialized() checks the status of the parent asset keys. Included in these keys is the key for the dbt 'source', but this does not seem to be represented in the AssetGraph at all, so I am seeing a KeyError in AssetGraph.is_external(). It seems only full SourceAssets are added to the AssetGraph when I use AssetGraph.from_assets, and it looks like dbt 'source' nodes are not represented this way in dagster.
I realise that this is an internal class, but there doesn't seem to be a publicly-exposed way to request the stale status of a given asset/partition. However, I would have thought that creating an AssetGraph with a dbt_asset, and subsequently calling get_status() on a 'model' asset, should not crash with an exception?
More detail: Interestingly, in the v1.6.6 code I also notice that there is no asset in the graph that represents the dbt 'source' either. The only reason AssetGraph.is_source() doesn't fail in v1.6.6 is that the function returns True if the asset_key is not in the list of materializable keys, so is_source("ahdsfjlkad") also returns True.
What did you expect to happen?
I don't expect it to crash.
How to reproduce?
I can provide an example project if necessary.
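In the meantime, here is a minimal toy sketch of the behaviour described above, in plain Python with no Dagster imports. ToyAssetGraph and all of its method names are hypothetical and only model what this issue describes: the permissive v1.6.6 is_source() check, the direct lookup that raises KeyError in v1.6.7, and the stub-def "normalization" mentioned in the comments. It is not Dagster's actual implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyAssetGraph:
    """Hypothetical stand-in for an asset graph; not Dagster's AssetGraph."""

    # Keys that have a definition, mapped to whether that definition is external.
    defs: dict[str, bool] = field(default_factory=dict)
    # Parent keys of each defined asset; a parent may have no definition at all.
    parents: dict[str, set[str]] = field(default_factory=dict)

    def materializable_keys(self) -> set[str]:
        return {key for key, is_external in self.defs.items() if not is_external}

    def is_source_permissive(self, key: str) -> bool:
        # Models the v1.6.6 behaviour described in the issue: any key that is
        # not materializable counts as a source, including unknown keys.
        return key not in self.materializable_keys()

    def is_external_strict(self, key: str) -> bool:
        # Models the v1.6.7 behaviour described in the issue: a direct lookup,
        # so a key without a definition raises KeyError.
        return self.defs[key]

    def normalized(self) -> ToyAssetGraph:
        # Models the "normalization" idea: add an external stub definition for
        # every referenced parent key that otherwise lacks a definition.
        defs = dict(self.defs)
        for parent_keys in self.parents.values():
            for parent in parent_keys:
                defs.setdefault(parent, True)
        return ToyAssetGraph(defs=defs, parents=dict(self.parents))


# A dbt 'model' asset whose parent is a dbt 'source' with no definition.
graph = ToyAssetGraph(
    defs={"my_model": False},
    parents={"my_model": {"my_dbt_source"}},
)

print(graph.is_source_permissive("my_dbt_source"))  # permissive check succeeds
print(graph.is_source_permissive("ahdsfjlkad"))     # ...even for a nonsense key

try:
    graph.is_external_strict("my_dbt_source")
except KeyError as exc:
    print(f"KeyError: {exc}")                       # strict lookup fails

print(graph.normalized().is_external_strict("my_dbt_source"))  # stub added
```

In this toy model the permissive check hides the missing 'source' node, the strict lookup exposes it, and adding stubs for referenced-but-undefined keys makes the strict lookup safe again.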
Deployment type
None
Deployment details
No response
Additional information
No response