dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.35k stars 1.44k forks source link

`load_assets_from_modules` ignores `AssetSpec` assets #24148

Open maxfirman opened 1 month ago

maxfirman commented 1 month ago

Dagster version

1.8.4

What's the issue?

The load_assets_from_modules function does not load assets defined using AssetSpec.

What did you expect to happen?

load_assets_from_modules should load assets defined using AssetSpec.

The fact this doesn't work is a bit of a foot gun because Dagster will automatically create stub assets for upstream asset dependencies. It is therefore easy to assume that an AssetSpec asset has been created, when in fact it hasn't. Several people in our organisation have tripped up on this.

How to reproduce?

# assets.py

from dagster import AssetSpec

my_external_asset = AssetSpec(key="my_external_asset")
# definitions.py

from dagster import Definitions, load_assets_from_modules

from . import assets

defs = Definitions(
    assets=load_assets_from_modules([assets]),
)

image

Deployment type

Dagster Helm chart

Deployment details

No response

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a šŸ‘! We factor engagement into prioritization.

tobiplay commented 1 month ago

The AssetSpec class is defined as "the core attributes of an asset, except for the function that materializes or observes it". I got them to work, i.e., show up in my lineage, by importing the external_assets_from_specs method from Dagster and passing a list with my AssetSpecs to it. Here, the actual AssetsDefinition is created from the specs.

This feature overall seems to be highly experimental still, as the function is tagged to have a deprecated status by version 1.9.0. So I don't think it's a bug actually.

maxfirman commented 1 month ago

Thanks @tobiplay. I can get assets defined using raw AssetSpec to show up without external_assets_from_specs by explicitly declaring them in the Definitions e.g:

from dagster import Definitions

from . import assets

defs = Definitions(
    assets=[assets.my_external_asset]
)

This aligns with the AssetSpec examples in the docs.

I therefore don't know if this issue would be considered a bug or not, but it is a bit counter intuitive that load_assets_from_modules doesn't work with AssetSpec.

tobiplay commented 1 month ago

Ha, yeah, that seems counterintuitive indeed; though I don't know where the AssetSpec class is heading as a whole.

I feel like load_assets_from_modules should catch the AssetSpecs natively then (if we can pass them directly to Definitions, which seems to handle the evaluation under the hood). An alternative could also be a classmethod along the lines of build_asset_from_spec to directly build an asset from the spec (which would then work again with load_assets_from_modules and accommodate for other use cases of the class).

ShootingStarD commented 1 month ago

The AssetSpec class is defined as "the core attributes of an asset, except for the function that materializes or observes it". I got them to work, i.e., show up in my lineage, by importing the external_assets_from_specs method from Dagster and passing a list with my AssetSpecs to it. Here, the actual AssetsDefinition is created from the specs.

This feature overall seems to be highly experimental still, as the function is tagged to have a deprecated status by version 1.9.0. So I don't think it's a bug actually.

Seems strange it is to be deprecated when the 1.8 release notes said to move from SourceAsset to AssetSpec

tobiplay commented 1 month ago

The AssetSpec class is defined as "the core attributes of an asset, except for the function that materializes or observes it". I got them to work, i.e., show up in my lineage, by importing the external_assets_from_specs method from Dagster and passing a list with my AssetSpecs to it. Here, the actual AssetsDefinition is created from the specs.

This feature overall seems to be highly experimental still, as the function is tagged to have a deprecated status by version 1.9.0. So I don't think it's a bug actually.

Seems strange it is to be deprecated when the 1.8 release notes said to move from SourceAsset to AssetSpec

I guess the plan is to only pass the AssetSpecs directly to the Definitions object in the future, as mentioned in the method's deprecation tag @deprecated(breaking_version="1.9.0", additional_warn_text="Directly use the AssetSpecs instead.").

ShootingStarD commented 1 week ago

The AssetSpec class is defined as "the core attributes of an asset, except for the function that materializes or observes it". I got them to work, i.e., show up in my lineage, by importing the external_assets_from_specs method from Dagster and passing a list with my AssetSpecs to it. Here, the actual AssetsDefinition is created from the specs.

This feature overall seems to be highly experimental still, as the function is tagged to have a deprecated status by version 1.9.0. So I don't think it's a bug actually.

Seems strange it is to be deprecated when the 1.8 release notes said to move from SourceAsset to AssetSpec

I guess the plan is to only pass the SourceAssets directly to the Definitions object in the future, as mentioned in the method's deprecation tag @deprecated(breaking_version="1.9.0", additional_warn_text="Directly use the AssetSpecs instead.").

The message explicitly says to use AssetSpecs and not SourceAsset so why would we pass SourceAsset to the Definitions?

tobiplay commented 1 week ago

The AssetSpec class is defined as "the core attributes of an asset, except for the function that materializes or observes it". I got them to work, i.e., show up in my lineage, by importing the external_assets_from_specs method from Dagster and passing a list with my AssetSpecs to it. Here, the actual AssetsDefinition is created from the specs.

This feature overall seems to be highly experimental still, as the function is tagged to have a deprecated status by version 1.9.0. So I don't think it's a bug actually.

Seems strange it is to be deprecated when the 1.8 release notes said to move from SourceAsset to AssetSpec

I guess the plan is to only pass the SourceAssets directly to the Definitions object in the future, as mentioned in the method's deprecation tag @deprecated(breaking_version="1.9.0", additional_warn_text="Directly use the AssetSpecs instead.").

The message explicitly says to use AssetSpecs and not SourceAsset so why would we pass SourceAsset to the Definitions?

You're right, confused the two in my reply, fixed it.