What's the use case?
We have several DTOs returned from various APIs, which need to be passed around our pipelines. However, Dagster frequently has issues with these objects, forcing us to convert them to JSON and pass them as strings. These objects inherit from pydantic.BaseModel so they should be supported.
As an aside, I would also like to see support for the pydantic_extra_types library as most packages which support/use Pydantic also support this library and we make heavy use of it.
Ideas of implementation
I suspect this would require a change to Config.__init__ to check if the item is an instance of pydantic.BaseModel. There would likely have to be some changes made to the OpDefinition and AssetDefinition to flag these objects as safe as well. However, since they are inherently compatible with JSON, they should "just work".
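To make the idea concrete, here is a rough sketch of the kind of check described above. It is not Dagster's actual code: the helper name pydantic_model_fields and the example classes are hypothetical, and it assumes Pydantic v2 (where model_fields exposes each field's annotation). It only detects fields whose annotation is a plain pydantic.BaseModel subclass; where such a check would live inside Dagster is left open.

```python
from typing import Dict, Type

from pydantic import BaseModel


def pydantic_model_fields(config_cls: Type[BaseModel]) -> Dict[str, Type[BaseModel]]:
    """Return the fields of config_cls whose annotation is a BaseModel subclass.

    Hypothetical helper for illustration only; not part of Dagster or Pydantic.
    """
    found: Dict[str, Type[BaseModel]] = {}
    for name, field in config_cls.model_fields.items():
        annotation = field.annotation
        try:
            if isinstance(annotation, type) and issubclass(annotation, BaseModel):
                found[name] = annotation
        except TypeError:
            # Generic aliases such as list[int] are not real classes; skip them.
            continue
    return found


class ExampleAward(BaseModel):
    """Hypothetical stand-in for a DTO such as Award."""

    offer_id: str


class ExampleConfig(BaseModel):
    """Hypothetical config shape with one nested Pydantic model."""

    hash: str
    award: ExampleAward


nested = pydantic_model_fields(ExampleConfig)
```

Here nested maps the field name award to ExampleAward, which is the information a config resolver would need in order to treat that field as a nested, JSON-compatible structure.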
Additional information
Reproduction
This involves importing an Award object from a publicly available repository we maintain. This object inherits from pydantic.BaseModel. We've tried making this code work with both dagster.Config and dagster.PermissiveConfig:
from dagster import AssetExecutionContext, PermissiveConfig, asset, build_asset_context
from mms_client.types.award import Award
from pydantic_extra_types.pendulum_dt import DateTime as PendulumDateTime


class ConvertAwardContext(PermissiveConfig):  # type: ignore[misc]
    hash: str
    award: Award
    start_time: PendulumDateTime
    end_time: PendulumDateTime


@asset
def do_test(context: AssetExecutionContext, config: ConvertAwardContext) -> None:
    context.log.info(f"Test award {config.award.offer_id}")


def test_converted_deal_works():
    # First, create our test data (create_test_config is a helper not shown here)
    config = create_test_config()

    # Next, attempt to materialize the asset
    do_test(build_asset_context(), config)
Running this code results in the following error during testing:
dagster._core.errors.DagsterInvalidPythonicConfigDefinitionError:
Error defining Dagster config class <class 'ConvertAwardContext'> on field 'award'.
Unable to resolve config type <class 'mms_client.types.award.Award'> to a supported Dagster config type.
This config type can be a:
    - Python primitive type
        - int, float, bool, str, list
    - A Python Dict or List type containing other valid types
    - Custom data classes extending dagster.Config
    - A Pydantic discriminated union type (https://docs.pydantic.dev/usage/types/#discriminated-unions-aka-tagged-unions)
This error makes it clear that Pydantic objects are not supported. However, converting these to bytes or str is not a great workflow and many of our data pipelines will depend on working with objects like this.
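For reference, the string-based workaround we are currently forced into looks roughly like the sketch below. It assumes Pydantic v2 (model_dump_json and model_validate_json). AwardDTO is a hypothetical stand-in for Award with made-up fields, and ConvertAwardPayload, pack, and unpack are illustrative names rather than anything from Dagster or mms_client.

```python
from pydantic import BaseModel


class AwardDTO(BaseModel):
    """Hypothetical stand-in for mms_client's Award; the fields are assumptions."""

    offer_id: str
    quantity: int


class ConvertAwardPayload(BaseModel):
    """Config shape that only uses primitives: the award travels as JSON text."""

    hash: str
    award_json: str


def pack(hash_value: str, award: AwardDTO) -> ConvertAwardPayload:
    # Serialize the nested model to a JSON string before handing it over.
    return ConvertAwardPayload(hash=hash_value, award_json=award.model_dump_json())


def unpack(payload: ConvertAwardPayload) -> AwardDTO:
    # Re-validate the JSON string back into the typed model on the other side.
    return AwardDTO.model_validate_json(payload.award_json)


award = AwardDTO(offer_id="A-1", quantity=5)
payload = pack("abc123", award)
restored = unpack(payload)
```

Every op or asset that consumes the payload has to repeat the unpack step, and the config schema no longer says anything about the shape of the award, which is why native support would be preferable.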
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.