dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.14k stars 1.4k forks source link

DagsterSlingTranslator does not recognise target_prefix argument #20485

Closed temminks closed 5 months ago

temminks commented 6 months ago

Dagster version

1.6.9

What's the issue?

I copied the replication.yaml from the embedded-elt docs and tried to customise the target asset key's prefix, using the target_prefix argument in DagsterSlingTranslator(). The parameter defaults to target.

@sling_assets(replication_config=replication_config)
def gfdi_to_transformator_assets(context, sling: SlingResource):
    yield from sling.replicate(
        replication_config=replication_config,
        dagster_sling_translator=DagsterSlingTranslator(target_prefix="my_prefix"),
    )

I'd expect the asset's key to be prefixed with my_prefix, but the prefix stays target. This is unexpected and different from what the documentation describes.

Bildschirmfoto 2024-03-14 um 18 18 33

What did you expect to happen?

The prefix of the asset should be set to the value provided in target_prefix of the DagsterSlingTranslator.

How to reproduce?

Follow the documentation for Dagster Embedded ELT (https://docs.dagster.io/integrations/embedded-elt). and try setting the target_prefix parameter.

Deployment type

Local

Deployment details

No response

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a šŸ‘! We factor engagement into prioritization.

cmpadden commented 5 months ago

EDIT: disregard this message, and see the comment below!


Hi @temminks - thanks again for reporting this.

If you override the class without the @dataclass decorator, then the overridden target_prefix attribute will not be set, however, the use you showed with DagsterSlingTranslator(target_prefix='custom') seems like it should work -- I will investigate further.

from dataclasses import dataclass

@dataclass
class ExampleTranslator:
    target_prefix: str = "target"

t1 = ExampleTranslator()
t1.target_prefix
# 'target'

@dataclass  # <-- with dataclass decorator
class Custom(ExampleTranslator):
    target_prefix: str = "custom"

t2 = Custom()
t2.target_prefix
# 'custom' <-- šŸ‘ 

class CustomNotDataclass(DagsterSlingTranslator):
    target_prefix: str = "custom"

t3 = CustomNotDataclass()

t3.target_prefix
# 'target' <-- šŸ‘Ž 

t4 = DagsterSlingTranslator(target_prefix='custom')
t4.target_prefix
# 'custom'  <-- šŸ‘ 
cmpadden commented 5 months ago

@temminks, Ok, I think I've found the issue with what you've shared above. Both @sling_assets and replicate() take a DagsterSlingTranslator implementation. You've only added it to the replicate call. If you change this to the following it should work as expected.

# before
@sling_assets(replication_config=replication_config)
def gfdi_to_transformator_assets(context, sling: SlingResource):
    yield from sling.replicate(
        replication_config=replication_config,
        dagster_sling_translator=DagsterSlingTranslator(target_prefix="my_prefix"),
    )

# after
custom_dagster_sling_translator = DagsterSlingTranslator(target_prefix="my_prefix")
@sling_assets(replication_config=replication_config, dagster_sling_translator=custom_dagster_sling_translator)
def gfdi_to_transformator_assets(context, sling: SlingResource):
    yield from sling.replicate(
        replication_config=replication_config,
        dagster_sling_translator=custom_dagster_sling_translator,
    )

I believe that we can remove the dependency to pass the translator in two locations; I will create a ticket to track such an enhancement.

cmpadden commented 5 months ago

Just to follow up here, a code change was pushed that makes it so you only have to specify the translator and replication config in the asset decorator. This is to avoid the confusion of the above scenario. You can see the details here:

https://github.com/dagster-io/dagster/pull/20564

This change also makes it so you can use the Sling integration from an @op directly if desired.