flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0

[BUG] Accessing attributes fails on complex types #5427

Closed architrathore closed 2 weeks ago

architrathore commented 4 months ago

Describe the bug

According to the Accessing Attributes documentation, attribute access on promises is supported in workflows. This works for simple types such as str or int, but fails during workflow compilation for complex types, for example when accessing an inner dataclass of a nested dataclass.

Expected behavior

Accessing attributes should work for all types, not just primitive ones.

Additional context to reproduce

from __future__ import annotations

from dataclasses import dataclass

from flytekit import task, workflow
from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class Fruit(DataClassJSONMixin):
    name: str

@dataclass
class NestedFruit(DataClassJSONMixin):
    sub_fruit: Fruit
    name: str

@task
def dataclass_task() -> Fruit:
    return Fruit(name="banana")

@task
def dataclass_task_nested() -> NestedFruit:
    return NestedFruit(sub_fruit=Fruit(name="banana"), name="nested_name")

@task
def print_message(message: str):
    print(message)
    return

@task
def print_message_fruit(fruit_instance: Fruit):
    print(fruit_instance.name)
    print(fruit_instance)
    return

@task
def print_message_nested(nested_fruit: NestedFruit):
    print(nested_fruit.sub_fruit.name)
    print(nested_fruit.name)
    return

@workflow
def dataclass_wf():
    fruit_instance = dataclass_task()
    nested_fruit_instance = dataclass_task_nested()

    # ✅ accessing attribute of type=str 
    print_message(message=fruit_instance.name)
    # ✅ non attribute access 
    print_message_nested(nested_fruit=nested_fruit_instance)
    # ✅ accessing attribute of type=str on a nested dataclass
    print_message(message=nested_fruit_instance.sub_fruit.name)
    # ❌ accessing attribute of type=dataclass
    print_message_fruit(fruit_instance=nested_fruit_instance.sub_fruit)

When registering the workflow, this fails with the following error:

Error 0: Code: MismatchingTypes, Node Id: n4, Description: Variable [o0] (type [simple:STRING]) doesn't match expected type [simple:STRUCT  metadata:{fields:{key:"additionalProperties"  value:{bool_value:false}}  fields:{key:"properties"  value:{struct_value:{fields:{key:"name"  value:{struct_value:{fields:{key:"type"  value:{string_value:"string"}}}}}}}}  fields:{key:"required"  value:{list_value:{values:{string_value:"name"}}}}  fields:{key:"title"  value:{string_value:"Fruit"}}  fields:{key:"type"  value:{string_value:"object"}}}  structure:{dataclass_type:{key:"name"  value:{simple:STRING}}}].
Error 1: Code: ParameterNotBound, Node Id: n4, Description: Parameter not bound [fruit_instance]
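
One possible workaround, not confirmed anywhere in this thread, is to move the attribute access into its own task, so that the workflow only passes whole dataclasses between nodes (the reproduction above shows both a task returning a dataclass and a task receiving a nested dataclass working). The sketch below uses plain Python functions so that it runs without a Flyte installation. The names `extract_sub_fruit`, `describe_fruit`, and `dataclass_wf_workaround` are hypothetical; in a real project the first two would presumably be decorated with `@task` and the last with `@workflow`, and the dataclasses would use `DataClassJSONMixin` as in the reproduction.

```python
from dataclasses import dataclass


@dataclass
class Fruit:
    name: str


@dataclass
class NestedFruit:
    sub_fruit: Fruit
    name: str


# Hypothetical helper (assumption): in a Flyte project this would be a @task.
# The attribute access happens inside the function body at run time, so the
# workflow itself never resolves `.sub_fruit` on a promise.
def extract_sub_fruit(nested_fruit: NestedFruit) -> Fruit:
    return nested_fruit.sub_fruit


# Hypothetical stand-in for print_message_fruit; it returns a string instead
# of printing so the result is easy to check.
def describe_fruit(fruit_instance: Fruit) -> str:
    return f"fruit: {fruit_instance.name}"


# Hypothetical stand-in for the workflow: only whole dataclasses are passed
# between the steps.
def dataclass_wf_workaround() -> str:
    nested = NestedFruit(sub_fruit=Fruit(name="banana"), name="nested_name")
    sub_fruit = extract_sub_fruit(nested_fruit=nested)
    return describe_fruit(fruit_instance=sub_fruit)
```

The cost of this approach is one extra node per extracted attribute. It sidesteps the compilation error rather than fixing it, and it has not been verified against a Flyte cluster.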

runllm[bot] commented 4 months ago

The issue you're encountering is related to Flyte's handling of complex types, particularly nested dataclasses. According to this GitHub issue, nested dataclasses are not fully supported in Flyte's type system, which can lead to type mismatches during workflow compilation.

Would you like more detail on potential workarounds or further explanation of the issue?

This is an AI-generated response.

kumare3 commented 4 months ago

Cc @ByronHsu

kumare3 commented 2 weeks ago

Covered by https://github.com/flyteorg/flyte/issues/5318

Future-Outlier commented 2 weeks ago

I think this is a bug in propeller and is not related to the JSON IDL. I'm investigating.