flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.17k stars 550 forks source link

[BUG] Union types fail for e.g. two different dataclasses #5489

Open fg91 opened 1 week ago

fg91 commented 1 week ago

Describe the bug

from typing import Union
from dataclasses import dataclass
from flytekit import task, workflow

@dataclass
class A:
    a: int

@dataclass
class B:
    b: int

@task
def foo(inp: Union[A, B]):
    ...

@workflow
def wf():
    foo(inp=B(b=1))

if __name__ == "__main__":
    wf()

The workflow works when executing locally with python wf.py but fails to register with flyteadmin:

flytekit.exceptions.user.FlyteInvalidInputException: USER:BadInputToAPI: error=None, cause=<_InactiveRpcError of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "failed to compile workflow for [resource_type:WORKFLOW project:”…” domain:"development" name:”…f” version:”…”] with err failed to compile workflow with err Collected Errors: 1
        Error 0: Code: MismatchingTypes, Node Id: n0, Description: Variable [inp] (type [simple:STRUCT]) doesn't match expected type [union_type:{variants:{simple:STRUCT metadata:{fields:{key:"additionalProperties" value:{bool_value:false}} fields:{key:"properties" value:{struct_value:{fields:{key:"a" value:{struct_value:{fields:{key:"type" value:{string_value:"integer"}}}}}}}} fields:{key:"required" value:{list_value:{values:{string_value:"a"}}}} fields:{key:"title" value:{string_value:"A"}} fields:{key:"type" value:{string_value:"object"}}} structure:{tag:"Object-Dataclass-Transformer"}} variants:{simple:STRUCT metadata:{fields:{key:"additionalProperties" value:{bool_value:false}} fields:{key:"properties" value:{struct_value:{fields:{key:"b" value:{struct_value:{fields:{key:"type" value:{string_value:"integer"}}}}}}}} fields:{key:"required" value:{list_value:{values:{string_value:"b"}}}} fields:{key:"title" value:{string_value:"B"}} fields:{key:"type" value:{string_value:"object"}}} structure:{tag:"Object-Dataclass-Transformer"}}}].

Expected behavior

As a python developer using Flyte, I would expect this workflow to work.

Additional context to reproduce

The root cause for this error is the following:

When validating the workflow in the backend, here, the so-called unionTypeChecker checks whether one of the union variants (here A or B) can unambiguously be chosen:

func (t unionTypeChecker) CastsFrom(upstreamType *flyte.LiteralType) bool {
    ...
    // Matches iff we can unambiguously select a variant
    foundOne := false
    for _, x := range unionType.GetVariants() {
    if AreTypesCastable(upstreamType, x) {
    if foundOne {
        return false
    }
    foundOne = true
    }
}

return foundOne
}

For our example above, AreTypesCastable(upstreamType, x) yields true for both union variants A and B which causes the unionTypeChecker to fail.

The reason that for both A and B the check AreTypesCastable(upstreamType, x) results in true is the following:

Here, the so-called trivialChecker which is called for both union variants A and B compares whether the passed input B matches the respective variant:

func (t trivialChecker) CastsFrom(upstreamType *flyte.LiteralType) bool {
         ...

    if GetTagForType(upstreamType) != "" && GetTagForType(t.literalType) != GetTagForType(upstreamType) {  # There are no tags that could solve the ambiguity in our example
        return false
    }

    // Ignore metadata when comparing types.
    upstreamTypeCopy := *upstreamType
    downstreamTypeCopy := *t.literalType
    upstreamTypeCopy.Structure = &flyte.TypeStructure{}
    downstreamTypeCopy.Structure = &flyte.TypeStructure{}
    upstreamTypeCopy.Metadata = &structpb.Struct{}
    downstreamTypeCopy.Metadata = &structpb.Struct{}
    upstreamTypeCopy.Annotation = &flyte.TypeAnnotation{}
    downstreamTypeCopy.Annotation = &flyte.TypeAnnotation{}

    return upstreamTypeCopy.String() == downstreamTypeCopy.String()
}

Since there are no tags that solve the ambiguity and all metadata is ignored, the final string comparison for both union variants A and B is always "simple:STRUCT metadata:{} annotation:{} structure:{}" == "simple:STRUCT metadata:{} annotation:{} structure:{}" meaning that we cannot determine whether the union variant A or B is the correct match for the passed value B.

Screenshots

No response

Are you sure this issue hasn't been raised already?

Have you read the Code of Conduct?

runllm[bot] commented 1 week ago

The issue arises because Flyte's backend cannot unambiguously select a variant from the union type when validating the workflow. This is due to the unionTypeChecker failing to differentiate between the union variants A and B as both are treated as simple:STRUCT.

Would you like more detailed steps on how to address this issue or further explanation on the underlying mechanics?

Sources:


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!

mohsinmahmood12 commented 1 week ago

@RunLLM Can I work on this issue ?

runllm[bot] commented 1 week ago

The issue arises because Flyte's backend cannot unambiguously select a variant from the union type when validating the workflow. This is due to the unionTypeChecker failing to differentiate between the union variants A and B as both are treated as simple:STRUCT.

Would you like more detailed steps on how to address this issue or further explanation on the underlying mechanics?

Sources:


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!

eapolinario commented 1 week ago

We're in the process of revisiting the dataclass transformer and will include this in the process.

A good point raised during today's contributor meeting is that simply adding the tags to the struct message is not enough as that would break inheritance.