flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.27k stars 557 forks source link

[Docs] Prevent using mutable default arguments in flytesnacks #5383

Open MortalHappiness opened 2 months ago

MortalHappiness commented 2 months ago

Description

Mutable default arguments is a common anti-pattern in Python. See https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments for details. We should prevent using it in flytesnacks.

For example:

https://github.com/flyteorg/flytesnacks/blob/2d73e2a0892a56efc4fe3649db942e8270ccfe8c/examples/advanced_composition/advanced_composition/map_task.py#L13

Are you sure this issue hasn't been raised already?

Have you read the Code of Conduct?

runllm[bot] commented 2 months ago

Description

Mutable default arguments is a common anti-pattern in Python. See Mutable Default Arguments for details. We should prevent using it in flytesnacks.

For example:

https://github.com/flyteorg/flytesnacks/blob/2d73e2a0892a56efc4fe3649db942e8270ccfe8c/examples/advanced_composition/advanced_composition/map_task.py#L13

Are you sure this issue hasn't been raised already?

Have you read the Code of Conduct?

Sources:


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!

eapolinario commented 1 month ago

@MortalHappiness , please see https://github.com/flyteorg/flytesnacks/pull/1685#issuecomment-2148428296 and let me know what you think.

OTOH, enabling this in flytekit should be a no-brainer.

MortalHappiness commented 1 month ago

@eapolinario I still don't think it is a good pattern, especially in flytesnacks, which is the documentation for examples. Examples with anti-patterns may mislead users that are not proficient in Python to write these code.

Actually, I found this problem when I was working on this PR: https://github.com/flyteorg/flytekit/pull/2401

I know that if the default argument is used in workflow like the following:

@task
def t1(a: list[int]) -> list[int]:
    return a

@workflow
def wf(a: list[int] = [1, 2, 3]) -> list[int]:
    a.append(4)
    return t1(a=a)

It will show that TypeError: 'Promise' object is not callable. Therefore the list will not be modified.

However, in the PR that I mentioned above, I need to handle the case of @dynamic and @task. Therefore, users may write the following code

@task
def t1(a: list[int] = [1, 2, 3]) -> list[int]:
    a.append(4)
    return a

@workflow
def wf() -> list[int]:
    t1()
    return t1()

Note that t1 is called twice. In this case, should the result of wf be [1, 2, 3, 4] or [1, 2, 3, 4, 4]? If we make the list immutable, then the result should be [1, 2, 3, 4], which causes no problems. But in Python's perspective, although this is an anti-pattern, the "correct" result should be [1, 2, 3, 4, 4]. If users are migrated from an existing Python project, they may be confusing why the result of wf is different after adding the @task and @workflow decorator. Therefore, in my PR, if the default argument is a non-hashable object and no inputs are provided, a FlyteAssertion is raised instead to prevent this anti-pattern.

kumare3 commented 1 month ago

I agree but sadly the defaults are a way to specify defaults for default launchplans or workflows. Since the workflow is just a dsl this is ok, and workflows are static by nature it is ok - I would love to find a UX that won't break existing users too

eapolinario commented 1 month ago

Also worth mentioning that this problem only affects local executions (after we add support for default values for tasks).