argoproj-labs / hera

Hera is an Argo Python SDK. Hera aims to make construction and submission of various Argo Project resources easy and accessible to everyone! Hera abstracts away low-level setup details while still maintaining a consistent vocabulary with Argo. ⭐️ Remember to star!
https://hera.rtfd.io
Apache License 2.0
558 stars 105 forks source link

Mypy call-arg errors when using script decorator #1174

Closed alicederyn closed 2 weeks ago

alicederyn commented 3 weeks ago

Pre-bug-report checklist

1. This bug can be reproduced using pure Argo YAML

If yes, it is more likely to be an Argo bug unrelated to Hera. Please double check before submitting an issue to Hera.

2. This bug occurs in Hera when...

Bug report

Describe the bug A clear and concise description of what the bug is:

@script-decorated functions cannot be used without # type: ignore [call-arg] # pyright: ignore [reportCallIssue].

Mypy errors:

error: Unexpected keyword argument "arguments"  [call-arg]
error: Missing named argument "name"  [call-arg]
Revealed type is "Union[None, hera.workflows.steps.Step, hera.workflows.task.Task]"

pyright errors:

error: Argument missing for parameter "name" (reportCallIssue)
error: Argument missing for parameter "job_message" (reportCallIssue)
error: No parameter named "arguments" (reportCallIssue)
Type of "result" is "Unknown"

To Reproduce Full Hera code to reproduce the bug:

from typing import TYPE_CHECKING
from hera.workflows import Steps, Workflow, script

@script()
def some_script(job_message: str) -> None:
  print(job_message)

with Workflow(name="my-workflow") as w:
    with Steps(name="steps"):
        result = some_script(arguments={"job_message": "Hello world!"})
        if TYPE_CHECKING:
            reveal_type(result)

Expected behavior A clear and concise description of what you expected to happen: No type-checker errors. Type revealed to be Step | Task.

Environment

Additional context I believe the issue comes down to the use of a union of callables here: https://github.com/argoproj-labs/hera/blob/523a9b91d5af34a6f797907d5d02ac4a89240d6a/src/hera/workflows/script.py#L624-L628

That means that the function could return any three of those callables, so callers should only make calls which are compatible with all three. This is technically correct, as without knowledge of the context, any of these three types could be returned; however, it makes the decorator very awkward to use.

I suggest we instead return a single Protocol with an overloaded function definition that assumes that the function is being called inside a context block if the parameters match, and otherwise falls back to the original signature. This would allow users to call script functions (both in context blocks and in tests) without a type: ignore. Unfortunately, this would require writing out a full signature on the first overload, with a test to keep the duplicates in-sync, rather than using ParamSpec to capture the signatures of Step and Task, as neither pyright nor mypy support having two overloads both using ParamSpec.

The return type would still be Step | Task, so users using a type-checker may need to assert isinstance(result, Step) or assert isinstance(result, Task) to pass validation. This remaining issue is covered by #837 and #911.

note: Some of the following discussion was made prior to this description being edited to focus on fixing the call-arg issue.

alicederyn commented 2 weeks ago

I dug in a bit more, and note a couple of issues:

  1. StepIns and TaskIns have the same signature, so would need to be a single @overload returning Step | Task | None. Users will need to assert isinstance(..., Step) or similar to make mypy/pyright happy. #911 could potentially make this unnecessary for mypy, as would (I believe) the experimental decorator syntax, as it does not use context-sensitive return types.
  2. The code is capturing the signature of the Step/Task constructors with a ParamSpec; neither mypy nor pyright support using this on an overload the way we want (they ignore all subsequent overloads, presumably because **kwargs matches everything regardless of the type hint on it). We could drop the ParamSpec technique for capturing the Step/Task signatures and instead write out a full signature, with a test to keep the duplicates in-sync.
elliotgunton commented 2 weeks ago

There's an existing issue with some more context and history here https://github.com/argoproj-labs/hera/issues/837 - could you add comments on there please (and then close this)?

I think this is a problem where we'll need a mypy plugin, because we're doing non-standard Python stuff where you could call Hera a sub-language of Python.

Re

StepIns and TaskIns have the same signature

Tasks actually have the extra dependencies and depends so aren't quite identical

alicederyn commented 2 weeks ago

I think this is a problem where we'll need a mypy plugin, because we're doing non-standard Python stuff where you could call Hera a sub-language of Python.

I would like to improve the status quo without a plugin, as pyright does not support plugins, and this is making it hard to use VS Code. I think using overloads would be an improvement, as users could use an assert on the result rather than having to completely suppress typechecking as is currently required.

Will clean up the issue duplication.

elliotgunton commented 2 weeks ago

users could use an assert on the result

This or using cast is currently possible with the union to keep types around, and for an improvement we'd need the revealed type to be Step (not Step | None) otherwise a cast or assert is still needed.

I think using overloads is the right way to go though 💯 (And I think with the right func signatures we can fix the Step | None issue)

But I would really want to avoid

write out a full signature, with a test to keep the duplicates in-sync

if possible

alicederyn commented 2 weeks ago

This or using cast is currently possible with the union to keep types around

Unfortunately not, please see my issue description.

alicederyn commented 2 weeks ago

@samj1912 I decided to instead update this issue to focus on the new call-arg issue, leaving #837 open to resolve the older issue of "the linter still doesn't know it's actually a Task necessitating casts or # type: ignores", since even if we resolve the former, the latter will still be open. Is that ok?

I also went into more detail about my proposed solution; I'll open a PR with my suggested fix, for discussion.

alicederyn commented 2 weeks ago

Uncovered one further issue when working on the PR: name is required on Task and Step, but it is not required when calling a @script-decorated function, so copying the signature from Task/Step is not correct anyway.