Open astrojuanlu opened 1 year ago
This line makes MyPy happy but it's not as nice:
pipelines["__default__"] = sum(pipelines.values(), start=Pipeline([]))
Hmm, this is an annoying one. I'm sure other people must have come across this also since it's pretty common to do sum
like this on custom classes without specifying start
.
Does it help if we add a type annotation to __radd__
?
def __radd__(self, other: Union[Pipeline, Literal[0]])
Definitely I'd prefer to not explicitly specify start=Pipeline([]))
- we could just put a mypy ignore in there if required instead?
I think the problem is that find_pipelines
always returns {"__default__": Pipeline([])}
even if it finds nothing, but this dynamic behavior is difficult to communicate to MyPy. Even with the __radd__
annotation, that wouldn't help with the mismatch on the register_pipelines
annotation.
I think there only ways are either ignoring the error (always feels like surrender 🏳️) or try to assure MyPy that somehow pipelines.values()
will never be empty, but the latter necessarily requires code changes. An alternative to start=
could be
pipelines["__default__"] = sum(pipelines.values()) or Pipeline([])
still looks meh but at least one does not need to know about the obscure sum
behavior.
I feel like overriding the sum return type should work, like so:
from typing import Iterable, overload
@overload
def sum(__iterable: Iterable[Pipeline]) -> Iterable[Pipeline]: ...
The idea is to override https://github.com/python/typeshed/blob/dd2818a41db6cd31e4680abf5e1362a7e5bfb5a6/stdlib/builtins.pyi#L1731-L1748.
I need to test this properly, though; whatever I'm changing seems to be magically make mypy --no-incremental .
on a new Kedro project pass, even if I undo changes, so that can't be right... (I am able to get the error from this issue when just installing from PyPI)
This is a mostly harmless code change and would allow us to introduce MyPy as a linter. Is anybody strongly opposed to https://github.com/kedro-org/kedro/issues/2526#issuecomment-1513481730 or https://github.com/kedro-org/kedro/issues/2526#issuecomment-1512738943 ?
I am not strongly opposed but I am opposed enough to leave a comment 😀
I think that sum(pipelines.values())
is significantly nicer than the suggestions you linked to. IMO the correct solution here would be to some fix that allows that to pass mypy, e.g. @deepyaman's suggestion https://github.com/kedro-org/kedro/issues/2526#issuecomment-1513632379.
If it's too difficult to get something like that working then changing sum(pipelines.values())
is ok by me, but I do think it's worth investing a bit more time investigating alternatives because your proposed solutions make all users' pipeline_registry.py
(which is arguably already too complicated) even harder to understand.
I'm leaving this to whoever takes the ticket, since all solutions are quite small and doesn't seem to be consensus about that, but it's also quite minor
Description
The default template has this code in
pipeline_registry.py
:https://github.com/kedro-org/kedro/blob/7f447330fc4e933ae4f5f364bb38032600b78d85/kedro/templates/project/%7B%7B%20cookiecutter.repo_name%20%7D%7D/src/%7B%7B%20cookiecutter.python_package%20%7D%7D/pipeline_registry.py#L14-L15
However, latest MyPy version (1.2.0 at the time of writing) doesn't have a high opinion on that snippet:
I tested this on Python 3.10 with the default MyPy settings. I think it's reasonable because
sum
has a defaultstart=0
, therefore if thepipelines
list is empty, the result has a different type:Context
The reason why
sum(pipelines)
works at all if because of this:https://github.com/kedro-org/kedro/blob/7f447330fc4e933ae4f5f364bb38032600b78d85/kedro/pipeline/pipeline.py#L184-L186
which naturally allows this nice syntactic sugar by making this work:
Steps to Reproduce
mypy
on itExpected Result
The project templates should produce statically correct code, and give no MyPy errors.
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
pip show kedro
orkedro -V
): 0.18.7python -V
): 3.10.9