Open tabedzki opened 2 months ago
@zm711 for input type annotation, we talked about how to mark the combination of list, tuple, array of floats. Would using Iterable[float]
address this issue without being overly permissive?
Sorry for the delay in responding to this! Yeah I think that would actually work unless it can also contain ints. So maybe something like
Iterable[float|int]
there's also a type hint for numbers in general, but I don't see it too often in the wild. So I would tend not to use that.
What do you think @h-mayorquin ?
@zm711 According to PEP 484, anytime float
is used, int
is considered to be consistent with float
which means you wouldn't have to do float|int
Rather than requiring that users write import numbers and then use numbers.Float etc., this PEP proposes a straightforward shortcut that is almost as effective: when an argument is annotated as having type float, an argument of type int is acceptable; similar, for an argument annotated as having type complex, arguments of type float or int are acceptable. This does not handle classes implementing the corresponding ABCs or the fractions.Fraction class, but we believe those use cases are exceedingly rare.
@zm711 According to PEP 484, anytime
float
is used,int
is considered to be consistent withfloat
which means you wouldn't have to dofloat|int
I think that's fine but see https://discuss.python.org/t/clarifying-the-float-int-complex-special-case/54018 for some problems with that.
One problem with the proposed approach is that, again, it is not narrow enough:
from typing import Iterable
from typeguard import check_type
example_set = {1, 2, 3}
# Check if the object is an Iterable of int
try:
check_type(example_set, Iterable[int])
print("The object is an Iterable[int]")
except TypeError as e:
print(f"Type mismatch: {e}")
In most of our cases we want things were the order matters so I guess we want Sequence[float]
or Sequence[int]
Also relevant:
https://discuss.python.org/t/removing-sequence-str-as-base-class-of-str/56907
float
over float|int
comes from the type checkers and not Python itself and that the type checkers may not always understand when something should be treated as an int vs a float? Sequence[float]
instead of Interable
is a meaningful improvement and that should be used. __mro__
leads to unexpected results and doing just float
would lead to some unexpected behavior?Am I understanding this correctly: the issue with float over float|int comes from the type checkers and not Python itself and that the type checkers may not always understand when something should be treated as an int vs a float?
Sometimes the distinction between those two numeric types is meaningful so they should not be typed together. floats have some methods that int do not so passing int as float would be wrong. Note that this also might relevant in questions of computational efficiency. I was sharing it as a curiosity though.
I don't understand how the conversation about sequence[str] relates to this. What am I missing?
That was the previous typing that we had at some point and that's why I added here to the discussion. It was not related or relevant to the points that you were making.
Sometimes the distinction between those two numeric types is meaningful so they should not be typed together. floats have some methods that int do not so passing int as float would be wrong. Note that this also might relevant in questions of computational efficiency. I was sharing it as a curiosity though.
Ahhh understood. By typed together, you mean that we shouldn't just use float
, correct?
That was the previous typing that we had at some point and that's why I added here to the discussion. It was not related or relevant to the points that you were making.
oooh! That was the issue Zach had been referring to before. Now things make sense.
Ahhh understood. By typed together, you mean that we shouldn't just use float, correct?
I think this will end up being contextual. In an ideal scenario something should either be a float or an int, but shouldn't be both even though you could cast an int -> float like 1 -> 1.0. Things like ids are either ints | str. Durations would be floats even though if a user gave an int it wouldn't be a problem. So this is where we run into issue with typing. There's a lot of flexibility and no one right way. So I tend to at least just try to give the best "hint" for the user rather than try to nail the perfect type hint. That's why I'm okay with people that put np.dtype
instead of np.typing.DTypeLike
.
Yes, as Zach said, we should be contextual. But I think your proposal for having Sequence[something] of yours is an improvement over what we have. I can't find an argument against it.
Zach and Heberto, referring back to the issue of type hinting and avoiding bringing in a package just to get a class definition available for type hints, what are your thoughts on do something like what is suggested here, that is using something like
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .basesorting import BaseSorting
def function(B: BaseSorting):
pass
I think you would have a hard time convincing Sam that that is okay. He already hates typing so creating a new typing convention might be a step too far for him.
I can understand that. It could be used on only the files where one would have to import the file to get the type hints since leaving it as "BaseSorting" itself does not provide IntelliSense hints. It comes down to whether we think the IntelliSense hints provide enough value to include that block.
I would be in favor of a solution that uses the typing module to differentiate between annotation context and runtime out of self-interest. It would make navigation easier for me.
But as Zach mentions I have the feeling Sam would be against it. We can try and move slowly.
Something that has kept me from moving in this direction is the whole mess with PEPs 563 and 649. The implementation has been delayed for a couple of python versions. There seems to be a renewed interest on moving this forward though: https://peps.python.org/pep-0749/
Since it seems that you two work closer with Sam than I do, would you be interested in broaching these typing format to him, or should we just let it lie for now? I think there were a couple instances where I had loaded in a package into the module just for typing annotations so if there is a way forward that achieves better type annotations while removing some of the overhead implemented via my last PR, I am all ears.
Remnant check boxes from #3304:
find_redundant_units
inremove_redundant.py
. It currently returns a list of a two member list.sortingview_curation.py
, there is# TODO @alessio you remove this after testing
. I have not annotated this function since it would require importingBaseSorting
since even in quotes, Pylance does not recognize it.After ensuring most documented functions offer return values, I think there are instances where defining input type hints can be useful as the input type definitions are sometimes implicitly used in the return type annotation. For example, if we annotate
CurationSorting
to have"BaseSorting"
as part of the input indef __init__(self, sorting: "BaseSorting", make_graph=False, properties_policy="keep"):
, then the return values of.sorting()
and.current_sorting()
are automatically provided instead ofAny
. The importance of this may not be high as these are internal functions rather than external functions.