wearepal / data-science-types

Mypy stubs, i.e., type information, for numpy, pandas and matplotlib
Apache License 2.0
202 stars 51 forks source link

Add Series.sort_index() signature #206

Closed krassowski closed 3 years ago

krassowski commented 3 years ago
tmke8 commented 3 years ago

Thanks for this!

Fixes wrong order of arguments in Series.sort_values() (ascending should go before inplace)

Hm, the issue is that if you do an overload based on inplace=True and inplace=False, then you cannot give both a default value.

That is, consider this:

@overload
def f(inplace: Literal[True] = ...) -> A: ...
@overload
def f(inplace: Literal[False] = ...) -> B: ...

and now you call it like this:

x = f()

what type does x have? We can't tell. So that's why we need to do it like this:

@overload
def f(inplace: Literal[True]) -> A: ...
@overload
def f(inplace: Literal[False] = ...) -> B: ...

Then we know now that f() returns B. So, you see, one of the two inplace needs to be without default argument. And arguments without default argument need to come before arguments with default argument. That's why inplace was in the "wrong" place.

I'm actually very surprised that you managed to make the tests pass with your inplace definitions.

krassowski commented 3 years ago

The tests pass because the return type for inplace=False is set to optional (thus has an overlap with None for inplace=True), which is a pattern for overriding inplace that is already used by fillna - is that stub also wrong?

krassowski commented 3 years ago

Is lack of certainty over the result in the default case (not in place, returning Series, but mypy can only tell is either Series or None) worse than incorrect hints when using positional arguments? As far as I understand the return type will be inferred correctly for non -default in place operation providing useful hint should someone try to use the result (None) as if it was a Series.

So unless I missed something this is not ideal but not wrong either - just a trade-off.

Edit: it does not work like this. It's an and not an or, so my code in the current state is wrong.

krassowski commented 3 years ago

I can obviously revert the inplace change if you wish, using positionals for sorting series is not beautiful either way because of to the dummy axis argument

krassowski commented 3 years ago

Seems to be working fine now (removing return type overlap, swapping order):

Screenshot from 2020-12-18 01-03-53

Remaining issue; disabling the type check error with a comment does not seem to work great for mypy.

krassowski commented 3 years ago

So I think I go it working, what do you think now @thomkeh? I basically accepted that it has to go with # type: ignore and as mentioned here it is fine because typeshed stubs use this approach to make accurate stubs for built-in libraries, for example see encode() stubs here. Orders matters when matching overloads so the one returning a Series has to go first.

tmke8 commented 3 years ago

I made some changes before I merged (see 3facca7a36aa1a0a617f2a1e0c8656b90670e5dd). I'd really like to avoid # type: ignore. I think what I did should cover your use case. Let me know if it doesn't.

krassowski commented 3 years ago

Interesting. Seems a bit hacky (I would say more hacky than leaving # type: ignore) but it seems to work fine. Thank you!

Edit: to clarify, the above is not a critique, your decision is fine and I was amazed to see your approach work; I get that there are reasons for doing it this way :)