Closed hagenw closed 3 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 100.0%. Comparing base (
ec72190
) to head (e108c98
). Report is 2 commits behind head on main.
typing.Sequence
should be fine as the code will work with every object that implements __iter__
.
But to support that, it is usually easier to convert everything to a fixed output type, instead of returning the original type. I also prefer to have a fixed output type in basic functions instead of defining the output type by the input type.
typing.Sequence
should be fine as the code will work with every object that implements__iter__
.But to support that, it is usually easier to convert everything to a fixed output type, instead of returning the original type. I also prefer to have a fixed output type in basic functions instead of defining the output type by the input type.
I see: a DataFrame
is not a sequence:
import pandas as pd
import collections.abc
df = pd.DataFrame.from_dict({
'v1': [1,2,1,2,4,1],
'v2': [2,4,4,2,3,1],
}
)
typecheck = isinstance(df, collections.abc.Sequence)
print(typecheck)
print(df.__iter__)
results in
False
<bound method NDFrame.__iter__ of v1 v2
0 1 2
1 2 4
2 1 4
3 2 2
4 4 3
5 1 1>
In other words: pd.Dataframe
implements __iter__
but the object is not a sequence.
Interestingly, a pd.Series
is not a sequence either. Still unique
works on it.
ser = pd.Series([1,2,1,2,4,1])
isinstance(ser, collections.abc.Sequence)
>>> False
But I am happy with it and will approve it as this becomes to academic. I think one should add a string to the test case and then I will approve.
I accidentally pressed the "close" button next to the comment button. Reopening. Apologies.
In the current Python documentation, they write:
and we have
>>> isinstance(pd.Series([1]), collections.abc.Iterable)
True
So I changed the type to typing.Iterable
, which seems to be a better fit to me. Do you agree?
So I changed the type to
typing.Iterable
, which seems to be a better fit to me. Do you agree?
Yes. A DataFrame
is also an iterable though. I have approved already - it is probably hard to impossible to completely get it right with the python typing system: when one goes through the tests one can see what is going on. Alright with me.
It might be that a user wants to get unique values of a sequence, in the order they are appearing in that sequence.