Open genedan opened 6 months ago
columns: str | list = None? If columns is str or list, how can it be None by default? And do the lists contain strings?
@genedan how about origin: str or array-like = None? Let's try to match pandas.DataFrame.
@MatthewCaseres is that actually an issue? The argument is not always used. Do you think that will cause confusion? I guess we could do origin: str, array-like, or None, default None?
@kennethshsu Some arguments don't describe themselves as optional, but since they default to None they must be optional. I was having some trouble understanding what I'm supposed to pass into the constructor. For example, can I omit the origin? Can I pass a list of dictionaries as the origin?
According to PEP 484 we need some explicit annotation to indicate a parameter is optional:
I'm thinking we could do something like:
columns: Optional[str | list] = None
or
columns: str | list | None = None
I would prefer the former since it's more explicit that the parameter is optional.
This looks really good, I like it a lot!
To @MatthewCaseres's point, I think we can do a better job with input validation. I'll add this to my backlog.
On second look, I had thought this was for the docstrings, not the actual declaration in the code. But still good!
What other types can be accepted for the data argument? The documentation has DataFrame, but we have a method, _interchange_dataframe, that allows other types to be accepted if they support the __dataframe__ protocol. So this implies that the triangle constructor is somewhat flexible.
So I was thinking the accepted type for data should be something like DataFrame, DataFrame-like, but I'm not sure what Python types would be available for "DataFrame-like". Maybe we could use array-like but specify in the docstring that it needs to support the __dataframe__ protocol?
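One possible way to spell "DataFrame-like" as a type is a structural typing.Protocol. This is only a sketch: SupportsDataFrame, accepts_frame, and FakeFrame are hypothetical names, and the __dataframe__ signature below mirrors the one pandas exposes on its DataFrame.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsDataFrame(Protocol):
    """Hypothetical structural type for objects exposing __dataframe__."""

    def __dataframe__(self, nan_as_null: bool = False, allow_copy: bool = True) -> Any:
        ...


def accepts_frame(data: SupportsDataFrame) -> bool:
    """Return True if data advertises the interchange protocol."""
    # runtime_checkable only checks that the method exists, not its signature.
    return isinstance(data, SupportsDataFrame)


class FakeFrame:
    """Stand-in object used to illustrate the check without pandas."""

    def __dataframe__(self, nan_as_null: bool = False, allow_copy: bool = True) -> Any:
        return self
```

With this, accepts_frame(FakeFrame()) is True, while a plain list or dict fails the check because neither defines __dataframe__.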
In the pandas source code, they annotate the input as DataFrameXchg, which comes from a stripped-down version of a DataFrame used for the conversion, so I'll go with that.
import pandas as pd
from pandas.core.interchange.dataframe_protocol import (
    Buffer,
    Column,
    ColumnNullType,
    DataFrame as DataFrameXchg,
    DtypeKind,
)
I'm mostly adding things like comments and type hints, plus fixes that would make the code compliant with PEP guidelines.