Closed PascalStehling closed 7 months ago
Okay, thanks for submitting this to me. I see what you're doing here now and have some thoughts.
My suggestion would be to update the types module in key places to know that the Pandas values are compatible. It would need to know that pandas.Series is a DataType.ARRAY, and that a pandas.Series value is a DataType.ARRAY. This will also require iterable_member_value_type to be updated to ensure the member type is propagated as well.
Adding the type information for Pandas to the types module I think will be easier to maintain long term, especially since the AST module is already pretty complicated. It would ensure that as new AST operations are added, they won't necessarily need to account for Pandas specifically. However, this makes the large assumption that pandas.Series works in the same way as a Python tuple and supports all the same operations. If that assumption is incorrect, then things will be much more complicated because the edge cases will need to be addressed.
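A rough sketch of the idea above, resolving an extra container type to an array type and propagating its member type, might look like the following. This is not the library's actual code: the DataType enum here is a simplified stand-in, and register_array_type, from_value, member_value_type and FakeSeries are hypothetical names invented for illustration. FakeSeries stands in for pandas.Series so the sketch runs without third-party packages.

```python
import enum


class DataType(enum.Enum):
    """Hypothetical, simplified stand-in for a rule engine's data types."""
    ARRAY = "ARRAY"
    BOOLEAN = "BOOLEAN"
    FLOAT = "FLOAT"
    STRING = "STRING"
    UNDEFINED = "UNDEFINED"


# Python types that resolve to DataType.ARRAY. A real integration would
# register pandas.Series here (assumption: it behaves enough like a tuple).
_ARRAY_TYPES = [list, tuple]


def register_array_type(python_type):
    """Teach the type resolver that python_type is array-like."""
    if python_type not in _ARRAY_TYPES:
        _ARRAY_TYPES.append(python_type)


def from_value(value):
    """Resolve a Python value to a DataType."""
    if isinstance(value, bool):  # bool first, because bool subclasses int
        return DataType.BOOLEAN
    if isinstance(value, (int, float)):
        return DataType.FLOAT
    if isinstance(value, str):
        return DataType.STRING
    if isinstance(value, tuple(_ARRAY_TYPES)):
        return DataType.ARRAY
    return DataType.UNDEFINED


def member_value_type(values):
    """Return the one type shared by every member, else UNDEFINED."""
    member_types = {from_value(v) for v in values}
    return member_types.pop() if len(member_types) == 1 else DataType.UNDEFINED


class FakeSeries:
    """Stand-in for pandas.Series; only supports iteration."""

    def __init__(self, data):
        self._data = list(data)

    def __iter__(self):
        return iter(self._data)


column = FakeSeries([1.5, 2.0, 3.25])
before = from_value(column)        # not yet known to the resolver
register_array_type(FakeSeries)
after = from_value(column)         # now treated as an array
members = member_value_type(column)
```

Keeping the knowledge in one registry is what lets the rest of the code stay unaware of the new type, but it only holds as long as the registered type really supports the operations the array code relies on.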
Thanks for your assessment. The pd.Series works a little bit like a tuple, but there is much more you could do with it. E.g. it would be super interesting and practical to enable regex search across a whole column. Also, being able to do all types of arithmetic across the column could be quite helpful.
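For reference, this is roughly what those column-wide operations look like in plain pandas, outside of any rule engine. The column names and values are made up for illustration; the last line shows how the same operator behaves differently on a tuple.

```python
import pandas as pd

# Made-up example columns.
names = pd.Series(["alice", "bob", "carol"])
ages = pd.Series([30, 25, 41])

# Regex search across a whole column: one boolean result per row.
matches = names.str.contains(r"^[ab]", regex=True)

# Arithmetic on a Series is applied element by element.
doubled = ages * 2

# The same operator on a tuple repeats the tuple instead.
repeated = (30, 25, 41) * 2
```

The difference in what `*` means is one concrete case where treating a Series exactly like a tuple would give surprising results.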
But as you already wrote (and as I understand the codebase), there are a lot of places where changes would be necessary, which would not make the code easier to read.
Maybe other tools would be a better fit, instead of trying to force such a big framework into a codebase that was not really designed for doing such things.
But thanks again for taking your time :)
As described in #84, this adds some changes to support the usage of pandas as a datatype.