Closed ErikLundin98 closed 1 year ago
Hi @ErikLundin98, what versions of pandas and Python are you using?
@cosmicBboy, I'm using Python 3.10.10 and pandas 1.3.5.
Unfortunately, pandas 1.3.5 has a number of issues with index data types. See this StringDtype xfail test as an example:
https://github.com/unionai-oss/pandera/blob/fe83c19a1aebb127f22e8bee849be70a1a96c33a/tests/core/test_schema_components.py#L838-L856
This is purely a pandas issue:
In [1]: import pandas as pd
In [2]: pd.Index([True, False])
Out[2]: Index([True, False], dtype='object')
In [3]: pd.Index([True, False], dtype=bool)
Out[3]: Index([True, False], dtype='object') # it's still an "object"!
Any chance you can update your pandas version?
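To check whether a given environment is affected, something like the following sketch may help. It only repeats the `pd.Index` call from the session above and reports the resulting dtype next to the installed pandas version. The helper name `bool_index_dtype` is made up for illustration and is not part of pandas or pandera.

```python
import pandas as pd


def bool_index_dtype() -> str:
    """Return the dtype pandas assigns to an explicitly boolean Index."""
    # Same call as in the session above: request dtype=bool explicitly.
    return str(pd.Index([True, False], dtype=bool).dtype)


# On an affected version such as 1.3.5 this reports "object", as shown above;
# a version without the problem would be expected to report "bool".
print("pandas", pd.__version__, "->", bool_index_dtype())
```

If this prints `object`, a dtype check for `bool` against that index cannot succeed, whatever the schema says.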
Thank you for clarifying that it's a pandas issue! I will see if I can update pandas.
Given this minimal example
Expected behaviour is that the validation should pass, since the two index columns contain boolean fields.
However, instead I get the following error:
This issue seems to occur only with MultiIndex DataFrames and with boolean fields. Changing the type from bool to int magically resolves the issue.
I would like to know whether anyone knows a workaround for this, or whether I am misinterpreting something about how to define the schemas.
Thanks in advance!
I am using pandera version 0.14.4
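If upgrading pandas is not an option, one possible workaround is to check the values of the index levels rather than the dtype that pandas reports. The following is only a rough sketch under that assumption: the frame, the level names `a` and `b`, and the helper `index_levels_are_boolean` are all hypothetical, and it uses plain pandas rather than any pandera API.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two boolean index levels and one value column.
df = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_product(
        [[True, False], [True, False]], names=["a", "b"]
    ),
)


def index_levels_are_boolean(frame: pd.DataFrame) -> bool:
    """Check every index value, ignoring the dtype reported for each level."""
    return all(
        isinstance(value, (bool, np.bool_))
        for position in range(frame.index.nlevels)
        for value in frame.index.get_level_values(position)
    )


print(index_levels_are_boolean(df))
```

Because the check looks at the values themselves, it does not depend on whether the installed pandas stores the level as `bool` or as `object`. It visits every index entry, so it may be slow on very large frames.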