Closed sstadick closed 2 years ago
This is an unfortunate side effect of pandas defaulting to a float dtype when you initialize a dataframe with empty lists.
In [2]: pd.DataFrame({"a": [], "b": []}).dtypes
Out[2]:
a    float64
b    float64
dtype: object
If you use the coerce=True config:
import pandera as pa
from pandera.typing import DataFrame, Series

class Example(pa.SchemaModel):
    counts: Series[int]
    values: Series[int]

    class Config:
        coerce = True
Then pandera will do the type coercion for you. Otherwise, this behavior is expected.
DataFrame[Example]({"counts": [], "values": []})
It's unclear what exactly to do in this case, other than explicitly using an empty Series with the dtypes specified. But if you want to use empty lists, coerce=True is the way to go.
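A minimal sketch of the "empty series with dtypes specified" option, using only pandas. The column names mirror the Example schema, and `int64` is an assumed target dtype corresponding to `Series[int]`:

```python
import pandas as pd

# Build an empty frame whose columns already carry the intended dtype,
# rather than letting pandas pick a default dtype for empty lists.
empty = pd.DataFrame(
    {
        "counts": pd.Series([], dtype="int64"),
        "values": pd.Series([], dtype="int64"),
    }
)

print(empty.dtypes)
print(len(empty))
```

This keeps the dtype decision at the point where the data is constructed, at the cost of repeating the dtypes that the schema already declares.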
Got it, coerce=True works for me and solves my immediate problem of getting errors when I sometimes have no data.
Thank you for the speedy reply!
Describe the bug
When validating data that is empty, pandera assumes the inner type is float64.
Code Sample, a copy-pastable example
See minimal example (small poetry project) here.
Expected behavior
I expected pandera to validate an empty list and not try to check the inner type since there is no inner type.
Desktop (please complete the following information):
mypy
extras