Closed aboomer07 closed 1 year ago
love this library
❤️
You can use aliases in this case: https://pandera.readthedocs.io/en/stable/schema_models.html#aliases
import pandera as pa

class Schema(pa.SchemaModel):
    col1: pa.typing.Series[int] = pa.Field(alias=("level1", "col1"))
    col2: pa.typing.Series[int] = pa.Field(alias=("level1", "col2"), check_name=True)

print(Schema.to_schema())
output:
<Schema DataFrameSchema(
    columns={
        '('level1', 'col1')': <Schema Column(name=('level1', 'col1'), type=DataType(int64))>
        '('level1', 'col2')': <Schema Column(name=('level1', 'col2'), type=DataType(int64))>
    },
    checks=[],
    coerce=False,
    dtype=None,
    index=None,
    strict=False
    name=Schema,
    ordered=False,
    unique_column_names=False
)>
Thanks!
I'm not sure whether this is a feature request or a documentation improvement; I haven't been able to find it in the documentation, if it exists.
It seems like from_format for loading schemas from a file can only be used in the SchemaModel configuration, whereas a multi-level column index can only be specified in a DataFrameSchema. I don't see a way to convert a DataFrameSchema to a SchemaModel, or to load a YAML file into a SchemaModel.
Apologies if I missed anything, love this library!