unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License

Propagate SchemaModel class name as name attribute of DataFrameSchema #760

Closed smackesey closed 2 years ago

smackesey commented 2 years ago
import pandera as pa

class SampleSchemaModel(pa.SchemaModel):
    foo: pa.typing.Series[int] = pa.Field()

SampleSchemaModel.to_schema().name  # None

This seems unintuitive to me. I'd expect the class name of the schema model to be propagated as the name of the DataFrameSchema. Is this intended behavior or an oversight?

jeffzi commented 2 years ago

Hi @smackesey

You need to set the name in the model's Config, similar to DataFrameSchema(..., name=...):

import pandera as pa

class SampleSchemaModel(pa.SchemaModel):
    foo: pa.typing.Series[int] = pa.Field()

    class Config:
        name = "SampleSchemaModel"

SampleSchemaModel.to_schema().name
#> 'SampleSchemaModel'

We try to keep feature parity between SchemaModel and DataFrameSchema, but I do think it makes sense to use the class name for SchemaModel. At worst, people who do not use the name are unaffected by the change.

Pinging @cosmicBboy for approval :)

cosmicBboy commented 2 years ago

I think using the SchemaModel class name makes sense as a fallback... specifying a name explicitly in the Config should override it... what do y'all think?
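For illustration, the fallback rule described here can be sketched in plain Python without pandera. This is only a rough sketch of the proposed behavior, not pandera's actual implementation: `resolve_schema_name`, `Unnamed`, and `Named` are hypothetical names made up for this example, and the only assumption carried over from the thread is that a model may define a nested `Config` class with a `name` attribute.

```python
from typing import Optional


def resolve_schema_name(model_cls: type) -> str:
    """Hypothetical helper: prefer Config.name, else fall back to the class name."""
    config = getattr(model_cls, "Config", None)
    explicit: Optional[str] = getattr(config, "name", None)
    # An explicit name always wins; a missing or None name triggers the fallback.
    return explicit if explicit is not None else model_cls.__name__


class Unnamed:
    """Stand-in for a schema model with no Config at all."""


class Named:
    """Stand-in for a schema model whose Config sets a name."""

    class Config:
        name = "custom_name"


print(resolve_schema_name(Unnamed))  # Unnamed
print(resolve_schema_name(Named))    # custom_name
```

With this rule, models that never set a name pick up their class name, while models that already set `Config.name` keep their existing behavior.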

cosmicBboy commented 2 years ago

fixed by #761