unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License
3.37k stars 310 forks source link

Over ten times worse `example()` performance when field is `nullable`. #640

Closed MikiGrit closed 3 years ago

MikiGrit commented 3 years ago

Describe the bug When field is set as nullable, calling example() is more than ten times slower. I believe there is no reason for this.

Code Sample, a copy-pastable example

from timeit import timeit

import pandera as pa
import pandera.typing as pt

class MyDF1(pa.SchemaModel):
    a: pt.Series[float]

class MyDF2(pa.SchemaModel):
    a: pt.Series[float] = pa.Field(nullable=True)

# following command prints about 0.22 seconds on my computer
timeit('MyDF1.example(size=1)', setup='from __main__ import MyDF1', number=10)

# following command prints about 3.14 seconds on my computer
timeit('MyDF2.example(size=1)', setup='from __main__ import MyDF2', number=10)

Expected behavior

Both commands should take about the same time.

Desktop (please complete the following information):

cosmicBboy commented 3 years ago

fixed by #641