Closed internetofjames closed 3 years ago
I found a temporary work-around - use keyword arguments.
import pandera as pa
import pandera.typing as pat
class A(pa.SchemaModel):
a: pat.Series[int]
class Config:
strict = True
class B(A):
b: pat.Series[int]
class Config:
strict = True
class X:
@pa.check_types
def f(self, df: pat.DataFrame[A], df2: pat.DataFrame[B]) -> None:
pass
def g(self, df: pat.DataFrame[A], df2: pat.DataFrame[B]) -> None:
A.validate(df)
B.validate(df2)
a = pat.DataFrame({'a': [1, 2, 3]})
b = a.copy()
b[B.b] = 10
X().f(df=a, df2=b) # OK - expected -> temporary work-around
X().g(a, b) # OK - expected
X().f(a, b) # NOT OK - not expected
Thanks for the bug report and clear examples. The issue seems to be that the implicit positional argument self
or cls
is not re-injected when calling the method after having validated its inputs. https://github.com/pandera-dev/pandera/blob/a0813b7d85c5564fa98554196d459477ca2d272f/pandera/decorators.py#L506
@cosmicBboy I can take care of that.
First, a few quick reminders:
classmethods
can be thought of as instance methods on a metaclass). That is, regular methods and class methods are bound, while static methods are unbound.wrapt.decorator
does some further argument processing: separating the instance (if any) from the rest of the positional arguments. By contrast, functools.wraps
simply passes the arguments (positional and keyword) through.The problem here is that when we re-bind argument values (that come via wrapt
) to argument names (that come via inspect.signature
) there is a mismatch when dealing with "self".
@jeffzi @cosmicBboy I already fixed it, so I will just open a pull request. But below is the main part of the fix. This is similar to how wrapt
itself checks whether the instance is None
: e.g. https://github.com/GrahamDumpleton/wrapt/blob/118cdf51bb3d43e22886fea91112361144ed3ee0/src/wrapt/decorators.py#L248
@wrapt.decorator
def _wrapper(
wrapped: Callable,
instance: Optional[Any],
args: Tuple[Any, ...],
kwargs: Dict[str, Any],
):
def validate_args(arguments: OrderedDict[str, Any]) -> Dict[str, Any]:
return {arg_name: _check_arg(arg_name, arg_value) for arg_name, arg_value in arguments.items()}
if instance is not None:
# If the wrapped function is a method -> add "self" as the first positional arg
args = (instance, *args)
validated_pos = validate_args(sig.bind_partial(*args).arguments)
validated_kwd = validate_args(sig.bind_partial(**kwargs).arguments)
if instance is not None:
# If the decorated func is a method, wrapped is already a bound method -> remove the "self" argument
first_pos_arg = list(sig.parameters)[0]
del validated_pos[first_pos_arg]
out = wrapped(*validated_pos.values(), **validated_kwd)
return _check_arg("return", out)
Thanks all -- I appreciate the good discussion and quick work! pandera
looks to be a very promising tool and I am excited to see where the future takes it! 😎
fixed by #530
I am using the latest
pandera
version 0.6.4. I noticed while usingSchemaModel
s that thepandera.check_types
decorator does not validate annotated arguments of a method, whether it be a regular method, static method, or class method. Below is a simpleSchemaModel
, and a simple class with a method that simply prints the columns of the DataFrame.Expected behavior
Since the schema's
Config
hasstrict = "filter"
, I should see the extrafloat_col
of the DataFrame dropped automatically before the columns are printed. Compared to the identical function below, that is not the case.Output: