keleshev / schema

Schema validation just got Pythonic
MIT License
2.86k stars 214 forks source link

Fix `Optional` failing when `default` is some builtins #276

Open gschaffner opened 2 years ago

gschaffner commented 2 years ago

Closes #272.

Note that inspect.signature (which is used in the current _invoke_with_optional_kwargs!) is not available in Python < 3.3, which (I believe?) schema hasn't officially dropped support for yet. I know that I am just an outsider to this project, but I'd really suggest adopting #273.

Also, I think that my implementation below is not the most elegant way to deal with this issue. It might be better to instead have Optional accept a pass_kwargs parameter rather than choosing whether or not to pass kwargs based on a heuristic. pass_args could be True by default (to pass all kwargs), but could also be False or a set of strings (representing parameter names to pass). Let me know what you think!

skorokithakis commented 2 years ago

Thanks for this! Yeah, 3.3 is ancient and wasn't very widely used, so I'd be in favor of dropping support for it. Would that make your PR easier/more elegant?

gschaffner commented 2 years ago

Sorry for the delayed response!

Following @BvB93's comment above, I think the underlying problem here is how #268 passes around kwargs. Schema is trying to pass validate's kwargs to the default argument of any Optional whose default accepts any keyword arguments. Ignoring the inspect.signature(set/etc.) issue for a moment, I don't entirely understand the rationale behind this choice. It means that if one wants to write a default argument that takes any kwargs, it also needs to accept (and ignore) extra kwargs intended for other default callables. This is a bit cumbersome when writing default inline, e.g. Optional(default=lambda of_interest, **_: of_interest + 1).

Given this, though, _invoke_with_optional_kwargs(f, **kwargs) could be replaced unconditionally with f(**kwargs) if Schema adds the requirement that default arguments of Optionals must accept arbitrary kwargs. This would look like changing lambda: 42 to lambda **_: 42 in test_optional_callable_default_ignore_inherited_schema_validate_kwargs.

This approach is fundamentally flawed for other reasons, though. For example, suppose Optional(default=dict). Now validate(some_kwarg=...) will cause the dict init to be dict(some_kwarg=...)! This is unexpected and very confusing behavior for users trying to use the validate(**kwargs) feature; they would instead need to use Optional(default=lambda **_: dict()). Also note that this dict(some_kwarg=...) issue is not a problem that can be fixed by simply detecting if issubclass(f, dict); e.g. consider (a) non-stdlib typing.Mapping types that do not inherit from dict, or (b) dict subclasses with constructors that don't accept **kwargs but do accept a non-arbitrary list of kwargs that we'd like to pass via validate(), or (c) in general constructors that accept and use arbitrary kwargs.

Now, if #268 is to be kept and not rolled back, I think that the best way to deal with this is to (a) drop Python < 3.3 so that inspect.signature can be used (see #273) (or use an inspect backport for old Pythons) and (b) make _invoke_with_optional_kwargs(f, **kwargs) inspect the signature of f and pass only the subset of the kwargs that match f's signature. If f is a builtin that inspect.signature fails on, _invoke_with_optional_kwargs should simply call it without arguments.

This would be a backwards-incompatible change that would necessitate a major version bump, but AFAIK it elegantly handles all of the relevant issues here: Optional(default=set/etc.), dict(some_kwargs=...), lambda **kwargs: f"Hello, {kwargs['name']}" rather than lambda name: f"Hello, {name}".

I've implemented this below and adjusted the tests relevant tests that were added in #268 accordingly.

gschaffner commented 2 years ago

On second thought: a major version bump is technically not necessary under SemVer even if this is merged, since Schema is still in major version 0. Schema still being v0.x.x means that if the major version is bumped, the reason for that bump should be to indicate public API stability, not to indicate backwards-incompatible changes.

I am under the impression that the public API has already been stable for some time, though, so perhaps the major version should still be bumped if this is merged.