python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.18k stars 2.77k forks source link

False positive: No overload variant of "asdict" matches argument type "Type[DataclassInstance]" [call-overload] #17550

Open brianmedigate opened 1 month ago

brianmedigate commented 1 month ago

Bug Report

Mypy 1.11.0 gives me a false positive that 1.10.1 didn't.

Looks like it thinks that is_dataclass() narrows to Type[DataclassInstance], whereas in reality it should probably be something more like Type[DataclassInstance] | DataclassInstance.

Not sure if this is a typeshed issue or a mypy issue, but it might be related to the recent overload changes.

To Reproduce

https://mypy-play.net/?mypy=latest&python=3.8&gist=a561e9787a7710733b09df23d2fc5c04

from dataclasses import dataclass, asdict, is_dataclass
from typing import Any

@dataclass
class Foo:
    a: int

foo: Any = Foo(a=1)

if is_dataclass(foo):
    asdict(foo)

Expected Behavior

No errors

Actual Behavior

main.py:13: error: No overload variant of "asdict" matches argument type "Type[DataclassInstance]"  [call-overload]
main.py:13: note: Possible overload variants:
main.py:13: note:     def asdict(obj: DataclassInstance) -> Dict[str, Any]
main.py:13: note:     def [_T] asdict(obj: DataclassInstance, *, dict_factory: Callable[[List[Tuple[str, Any]]], _T]) -> _T
Found 1 error in 1 file (checked 1 source file)

Your Environment

timj commented 1 month ago

I also encountered this on Friday so thank you for filing this. We are using python 3.11. mypy 1.10 was fine.

Code is at https://github.com/lsst/daf_butler/blob/main/python/lsst/daf/butler/formatters/json.py#L135-L136

Example action failure: https://github.com/lsst/daf_butler/actions/runs/10014302029/job/27683746998

cbornet commented 1 month ago

I'm not sure it's a false positive. If is_dataclass narrows to Type[DataclassInstance] | DataclassInstance, then you need to ensure it's not Type[DataclassInstance] before passing to asdict. From your example, it would be fixed by

if is_dataclass(foo) and not isinstance(foo, type):
    asdict(foo)
brianmedigate commented 1 month ago

Hmm, good point. The error message confused me because it said argument type "Type[DataclassInstance]".

Thanks!

timj commented 1 month ago

I tried changing my code to:

137 if dataclasses.is_dataclass(inMemoryDataset) and not isinstance(inMemoryDataset, type):
138     inMemoryDataset = dataclasses.asdict(inMemoryDataset)

but whilst the code does work fine mypy is now complaining about a different thing:

python/lsst/daf/butler/formatters/json.py:138: error: Statement is unreachable  [unreachable]

so it looks like mypy doesn't really understand that that isinstance check filters out types.

cbornet commented 1 month ago

@timj not sure what your new problem is without more context. For me the isinstance did the job. BTW, inMemoryDataset = dataclasses.asdict(inMemoryDataset) will probably not please mypy as you're trying to assign a dict to a variable that mypy has typed to dataclass.

timj commented 1 month ago

Sorry. The context was the same lines from my previous comment pointing at my repo. To be concrete:

import dataclasses
from typing import Any

def to_dict(in_memory_dataset: Any) -> dict:
    as_dict: dict[str, Any]
    if dataclasses.is_dataclass(in_memory_dataset) and not isinstance(in_memory_dataset, type):
        as_dict = dataclasses.asdict(in_memory_dataset)
    elif hasattr(in_memory_dataset, "_asdict"):
        as_dict = in_memory_dataset._asdict()
    else:
        as_dict = {}
    return as_dict

gives me an error:

test.py:8: error: Statement is unreachable  [unreachable]
Found 1 error in 1 file (checked 1 source file)

Note that my mypy.ini has warn_unreachable = True in it (which doesn't seem to be the default).

brianmedigate commented 1 month ago

hmm, you're right. Here's a playground reproducing it: https://mypy-play.net/?mypy=latest&python=3.12&flags=warn-unreachable&gist=9453a339138852b03bf0d8f16286baf4

If I add a reveal_type() after just the dataclasses.is_dataclass(in_memory_dataset) check, it does think the type is main.py:8: note: Revealed type is "type[_typeshed.DataclassInstance]", not Union[DataclassInstance, type[DataclassInstance]] or whatever like I would expect: https://mypy-play.net/?mypy=latest&python=3.12&flags=warn-unreachable&gist=0948b2c430cab1d09fa0a378e772172b

Maybe this is still an issue.

MarcinKonowalczyk commented 1 month ago

bump. encountered it too after an update to mypy 1.11.0 in python 3.12.4. # type: ignore'ed for now to get on with prod.

i very much appreciate the warning about the is_dataclass narrowing to instance or class. didn't know about the latter.

timj commented 1 month ago

Thanks for the mypy playground. It shows that in mypy 1.11 the type from the is_dataclass test is:

main.py:8: note: Revealed type is "type[_typeshed.DataclassInstance]"

but the type from 1.10.1 is still Any.

The bug seems to be that after is_dataclass the type should be a type[DataclassInstance] | DataclassInstance.

brianmedigate commented 1 month ago

related: https://github.com/python/typeshed/issues/12401

Looks like it really did show up in this version of mypy because of the overload changes. Also looks like it might be a problem without an easy fix, because it's not clear there's a better alternative than the current implementation.

@timj your case can be fixed by changing the order of the checks from if dataclasses.is_dataclass(in_memory_dataset) and not isinstance(in_memory_dataset, type): to if not isinstance(in_memory_dataset, type) and dataclasses.is_dataclass(in_memory_dataset): because in the first case it matches the first overload, and then the isinstance-of-type check narrows it to unreachable, but in the second case we first narrow the type to knowing that it isn't a type, then it matches the second overload and correctly narrows to DataclassInstance

timj commented 1 month ago

@brianmedigate, thanks. Changing the order of the checks does fix my unreachable statement problem.

hauntsaninja commented 1 month ago

I believe the relevant change included in mypy 1.11 is https://github.com/python/typeshed/pull/11929

gwk commented 1 week ago

I just came across this problem in my own code. Not sure what the most correct resolution is, but here is a complete implementation of is_dataclass_instance for python 3.12 that appears to work. Any feedback is appreciated!

from typing_extensions import TypeIs # Will be included in python3.13.

class DataclassInstance(Protocol):
  'Copied from typeshed/stdlib/_typeshed/__init__.py.'
  __dataclass_fields__: ClassVar[dict[str, Field[Any]]]

def is_dataclass_instance(obj:Any) -> TypeIs[DataclassInstance]:
  '''
  Returns `True` if `obj` is a dataclass instance. Returns `False` for all type objects.
  It is often more correct to use this function rather than `is_dataclass`
  it is easy to forget that `is_dataclass` returns `True` for types
  and many use cases do not intend to let type objects through.
  This function is annotated as returning `TypeIs[DataclassInstance]`
  to aid the type checker in narrowing the type of `obj`.
  '''
  return not isinstance(obj, type) and is_dataclass(obj)