coady / multimethod

Multiple argument dispatching.
https://coady.github.io/multimethod

Strange behaviour with pandas function application methods #85

Closed MordorianGuy closed 1 year ago

MordorianGuy commented 1 year ago

I have found strange behaviour when a multimethod object is passed as the function to the function-application methods of pandas objects (apply, transform, map).

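import pandas as pd
from multimethod import multimethod

@multimethod
def test_multimethod(x: int) -> None:
    print("int")

@multimethod
def test_multimethod(x: float) -> None:
    print("float")

# Object dtype keeps the Python int and float values distinct
pd.Series([1, 1.0], dtype="O").apply(test_multimethod)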

The code above gives the following result:

int
int
float
float
<class 'int'>    0    None
                 1    None
<class 'float'>  0    None
                 1    None
dtype: object

In the case of transform we get a DataFrame:

int
int
float
float
  <class 'int'> <class 'float'>
0          None            None
1          None            None

map raises the following exception:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[107], line 11
      6 @multimethod
      7 def test_multimethod(x: float) -> None:
      8     print("float")
---> 11 print(pd.Series([1, 1.0], dtype="O").map(test_multimethod))

File ~\anaconda3\envs\visualisation\Lib\site-packages\pandas\core\series.py:4539, in Series.map(self, arg, na_action)
   4460 def map(
   4461     self,
   4462     arg: Callable | Mapping | Series,
   4463     na_action: Literal["ignore"] | None = None,
   4464 ) -> Series:
   4465     """
   4466     Map values of Series according to an input mapping or function.
   4467 
   (...)
   4537     dtype: object
   4538     """
-> 4539     new_values = self._map_values(arg, na_action=na_action)
   4540     return self._constructor(new_values, index=self.index).__finalize__(
   4541         self, method="map"
   4542     )

File ~\anaconda3\envs\visualisation\Lib\site-packages\pandas\core\base.py:890, in IndexOpsMixin._map_values(self, mapper, na_action)
    887         raise ValueError(msg)
    889 # mapper is a function
--> 890 new_values = map_f(values, mapper)
    892 return new_values

File ~\anaconda3\envs\visualisation\Lib\site-packages\pandas\_libs\lib.pyx:2924, in pandas._libs.lib.map_infer()

File ~\anaconda3\envs\visualisation\Lib\site-packages\pandas\core\base.py:825, in IndexOpsMixin._map_values.<locals>.<lambda>(x)
    821 if isinstance(mapper, dict) and hasattr(mapper, "__missing__"):
    822     # If a dictionary subclass defines a default value method,
    823     # convert mapper to a lookup function (GH #15999).
    824     dict_with_default = mapper
--> 825     mapper = lambda x: dict_with_default[x]
    826 else:
    827     # Dictionary does not have a default. Thus it's safe to
    828     # convert to an Series for efficiency.
   (...)
    834     # of dtype float64 the return value of this method should
    835     # be float64 as well
    836     mapper = create_series_with_explicit_dtype(
    837         mapper, dtype_if_empty=np.float64
    838     )

File ~\anaconda3\envs\visualisation\Lib\site-packages\multimethod\__init__.py:299, in multimethod.__missing__(self, types)
    297     return self[types]
    298 groups = collections.defaultdict(list)
--> 299 for key in self.parents(types):
    300     if key.callable(*types):
    301         groups[types - key].append(key)

File ~\anaconda3\envs\visualisation\Lib\site-packages\multimethod\__init__.py:248, in multimethod.parents(self, types)
    246 def parents(self, types: tuple) -> set:
    247     """Find immediate parents of potential key."""
--> 248     parents = {key for key in self if isinstance(key, signature) and key < types}
    249     return parents - {ancestor for parent in parents for ancestor in parent.parents}

File ~\anaconda3\envs\visualisation\Lib\site-packages\multimethod\__init__.py:248, in <setcomp>(.0)
    246 def parents(self, types: tuple) -> set:
    247     """Find immediate parents of potential key."""
--> 248     parents = {key for key in self if isinstance(key, signature) and key < types}
    249     return parents - {ancestor for parent in parents for ancestor in parent.parents}

File ~\anaconda3\envs\visualisation\Lib\site-packages\multimethod\__init__.py:184, in signature.__lt__(self, other)
    183 def __lt__(self, other: tuple) -> bool:
--> 184     return self != other and self <= other

File ~\anaconda3\envs\visualisation\Lib\site-packages\multimethod\__init__.py:181, in signature.__le__(self, other)
    180 def __le__(self, other: tuple) -> bool:
--> 181     return len(self) <= len(other) and all(map(issubclass, other, self))

TypeError: object of type 'int' has no len()

Wrapping it in a simple lambda (lambda x: test_multimethod(x)) avoids the problem and works as expected.

coady commented 1 year ago

A multimethod is a dict subclass that defines __missing__, so pandas interprets it as a mapper instead of a function.

    821 if isinstance(mapper, dict) and hasattr(mapper, "__missing__"):
    822     # If a dictionary subclass defines a default value method,
    823     # convert mapper to a lookup function (GH #15999).
    824     dict_with_default = mapper
--> 825     mapper = lambda x: dict_with_default[x]

So I think it's just incompatible goals. You could use test_multimethod.__call__ as a workaround, if you prefer that to lambda.
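
The mechanism described above can be reproduced without pandas or multimethod installed. The following is a minimal sketch, not the actual code of either library: `TypeDispatcher` and `map_values` are hypothetical stand-ins, written on the assumption that the only relevant facts are (a) the dispatcher is a dict subclass with `__missing__`, keyed by tuples of types, and (b) the mapping routine indexes such objects instead of calling them, as in the traceback excerpt.

```python
class TypeDispatcher(dict):
    """Hypothetical stand-in for a multimethod: a callable dict keyed by type tuples."""

    def __missing__(self, types):
        # Invoked by dict indexing on a missing key; expects a tuple of types.
        for key, func in list(self.items()):
            if len(key) == len(types) and all(map(issubclass, types, key)):
                return func
        raise KeyError(types)

    def __call__(self, *args):
        # Normal call path: look up the implementation by argument types.
        return self[tuple(map(type, args))](*args)


def map_values(values, mapper):
    """Simplified imitation of the dict-with-default branch shown in the traceback."""
    if isinstance(mapper, dict) and hasattr(mapper, "__missing__"):
        lookup = mapper
        mapper = lambda x: lookup[x]  # indexes the dict instead of calling it
    return [mapper(v) for v in values]


dispatch = TypeDispatcher()
dispatch[(int,)] = lambda x: "int"
dispatch[(float,)] = lambda x: "float"

values = [1, 1.0]

# Passing the dict subclass directly: each value is used as a key,
# so __missing__ receives an int where it expects a tuple of types.
error = None
try:
    map_values(values, dispatch)
except TypeError as exc:
    error = exc

# Both workarounds hand over an object that is not a dict instance.
via_lambda = map_values(values, lambda x: dispatch(x))
via_call = map_values(values, dispatch.__call__)
```

In this sketch the direct call fails with a TypeError because a value, not a tuple of types, reaches `__missing__`, while the lambda and the bound `__call__` both go through the normal call path. The bound method works because it is not a dict instance, so the `isinstance(mapper, dict)` check no longer matches.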