Closed merveenoyan closed 1 year ago
@E-Aho @adrinjalali This is a veritable bug, here is a minimal reproduction:
from sklearn.base import BaseEstimator
from operator import methodcaller
class MyEstimator(BaseEstimator):
def __init__(self):
pass
def fit(self, X, y):
self.meth_ = methodcaller("foo")
return self
est = MyEstimator().fit(None, None)
sio.loads(sio.dumps(est), trusted=True)
Resulting in:
--> 397 instance = cls.__new__(cls) # type: ignore
399 if not self.children["attrs"]:
400 # nothing more to do
401 return instance
TypeError: methodcaller needs at least one argument, the method name
The offending line:
This could be an interesting one to resolve. Presumably, methodcaller
is not the only type that fails like this, we should ensure that other similar types work too.
@merveenoyan methodcaller
is used here:
The script could be rewritten to not use methodcaller
and should then pass.
I was confused for a while, since methodcaller
is defined here (https://github.com/python/cpython/blob/498598e8c2d64232d26c075de87c513415176bbf/Lib/operator.py#L302) as:
class methodcaller:
"""
Return a callable object that calls the given method on its operand.
After f = methodcaller('name'), the call f(r) returns r.name().
After g = methodcaller('name', 'date', foo=1), the call g(r) returns
r.name('date', foo=1).
"""
__slots__ = ('_name', '_args', '_kwargs')
def __init__(self, name, /, *args, **kwargs):
self._name = name
if not isinstance(self._name, str):
raise TypeError('method name must be a string')
self._args = args
self._kwargs = kwargs
def __call__(self, obj):
return getattr(obj, self._name)(*self._args, **self._kwargs)
def __repr__(self):
args = [repr(self._name)]
args.extend(map(repr, self._args))
args.extend('%s=%r' % (k, v) for k, v in self._kwargs.items())
return '%s.%s(%s)' % (self.__class__.__module__,
self.__class__.__name__,
', '.join(args))
def __reduce__(self):
if not self._kwargs:
return self.__class__, (self._name,) + self._args
else:
from functools import partial
return partial(self.__class__, self._name, **self._kwargs), self._args
which doesn't implement __new__
. So it was puzzling why the error. Then I realized, like many other things in python, there's also a C implementation of it, and it implements __new__
here (https://github.com/python/cpython/blob/498598e8c2d64232d26c075de87c513415176bbf/Modules/_operator.c#L1563), which starts as:
static PyObject *
methodcaller_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
methodcallerobject *mc;
PyObject *name;
if (PyTuple_GET_SIZE(args) < 1) {
PyErr_SetString(PyExc_TypeError, "methodcaller needs at least "
"one argument, the method name");
return NULL;
}
Anyhow, now we know it's happening because methodcaller
has a custom __new__
method which requires an extra arg. Luckily not many objects implement __new__
, and I always thought for these objects we'd have a dedicated loader, like we do for set
, dict
etc. Cause I really don't want to rely on their __reduce__
lol
As for the issue on the hub, @merveenoyan you can change the code to avoid using methodcaller
, while we fix it here for this object.
Came across this error for only this model generated using the test case generation script in
api-inference-community
:*** TypeError: methodcaller needs at least one argument, the method name
To reproduce:
At first I thought something might've been wrong with how I serialize it but I think it's weird how every test passes and this model fails for inference.