At the moment, our `#[pymethods]` implementation of `__call__` sets the `tp_call` slot, which receives its arguments as a tuple and a dictionary.
As an optimization, we could instead set `__vectorcalloffset__` to direct Python to call these objects through the vectorcall protocol rather than `tp_call`. I think this is probably supported on Python 3.9 and up.
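To make the difference between the two calling conventions concrete, here is a rough, self-contained model in plain Rust. It is only a sketch: `Obj`, `call_via_tp_call` and `call_via_vectorcall` are hypothetical names made up for illustration, not PyO3 or CPython APIs, and an `i64` stands in for a Python object pointer. The callee sums its positional arguments and multiplies by an optional `scale` keyword.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a Python object; real code would deal in
// `*mut PyObject` pointers instead.
type Obj = i64;

/// Model of the `tp_call` convention: positional arguments arrive packed in a
/// tuple (here a slice) and keyword arguments in a dictionary (here a map).
/// The caller has to build both containers before every call.
fn call_via_tp_call(args: &[Obj], kwargs: Option<&HashMap<String, Obj>>) -> Obj {
    let scale = kwargs
        .and_then(|kw| kw.get("scale"))
        .copied()
        .unwrap_or(1);
    args.iter().sum::<Obj>() * scale
}

/// Model of the vectorcall convention: a single flat array holds the `nargs`
/// positional values followed by the keyword values, and `kwnames` lists the
/// names of those trailing values in the same order. No dictionary is built.
fn call_via_vectorcall(args: &[Obj], nargs: usize, kwnames: &[&str]) -> Obj {
    // The flat array must contain exactly the positionals plus one value per name.
    assert_eq!(args.len(), nargs + kwnames.len());
    let (positional, keyword_values) = args.split_at(nargs);
    let scale = kwnames
        .iter()
        .zip(keyword_values)
        .find(|(name, _)| **name == "scale")
        .map_or(1, |(_, value)| *value);
    positional.iter().sum::<Obj>() * scale
}
```

Both functions compute the same result; the second one just avoids allocating a tuple and a dictionary per call, which is where the hoped-for speedup would come from.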
This would take a fair bit of fiddling around in our macro code, and probably also in the `create_type_object` implementation, but the speedup might be nice! 😄
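Part of that fiddling would presumably be telling Python where the per-instance vectorcall function pointer lives, since the offset is a byte offset into the object. The sketch below shows the idea in plain Rust under heavy assumptions: `CallableObject` is a hypothetical layout invented for this example (not PyO3's actual object layout), and `VectorcallFn` is a simplified signature rather than CPython's `vectorcallfunc`.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of;

// Simplified, hypothetical signature; CPython's real `vectorcallfunc` takes
// the callable, a pointer to the arguments, `nargsf`, and `kwnames`.
type VectorcallFn = fn(args: &[i64], nargs: usize) -> i64;

// Hypothetical object layout. `repr(C)` keeps the fields in declaration
// order so the offset of `vectorcall` is predictable.
#[repr(C)]
struct CallableObject {
    refcount: usize, // stand-in for the object header
    type_tag: usize, // stand-in for the type pointer
    vectorcall: Option<VectorcallFn>,
    payload: i64,
}

/// Byte offset of the `vectorcall` field from the start of the object.
/// (Newer toolchains could use `std::mem::offset_of!` instead.)
fn vectorcall_offset() -> usize {
    let uninit = MaybeUninit::<CallableObject>::uninit();
    let base = uninit.as_ptr();
    // `addr_of!` computes the field address without reading the memory.
    let field = unsafe { addr_of!((*base).vectorcall) };
    field as usize - base as usize
}

/// Roughly what an interpreter does with the stored offset: add it to the
/// object's address and load the function pointer found there.
fn load_vectorcall(obj: &CallableObject, offset: usize) -> Option<VectorcallFn> {
    let base = obj as *const CallableObject as *const u8;
    // Sound only if `offset` really is the offset of the `vectorcall` field.
    unsafe { (base.add(offset) as *const Option<VectorcallFn>).read() }
}

/// Sample callee: sums the first `nargs` arguments.
fn sum_args(args: &[i64], nargs: usize) -> i64 {
    args[..nargs].iter().sum()
}
```

In a real implementation the offset would be registered with the type when it is created, and every instance would need the function pointer stored at that offset before it is first called.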
See the CPython reference: https://docs.python.org/3.9/c-api/typeobj.html#c.PyTypeObject.tp_vectorcall_offset