Closed markgdawson closed 1 year ago
This is a well-written issue: very clear, and it makes complete sense to me. We will follow up on this; alternatively, a PR with either of the fixes you suggested would be great.
Thanks for the kind words @cnuernber. I've created a simple PR. I've gone with the approach of keeping a list of types to exclude from the default extraction for objects for which `PyMapping_Check` is true. I'd already found another case (`Union`) with slightly different behavior (it crashes on `PyMapping_Items`), and presumably there are more cases like these.
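To make the exclusion-list idea concrete, here is a rough Python model of it. This is not the code from the PR (libpython-clj is written in Clojure): `EXCLUDED_TYPES`, `looks_like_mapping` and `to_plain` are hypothetical names, and the `__getitem__` test is only a simplified stand-in for `PyMapping_Check`.

```python
import types
import typing

# Hypothetical exclusion list: types that pass a __getitem__-based mapping
# test but cannot be copied item by item.
EXCLUDED_TYPES = (types.GenericAlias, type(typing.Union[int, str]))


def looks_like_mapping(obj):
    # Simplified stand-in for PyMapping_Check: only tests for __getitem__.
    return hasattr(type(obj), "__getitem__")


def to_plain(obj):
    """Copy obj into plain Python data, skipping the excluded types."""
    if isinstance(obj, EXCLUDED_TYPES):
        # Take the generic fallback instead of attempting obj.items().
        return repr(obj)
    if isinstance(obj, (str, bytes)):
        return obj
    if looks_like_mapping(obj) and hasattr(obj, "items"):
        return {key: to_plain(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_plain(value) for value in obj]
    return obj


converted = to_plain({"a": [1, 2], "b": list[int]})
```

The point of the sketch is only the ordering: the exclusion test runs before the mapping test, so a `GenericAlias` never reaches the branch that would try to extract its items.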
Seems to be fixed in 0.025.
`require-python` does not load any functions which are annotated with a parametric list or set in their arguments or return values.

For example, if we have the following functions in a module `test_module`:

and we require this with:
the function `tm/test_fn1` will be available in the current namespace, but not `tm/test_fn2`.

This happens because `require-python` first uses `datafy` to turn the Python module into a Clojure map, which is then iterated over to add Python functions to the newly created `tm` Clojure namespace.

However,
`datafy` will ignore function `test_fn2` (with warnings) and return a map which contains `test_fn1` but not `test_fn2`. This happens because, while the function metadata/annotations are being read in the `datafy` call, the `->jvm` function of the `PCopyToJVM` protocol is called to convert the function annotation information to JVM objects. The `->jvm` function can't handle the annotations for `test_fn2` and throws an exception, which causes the function to be skipped.

It seems that annotations for parametric types of collections like
`list[int]` and `set[int]` are of type `GenericAlias`. This type is not handled correctly by `->jvm`. Calls with these types are dispatched to the `:default` implementation of `py-proto/pyobject->jvm`. This fails because (surprisingly?) these `GenericAlias` objects pass `PyMapping_Check` (here).

The docs suggest that the presence of `__getitem__` is the only criterion required for `PyMapping_Check` to return 1 (i.e. "pass"). The Python documentation for `GenericAlias` also suggests that it does indeed implement a `__getitem__` method, but only in order to throw an exception "to disallow mistakes".

Unsurprisingly, the subsequent attempts in the
`:default` handler to extract items from this generic type object fail. In the current implementation the `GenericAlias` object will first fail the check for `PyMapping_Items` here and then fail the check for `PySequence_Check` here. The resulting exception bubbles up to the `datafy` outer loop.

A potential super-simple fix affecting generic types only could be to change the `PyMapping_Check` here to something like this:

I've tested this and it fixes the loading issues with `require-python` in this case.

An alternative would be to implement a new method for the
`py-proto/pyobject->jvm` multimethod for dispatch value `:generic-alias`, which does the same as the final `:else` in the default handler.

Another (more involved) approach could be to move the `cond` dispatch logic from the method for `:default` into the `defmulti` call and attempt to dispatch directly to relevant methods based on the results of `python-type` along with `PyMapping_Check`, `PySequence_Check` and `PyMapping_Items`.