clj-python / libpython-clj

Python bindings for Clojure
Eclipse Public License 2.0
1.08k stars 68 forks source link

Error when using `require-python` from a required namespace (specifically numba) #213

Open metasoarous opened 2 years ago

metasoarous commented 2 years ago

I'm trying to create a namespace that initializes the python connection, and goes on to define some analytic functions which can then be required from other namespaces. If I evaluate this namespace in the REPL, everything works fine. However, when I try to require this namespace from another I get TypeError: must be real number, not Float.

Click for stacktrace details
Caused by: java.lang.Exception: TypeError: must be real number, not Float
     at libpython_clj2.python.ffi$check_error_throw.invokeStatic(ffi.clj:703)
     at libpython_clj2.python.ffi$check_error_throw.invoke(ffi.clj:701)
     at libpython_clj2.python.fn$call_py_fn.invokeStatic(fn.clj:179)
     at libpython_clj2.python.fn$call_py_fn.invoke(fn.clj:177)
     at libpython_clj2.python.bridge_as_jvm$call_impl_fn.invokeStatic(bridge_as_jvm.clj:246)
     at libpython_clj2.python.bridge_as_jvm$call_impl_fn.invoke(bridge_as_jvm.clj:241)
     at libpython_clj2.python.bridge_as_jvm$make_instance_pycall$fn__27409.doInvoke(bridge_as_jvm.clj:269)
     at clojure.lang.RestFn.invoke(RestFn.java:423)
     at libpython_clj2.python.bridge_as_jvm$generic_python_as_map$reify__27414.get(bridge_as_jvm.clj:295)
     at libpython_clj2.python.bridge_as_jvm$generic_python_as_map$reify__27414$fn__27419.invoke(bridge_as_jvm.clj:324)
     at clojure.core$map$fn__5935.invoke(core.clj:2770)
     at clojure.lang.LazySeq.sval(LazySeq.java:42)
     at clojure.lang.LazySeq.seq(LazySeq.java:51)
     at clojure.lang.ChunkedCons.chunkedNext(ChunkedCons.java:59)
     at clojure.lang.ChunkedCons.next(ChunkedCons.java:43)
     at clojure.lang.RT.next(RT.java:713)
     at clojure.lang.SeqIterator.hasNext(SeqIterator.java:41)
     at clojure.core.protocols$iter_reduce.invokeStatic(protocols.clj:48)
     at clojure.core.protocols$fn__8230.invokeStatic(protocols.clj:75)
     at clojure.core.protocols$fn__8230.invoke(protocols.clj:75)
     at clojure.core.protocols$fn__8178$G__8173__8191.invoke(protocols.clj:13)
     at clojure.core$reduce.invokeStatic(core.clj:6886)
     at clojure.core$into.invokeStatic(core.clj:6958)
     at clojure.core$into.invoke(core.clj:6950)
     at libpython_clj2.metadata$datafy_module_or_class.invokeStatic(metadata.clj:236)
     at libpython_clj2.metadata$datafy_module_or_class.invoke(metadata.clj:230)
     at libpython_clj2.metadata$eval31858$fn__31859.invoke(metadata.clj:261)
     at clojure.lang.MultiFn.invoke(MultiFn.java:229)
     at libpython_clj2.python.bridge_as_jvm$generic_pyobject$reify__27599.datafy(bridge_as_jvm.clj:436)
     at clojure.datafy$datafy.invokeStatic(datafy.clj:23)
     at clojure.datafy$datafy.invoke(datafy.clj:15)
     at libpython_clj2.require$do_require_python.invokeStatic(require.clj:130)
     at libpython_clj2.require$do_require_python.invoke(require.clj:86)
     at libpython_clj2.require$require_python$fn__31971.invoke(require.clj:273)
     at libpython_clj2.require$require_python.invokeStatic(require.clj:265)
     at libpython_clj2.require$require_python.invoke(require.clj:173)
     at libpython_clj2.require$require_python.invokeStatic(require.clj:278)
     at libpython_clj2.require$require_python.doInvoke(require.clj:173)
     at clojure.lang.RestFn.applyTo(RestFn.java:139)
     at clojure.core$apply.invokeStatic(core.clj:667)
     at clojure.core$apply.invoke(core.clj:662)
     at libpython_clj2.require$require_python.invokeStatic(require.clj:280)
     at libpython_clj2.require$require_python.doInvoke(require.clj:173)
     at clojure.lang.RestFn.applyTo(RestFn.java:139)
     at clojure.core$apply.invokeStatic(core.clj:667)
     at clojure.core$apply.invoke(core.clj:662)
     at libpython_clj2.require$require_python.invokeStatic(require.clj:280)
     at libpython_clj2.require$require_python.doInvoke(require.clj:173)
     at clojure.lang.RestFn.applyTo(RestFn.java:139)
     at clojure.core$apply.invokeStatic(core.clj:667)
     at clojure.core$apply.invoke(core.clj:662)
     at libpython_clj2.require$require_python.invokeStatic(require.clj:280)
     at libpython_clj2.require$require_python.doInvoke(require.clj:173)
     at clojure.lang.RestFn.applyTo(RestFn.java:139)
     at clojure.core$apply.invokeStatic(core.clj:667)
     at clojure.core$apply.invoke(core.clj:662)
     at libpython_clj2.require$require_python.invokeStatic(require.clj:280)
     at libpython_clj2.require$require_python.doInvoke(require.clj:173)
     at clojure.lang.RestFn.invoke(RestFn.java:628)
     at libpython_test.core$eval32052.invokeStatic(core.clj:29)
     at libpython_test.core$eval32052.invoke(core.clj:29)
     at clojure.lang.Compiler.eval(Compiler.java:7194)
     at clojure.lang.Compiler.load(Compiler.java:7653)
     ... 50 more

I set up a basic reproduction of this error at https://github.com/metasoarous/libpython-require-error in the require-init-bug branch (you'll note that this is the same repo as I put together for #212, only I switched to jdk-8 to get baseline functionality working again).

Any idea why I'm seeing this behavior, and if so how I might avoid it?

jjtolton commented 2 years ago

This is hard to say without looking at some more code. What I can tell you is that TypeError: must be real number, not Float is a Python error, not a libpython-clj error per se. While it is possible that there is a datatype conversion error, my intuition is that there is an error in the Python code.

metasoarous commented 2 years ago

This is hard to say without looking at some more code.

Here's the specific code, which fails depending on how I evaluate:

https://github.com/metasoarous/libpython-require-error/blob/require-init-bug/src/libpython_test/core.clj

What I can tell you is that TypeError: must be real number, not Float is a Python error, not a libpython-clj error per se. While it is possible that there is a datatype conversion error, my intuition is that there is an error in the Python code.

OK; Interesting. Any idea why this error would only crop up when required from a secondary/dependent namespace, but not when evaluated directly?

Just for additional context, the code all works when I directly evaluate the libpython-test.core namespace first. After that, I can require libpython-test.core from dev/user.clj, and it works fine. But if I try to require from dev/user.clj before directly evaluating libpython-test.core I get the error above.

cnuernber commented 2 years ago

It is really tough to debug this as we don't know which specific require is causing the drama. It is happening during the scan-metadata phase of require-python so there must be some issue with scanning a given module more than once.

One work-around is to not use require-python but to use raw calls using the base python api. Another would be to statically generate bindings to those modules offline and use the static classes. Aside from that we would need to know exactly which module is getting irritated when we are scanning its attributes and sub-objects.

We are doing a lot of magic in require-python - static code generation is a more conservative route which uses the same metadata system and then using the base python api via libpython-clj2.python is the most conservative route.

jjtolton commented 2 years ago

Yeah very interesting. I wouldn’t mind diving into this a bit more to see if it’s a libpython-clj stability issue if you're willing to be a bit patient, Chris S. We can take it offline if there’s anything sensitive. I’ll take a look at the repo you mentioned and see if anything stands out.

On Sat, Sep 3, 2022 at 6:58 PM Chris Nuernberger @.***> wrote:

It is really tough to debug this as we don't know which specific require is causing the drama. It is happening during the scan-metadata phase of require-python so there must be some issue with scanning a given module more than once.

One work-around is to not use require-python but to use raw calls using the base python api. Another would be to statically generate bindings to those modules offline and use the static classes https://clj-python.github.io/libpython-clj/libpython-clj2.codegen.html#var-write-namespace.21. Aside from that we would need to know exactly which module is getting irritated when we are scanning its attributes and sub-objects.

We are doing a lot of magic in require-python - static code generation is a more conservative route which uses the same metadata system and then using the base python api via libpython-clj2.python is the most conservative route.

— Reply to this email directly, view it on GitHub https://github.com/clj-python/libpython-clj/issues/213#issuecomment-1236210355, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACPJX4673COPMBWM5PU7CULV4PJXVANCNFSM6AAAAAAQEBKDQU . You are receiving this because you commented.Message ID: @.***>

jjtolton commented 2 years ago

@metasoarous one more quick question-- are you running in embedded mode? If not, would you be willing to try? It tends to be all around more stable than kicking off from Clojure

metasoarous commented 2 years ago

Again, thank you both for your time!

I'm not running in embed mode, but I'd be willing to give that a shot. I'll also take a look at compiling to static classes.

There's nothing sensitive here. The example repo I created is a minimalish extraction of the relevant code in https://github.com/compdemocracy/polis-analysis, so it's all nonprofit/oss :-)

Just thinking out loud a bit: I wonder if this might be specifically related to numba, which is required for implementing custom umap metrics. It seems to be a bit hairy in places, and gave me some hell on the dependency side of things just getting the python environment all set up (in particular, was somehow not installable on the clojure:jdk-11-tools-deps image using the steps currently in the Dockerfile). It's doing some very fancy stuff too, so I could easily see it causing this kind of problem.

metasoarous commented 2 years ago

I tried splitting up calls to require-python tonight, and indeed, my suspicion was correct; It gets through everything up to setting up numba.

I updated my reproduction repo (master branch), and the action right now is these two files:

jjtolton commented 2 years ago

One thing to keep in mind is that the libpython-clj2.require namespace initializes python via py/initialize!. This may not always be desirable.

jjtolton commented 2 years ago

Will take a look today and see if I can spot anything

jjtolton commented 2 years ago

@metasoarous did you happen to see this? https://github.com/clj-python/libpython-clj/issues/194#issuecomment-1237966634

Found a possible instability when using lower level libpython-clj import functions