Open marius311 opened 2 years ago
That would be so awesome!
Here is cjdoris's response to marius311 on the topic: https://discourse.julialang.org/t/ann-pythoncall-and-juliacall/76778/16
I’ve encountered that issue before in pyjulia but don’t actually know its cause.
I imagine the difference is in how the packages load libpython. In JuliaCall, we pass ctypes.pythonapi._handle to PythonCall, which is a pointer to an already-open libpython. I assume PyJulia/PyCall opens libpython itself.
Indeed, he's right: https://docs.python.org/3/library/ctypes.html
ctypes.pythonapi An instance of PyDLL that exposes Python C API functions as attributes. Note that all these functions are assumed to return C int, which is of course not always the truth, so you have to assign the correct restype attribute to use these functions.
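As a small illustration of the restype note quoted above, here is a minimal sketch that calls Py_GetVersion through ctypes.pythonapi. It assumes a CPython interpreter; the variable names are chosen only for this example.

```python
import ctypes
import sys

# ctypes assumes every foreign function returns a C int unless told otherwise.
# Py_GetVersion returns a const char*, so declare that before calling it.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.restype = ctypes.c_char_p
get_version.argtypes = []

version = get_version().decode()

# The handle is a plain integer holding the system handle of the library.
handle = ctypes.pythonapi._handle

print(version)
print(hex(handle))
```

Without the restype assignment the returned pointer would be truncated to a C int, which is why the documentation calls this out.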
$ ldd `which python`
linux-vdso.so.1 (0x00007ffe269cb000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fcf5c547000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fcf5c53f000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fcf5c537000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fcf5c52f000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fcf5c447000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fcf5c21f000)
/lib64/ld-linux-x86-64.so.2 (0x00007fcf5c917000)
$ `which python`
Python 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:06:46) [GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes
>>> ctypes.pythonapi
<PyDLL 'None', handle 7f106618a2e0 at 0x7f10655c2980>
@mkitti so PyCall/pyjulia could do that as well?
I think so. We technically just need the pointer.
Oh that would be awesome! I guess packages like PySR (@MilesCranmer), diffeqpy (@ChrisRackauckas) and so on would benefit a lot from that as well.
I would like to review the situation here.
Part of the issue is that pyjulia is only half of the equation here. The other half is PyCall.jl.
In https://github.com/JuliaPy/PyCall.jl/issues/612, they were trying to load the python executable as libpython due to PIE (Position Independent Executables).
In the linked comment above, @cjdoris demonstrates that we do not need to load the python executable or libpython, since we could just reuse ctypes.pythonapi._handle as is done in juliacall / PythonCall. In juliacall, the pointer is passed through an environment variable.
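The idea of handing over an already-open handle through an environment variable can be sketched in Python alone. This is only an illustration under assumptions: the variable name EXAMPLE_LIBPYTHON_HANDLE is hypothetical and is not claimed to be what juliacall uses, and both the "sender" and "receiver" sides run in the same process here.

```python
import ctypes
import os

# Hypothetical variable name, chosen only for this sketch.
ENV_VAR = "EXAMPLE_LIBPYTHON_HANDLE"

# Sender side: publish the already-open libpython handle as a decimal string.
os.environ[ENV_VAR] = str(ctypes.pythonapi._handle)

# Receiver side: parse the integer back and wrap it without opening anything.
# Reusing pythonapi._name keeps the constructor call valid on each platform.
recovered = int(os.environ[ENV_VAR])
lib = ctypes.PyDLL(ctypes.pythonapi._name, handle=recovered)

initialized = lib.Py_IsInitialized()
print(recovered == ctypes.pythonapi._handle, initialized)
```

Because a handle is supplied, ctypes does not call dlopen again; it simply resolves symbols through the existing handle.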
How is ctypes.pythonapi._handle loaded when Python is statically linked to libpython?
Looking into ctypes, we see that pythonapi is set to PyDLL(None). The name argument and the _name field of PyDLL, a subclass of CDLL, are set to None.
>>> import ctypes
>>> ctypes.pythonapi
<PyDLL 'None', handle 7f084713e2e0 at 0x7f08464d3e10>
>>> ctypes.pythonapi._name
>>> ctypes.pythonapi._name == None
True
_name is subsequently passed to _dlopen, which on POSIX systems is just the libdl C routine dlopen.
If we look at the man page for dlopen(3), we see that this call to dlopen will return a handle to the executable.
If filename is NULL, then the returned handle is for the main program.
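The man page behavior can be checked from Python itself. The following is a minimal sketch that assumes a POSIX system (later in the thread it is noted that the equivalent call fails on Windows) and a CPython whose C API symbols are visible from the main program.

```python
import ctypes

# On POSIX, a name of None reaches dlopen() as NULL, which yields a handle
# for the main program instead of loading a new library.
main_program = ctypes.PyDLL(None)

# Symbols of the running interpreter resolve through that handle.
initialized = main_program.Py_IsInitialized()

# Compare with the handle ctypes opened for pythonapi at import time.
same_handle = main_program._handle == ctypes.pythonapi._handle
print(initialized, same_handle)
```

Whether the two handles compare equal depends on the platform's dlopen implementation, so the sketch prints the comparison rather than relying on it.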
Can we use dlopen in Julia?
This suggests that we can use dlopen from Julia to obtain the same pointer. While there are a few layers of indirection involved, passing an empty string to Julia's Libdl.dlopen appears to work.
# Start from ipython
In [1]: import ctypes
In [2]: hex(ctypes.pythonapi._handle)
Out[2]: '0x7f054fcae2e0'
In [3]: from julia.api import LibJulia
In [4]: api = LibJulia.load()
In [5]: api.init_julia()
In [6]: api
Out[6]: <julia.libjulia.LibJulia at 0x7f054c735fd0>
# Launch Julia REPL from Python
In [7]: api.jl_eval_string(b"""
...: import REPL;
...: term = REPL.Terminals.TTYTerminal("dumb", stdin, stdout, stderr);
...: repl = REPL.LineEditREPL(term, true);
...: REPL.run_repl(repl);
...: """)
julia> using Libdl
julia> python_ptr = dlopen("")
Ptr{Nothing} @0x00007f054fcae2e0
We see above that the pointer from ctypes.pythonapi._handle is exactly the same pointer we obtain by invoking Libdl.dlopen("") in Julia.
julia> Py_IsInitialized = dlsym(python_ptr, :Py_IsInitialized)
Ptr{Nothing} @0x0000555d21fd3890
julia> ccall(Py_IsInitialized, Cint, ())
1
julia> Py_GetVersion = dlsym(python_ptr, :Py_GetVersion)
Ptr{Nothing} @0x0000555d21fe0580
julia> ccall(Py_GetVersion, Cstring, ()) |> unsafe_string
"3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:24:40) [GCC 10.4.0]"
We can obtain ctypes.pythonapi._handle by calling dlopen("") in Julia when started from Python. For juliacall, an environment variable may not need to be used to transmit the pointer. For pyjulia and PyCall.jl, this simplifies the method to obtain the pythonapi pointer.
That's cool!
I just took a quick look from JuliaCall and it's true that dlopen("") returns the same handle on Linux, but it throws an error on Windows:
could not load library ""
The parameter is incorrect.
Plus the behaviour of dlopen("") is undocumented, so personally I'm steering clear of it.
While I agree that dlopen("") is undocumented at the Julia API level, it does correspond to the documented behavior at the C API level.
The use of ctypes.pythonapi._handle is equally undocumented. The underlying mechanism basically depends on the same behavior.
Actually ctypes.pythonapi is documented to be a PyDLL and PyDLL._handle is documented to be the system handle. In this case the underscore does not indicate an internal attribute; it is there to avoid name clashes with symbols in the DLL.
You're right, I concede the point.
https://docs.python.org/3/library/ctypes.html#ctypes.PyDLL._handle
Also, dlopen("") does not work on macOS and really should be dlopen(C_NULL), which doesn't work either. See https://github.com/JuliaLang/julia/issues/22318. One would have to do
ccall(:jl_load_dynamic_library, Ptr{Cvoid}, (Ptr{Nothing},UInt32,Cint), C_NULL, RTLD_GLOBAL, Cint(1))
That does work.
On macOS, ctypes.pythonapi._handle is 0xfffffffffffffffe.
In [1]: import ctypes
In [2]: ctypes.pythonapi._handle
Out[2]: 18446744073709551614
In [3]: hex(ctypes.pythonapi._handle)
Out[3]: '0xfffffffffffffffe'
This is actually the value of RTLD_DEFAULT: https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/dlsym.3.html
If dlsym() is called with the special handle RTLD_DEFAULT, then all mach-o images in the process (except those loaded with dlopen(xxx, RTLD_LOCAL)) are searched in the order they were loaded. This can be a costly search and should be avoided.
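To see why 0xfffffffffffffffe is the RTLD_DEFAULT pseudo-handle, note that macOS's dlfcn.h defines RTLD_DEFAULT as ((void *) -2). A short sketch of the arithmetic, assuming 64-bit pointers:

```python
import ctypes

# Value reported by ctypes.pythonapi._handle on macOS in the thread.
handle = 0xFFFFFFFFFFFFFFFE

# Reinterpret the unsigned 64-bit pattern as a signed 64-bit integer.
as_signed = ctypes.c_int64(handle).value
print(as_signed)  # -2
```

So the "handle" is not an address of a loaded image at all, which matches the man page's description of a process-wide search.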
🤯
I've never actually tried JuliaCall on Mac. I wonder if it works. I should really set up tests and CI.
Edit: It works fine! And indeed the handle is that special value.
That very last sentence ("this can be a costly search") may explain why loading ~100 symbols takes so long in PythonCall (~1 sec), which would be one reason why PyCall is much faster to load.
On macOS, you can just dlopen the executable. At the moment the timing does not look terrible.
In [1]: from julia.api import LibJulia
In [2]: api = LibJulia.load()
In [3]: api.init_julia()
In [4]: api.jl_eval_string(b"""
...: import REPL;
...: term = REPL.Terminals.TTYTerminal("dumb", stdin, stdout, stderr);
...: repl = REPL.LineEditREPL(term, true);
...: REPL.run_repl(repl);
...: """)
julia> python_path = ccall(:_dyld_get_image_name, Cstring, (UInt32,), 0) |> unsafe_string
"~/miniforge3-x86_64/envs/pyjulia_test_x86_64/bin/python3.11"
julia> python_handle = dlopen(python_path)
Ptr{Nothing} @0x000000021ba297e0
julia> Py_IsInitialized = dlsym(python_handle, :Py_IsInitialized)
Ptr{Nothing} @0x0000000104d78020
julia> ccall(Py_IsInitialized, Cint, ())
1
julia> @btime dlsym(python_handle, :Py_IsInitialized)
253.612 ns (1 allocation: 16 bytes)
Ptr{Nothing} @0x0000000104d78020
julia> RTLD_DEFAULT = Ptr{Nothing}(0xfffffffffffffffe)
Ptr{Nothing} @0xfffffffffffffffe
julia> @btime dlsym(RTLD_DEFAULT, :Py_IsInitialized)
270.565 ns (1 allocation: 16 bytes)
Ptr{Nothing} @0x0000000104d78020
PyCall.jl does a lot of symbol loading during precompilation. That is going to make it difficult to use this pointer, and it is also why PyCall.jl doesn't work with a statically linked python executable unless compiled_modules = false (i.e. no precompilation).
My thought is that this could benefit from a lazy symbol loading scheme such as the one I put into GR.jl: https://github.com/jheinen/GR.jl/blob/db3e5f53738be892b23317d673179a32b0e50910/src/funcptrs.jl#L74-L86
Yeah thanks, I've got something similar in a branch somewhere....
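For readers unfamiliar with the idea, lazy symbol loading can be sketched as follows. This is a Python illustration of the general technique only, not the GR.jl code linked above nor anything from PyCall or PythonCall; the LazySymbols class and its lookups counter are hypothetical names introduced for this example.

```python
import ctypes


class LazySymbols:
    """Resolve function addresses on first use and cache them."""

    def __init__(self, lib):
        self._lib = lib
        self._addresses = {}
        self.lookups = 0  # counts real symbol lookups, for illustration

    def address(self, name):
        if name not in self._addresses:
            self.lookups += 1
            func = getattr(self._lib, name)  # symbol lookup happens here
            self._addresses[name] = ctypes.cast(func, ctypes.c_void_p).value
        return self._addresses[name]


symbols = LazySymbols(ctypes.pythonapi)
first = symbols.address("Py_IsInitialized")
second = symbols.address("Py_IsInitialized")  # served from the cache
print(first == second, symbols.lookups)
```

Nothing is resolved at construction time, so the cost of each lookup is paid only for symbols that are actually used, and only once the handle is known at run time.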
I'm not an expert and don't know the internals, but is there a reason PyCall can't do whatever PythonCall / juliacall does that lets the user use any Python executable, including ones with a statically linked libpython? Is there anything preventing what they're doing from being used here? A probably related question is posted here: https://github.com/JuliaPy/PyCall.jl/issues/988