JuliaPy / PythonCall.jl

Python and Julia in harmony.
https://juliapy.github.io/PythonCall.jl/stable/
MIT License

`numpy` arrays cannot be serialized when used in `pmap` context #459

Closed ma-sadeghi closed 4 months ago

ma-sadeghi commented 4 months ago

Affects: JuliaCall

Describe the bug

I'm trying to call a Julia function (from Python) that returns a vector of objects generated via pmap on multiple workers. (I'm not sure the pmap is even relevant; I mention it just in case.)

Reproduce the bug

import numpy as np
from juliacall import Main as jl

script = """using Distributed
# a = Array(a)
n = 2
_workers = WorkerPool(addprocs(n))
results = pmap(_ -> sum(a), _workers, 1:n)
println(results)
"""
script = ";".join([s for s in script.split("\n") if not s.startswith("#")])

def fn_parallel():
    jl.a = np.random.rand(5)
    jl.seval(script)

fn_parallel()

More context

The issue is specific to passing numpy arrays: there is no error when passing, say, a = 1. If the input is explicitly converted to Array, it runs fine (uncomment the `# a = Array(a)` line in the script). See #454 for the original issue.

Error message

Python: TypeError: cannot pickle 'PyCapsule' object
Python stacktrace: none
Stacktrace:
[1] pythrow()
    @ PythonCall ~/.julia/packages/PythonCall/wXfah/src/err.jl:94
[2] errcheck
    @ ~/.julia/packages/PythonCall/wXfah/src/err.jl:10 [inlined]
[3] pycallargs
    @ ~/.julia/packages/PythonCall/wXfah/src/abstract/object.jl:210 [inlined]
[4] pycall(f::Py, args::Py; kwargs::@Kwargs{})
    @ PythonCall ~/.julia/packages/PythonCall/wXfah/src/abstract/object.jl:228
[5] pycall
    @ ~/.julia/packages/PythonCall/wXfah/src/abstract/object.jl:218 [inlined]
[6] Py
    @ ~/.julia/packages/PythonCall/wXfah/src/Py.jl:341 [inlined]
[7] serialize_py(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Py)
    @ PythonCall ~/.julia/packages/PythonCall/wXfah/src/compat/serialization.jl:9
[8] serialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Py)
    @ PythonCall ~/.julia/packages/PythonCall/wXfah/src/compat/serialization.jl:25
[9] serialize_any(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
    @ Serialization ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:676
[10] serialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
    @ Serialization ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:655
[11] serialize_any(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
    @ Serialization ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:676
[12] serialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
    @ Serialization ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:655
[13] serialize_msg(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, o::Distributed.CallMsg{:call_fetch})
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/messages.jl:78
[14] #invokelatest#2
    @ ./essentials.jl:887 [inlined]
[15] invokelatest
    @ ./essentials.jl:884 [inlined]
[16] send_msg_(w::Distributed.Worker, header::Distributed.MsgHeader, msg::Distributed.CallMsg{:call_fetch}, now::Bool)
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/messages.jl:181
[17] send_msg
    @ ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/messages.jl:122 [inlined]
[18] remotecall_fetch(f::Function, w::Distributed.Worker, args::Int64; kwargs::@Kwargs{})
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/remotecall.jl:460
[19] remotecall_fetch(f::Function, w::Distributed.Worker, args::Int64)
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/remotecall.jl:454
[20] remotecall_fetch(f::Function, id::Int64, args::Int64)
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/remotecall.jl:492
[21] remotecall_pool(rc_f::Function, f::Function, pool::Distributed.WorkerPool, args::Int64; kwargs::@Kwargs{})
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/workerpool.jl:126
[22] remotecall_pool
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/workerpool.jl:123 [inlined]
[23] remotecall_fetch
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/workerpool.jl:232 [inlined]
[24] #208
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/workerpool.jl:288 [inlined]
[25] (::Base.var"#1023#1028"{Distributed.var"#208#210"{Distributed.var"#208#209#211"{Distributed.WorkerPool, EquivalentCircuits.var"#177#178"{Int64, Int64, String, Int64, Float64, Nothing, Float64, Nothing, PyArray{ComplexF64, 1, true, true, ComplexF64}, PyArray{Float64, 1, true, true, Float64}}}}})(r::Base.RefValue{Any}, args::Tuple{Int64})
    @ Base ./asyncmap.jl:94
[26] (::Base.var"#1039#1040"{Base.var"#1023#1028"{Distributed.var"#208#210"{Distributed.var"#208#209#211"{Distributed.WorkerPool, EquivalentCircuits.var"#177#178"{Int64, Int64, String, Int64, Float64, Nothing, Float64, Nothing, PyArray{ComplexF64, 1, true, true, ComplexF64}, PyArray{Float64, 1, true, true, Float64}}}}}, Channel{Any}, Nothing})()
    @ Base ./asyncmap.jl:228
Stacktrace:
[1] (::Base.var"#1033#1035")(x::Task)
    @ Base ./asyncmap.jl:171
[2] foreach(f::Base.var"#1033#1035", itr::Vector{Any})
    @ Base ./abstractarray.jl:3094
[3] maptwice(wrapped_f::Function, chnl::Channel{Any}, worker_tasks::Vector{Any}, c::UnitRange{Int64})
    @ Base ./asyncmap.jl:171
[4] wrap_n_exec_twice
    @ ./asyncmap.jl:147 [inlined]
[5] #async_usemap#1018
    @ ./asyncmap.jl:97 [inlined]
[6] kwcall(::NamedTuple, ::typeof(Base.async_usemap), f::Any, c::Vararg{Any})
    @ Base ./asyncmap.jl:78 [inlined]
[7] #asyncmap#1017
    @ ./asyncmap.jl:75 [inlined]
[8] asyncmap
    @ ./asyncmap.jl:74 [inlined]
[9] pmap(f::Function, p::Distributed.WorkerPool, c::UnitRange{Int64}; distributed::Bool, batch_size::Int64, on_error::Nothing, retry_delays::Vector{Any}, retry_check::Nothing)
    @ Distributed ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/pmap.jl:126
[10] pmap
    @ ~/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/pmap.jl:99 [inlined]
[11] circuit_evolution_batch(measurements::PyArray{ComplexF64, 1, true, true, ComplexF64}, frequencies::PyArray{Float64, 1, true, true, Float64}; generations::Int64, population_size::Int64, terminals::String, head::Int64, cutoff::Float64, initial_population::Nothing, convergence_threshold::Float64, bounds::Nothing, numprocs::Int64, iters::Int64, quiet::Bool)
    @ EquivalentCircuits ~/Code/EquivalentCircuits.jl/src/CircuitEvolution.jl:381
[12] pyjlany_call(self::typeof(circuit_evolution_batch), args_::Py, kwargs_::Py)
    @ PythonCall ~/.julia/packages/PythonCall/wXfah/src/jlwrap/any.jl:34
[13] _pyjl_callmethod(f::Any, self_::Ptr{PythonCall.C.PyObject}, args_::Ptr{PythonCall.C.PyObject}, nargs::Int64)
    @ PythonCall ~/.julia/packages/PythonCall/wXfah/src/jlwrap/base.jl:69
[14] _pyjl_callmethod(o::Ptr{PythonCall.C.PyObject}, args::Ptr{PythonCall.C.PyObject})
    @ PythonCall.C ~/.julia/packages/PythonCall/wXfah/src/cpython/jlwrap.jl:47

Your system

Please provide detailed information about your system:

❯ pip list
Package                       Version      Editable project location
----------------------------- ------------ -------------------------
alabaster                     0.7.13
altair                        5.2.0
arviz                         0.17.0
asttokens                     2.4.1
attrs                         23.2.0
autoeis                       0.0.19       /home/amin/Code/AutoEIS
Babel                         2.13.0
beautifulsoup4                4.12.2
click                         8.1.7
comm                          0.2.1
contourpy                     1.2.0
cycler                        0.12.1
debugpy                       1.8.0
decorator                     5.1.1
dill                          0.3.8
docutils                      0.20.1
exceptiongroup                1.2.0
executing                     2.0.1
fonttools                     4.47.2
h5netcdf                      1.3.0
h5py                          3.10.0
imagesize                     1.4.1
impedance                     1.7.1
iniconfig                     2.0.0
ipykernel                     6.29.2
ipython                       8.20.0
ipywidgets                    8.1.1
jax                           0.4.23
jaxlib                        0.4.23
jedi                          0.19.1
Jinja2                        3.1.2
joblib                        1.3.2
jsonschema                    4.20.0
jsonschema-specifications     2023.12.1
julia                         0.6.2
juliacall                     0.9.15
juliapkg                      0.1.10
jupyter_client                8.6.0
jupyter_core                  5.7.1
jupyterlab-widgets            3.0.9
kiwisolver                    1.4.5
markdown-it-py                3.0.0
MarkupSafe                    2.1.3
matplotlib                    3.8.2
matplotlib-inline             0.1.6
mdurl                         0.1.2
ml-dtypes                     0.3.2
mpire                         2.9.0
multipledispatch              1.0.0
multiprocess                  0.70.16
nest-asyncio                  1.6.0
numpy                         1.26.3
numpyro                       0.13.2
opt-einsum                    3.3.0
packaging                     23.2
pandas                        2.1.4
parso                         0.8.3
pexpect                       4.9.0
pillow                        10.2.0
pip                           23.3.2
platformdirs                  4.2.0
pluggy                        1.4.0
prompt-toolkit                3.0.43
psutil                        5.9.8
ptyprocess                    0.7.0
pure-eval                     0.2.2
Pygments                      2.16.1
pyparsing                     3.1.1
pytest                        8.0.0
python-dateutil               2.8.2
pytz                          2023.3.post1
pyzmq                         25.1.2
referencing                   0.32.1
rich                          13.7.0
rpds-py                       0.17.1
ruff                          0.2.1
scikit-learn                  1.4.0
scipy                         1.11.4
seaborn                       0.13.1
semantic-version              2.10.0
setuptools                    69.0.3
six                           1.16.0
snowballstemmer               2.2.0
soupsieve                     2.5
Sphinx                        7.2.6
sphinx-basic-ng               1.0.0b2
sphinxcontrib-applehelp       1.0.7
sphinxcontrib-devhelp         1.0.5
sphinxcontrib-htmlhelp        2.0.4
sphinxcontrib-jsmath          1.0.1
sphinxcontrib-qthelp          1.0.6
sphinxcontrib-serializinghtml 1.1.9
stack-data                    0.6.3
threadpoolctl                 3.2.0
tomli                         2.0.1
toolz                         0.12.0
tornado                       6.4
tqdm                          4.66.1
traitlets                     5.14.1
typing_extensions             4.9.0
tzdata                        2023.4
wcwidth                       0.2.13
wheel                         0.42.0
widgetsnbextension            4.0.9
xarray                        2023.12.0
xarray-einstats               0.6.0

>>> juliacall.Pkg.status()
Status `~/mambaforge/envs/autoeis/julia_env/Project.toml`
  [da5bd070] EquivalentCircuits v0.3.1 `~/Code/EquivalentCircuits.jl`
  [6099a3de] PythonCall v0.9.15

cjdoris commented 4 months ago

This is now fixed on main. The underlying issue was that PyArray was not serializable. That said, unless you really do need it to be a PyArray, I'd recommend just converting it to Array, as you had done.
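
For readers wondering where the `cannot pickle 'PyCapsule' object` message comes from: it is raised by Python's `pickle` when it is handed a capsule object. The sketch below reproduces the same class of error in pure Python, without Julia. It uses `datetime.datetime_CAPI` purely as a convenient stand-in capsule (an assumption for illustration; it is unrelated to PythonCall's internals), and `try_pickle` is a hypothetical helper name.

```python
import datetime
import pickle

# Stand-in capsule: CPython exposes the datetime C API as a PyCapsule.
# Assumption: used only because it is easy to obtain; it has nothing
# to do with how PythonCall wraps numpy arrays.
capsule = datetime.datetime_CAPI


def try_pickle(obj):
    """Return the pickled bytes, or the TypeError that pickle raised."""
    try:
        return pickle.dumps(obj)
    except TypeError as err:
        return err


capsule_result = try_pickle(capsule)       # pickle refuses capsules
list_result = try_pickle([1.0, 2.0, 3.0])  # plain data pickles fine

print(type(capsule).__name__)
print(repr(capsule_result))
print(type(list_result).__name__)
```

Plain data round-trips through `pickle`, while the capsule does not, which is consistent with the advice to hand Julia workers a plain Array.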

ma-sadeghi commented 4 months ago

Great, thanks! Just a quick question: Do you see #424 being worked on in the foreseeable future? Thanks!

cjdoris commented 4 months ago

I have no particular plans to do it - there are other more pressing issues for PythonCall. But it would be quite simple if you want to give it a go.

ma-sadeghi commented 4 months ago

Sure, I'd be happy to take a crack at it. Any pointers on where to look or start?

cjdoris commented 4 months ago

Take a look at serialization.jl. Pretty much you'd need to add a setting to PythonCall, and if that setting is set you'd import dill instead of pickle.
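
As a rough illustration of the idea (not PythonCall's actual code, which lives in Julia in serialization.jl), the Python sketch below picks a serializer module by name. The names `ALLOWED_SERIALIZERS`, `load_serializer` and `roundtrip` are hypothetical; the only library fact relied on is that `dill` exposes `dumps`/`loads` with the same call shape as `pickle`. `dill` is imported lazily, so the sketch runs even where it is not installed, as long as the default is used.

```python
import importlib
import pickle

# Hypothetical allow-list: the only serializer modules this sketch accepts.
ALLOWED_SERIALIZERS = ("pickle", "dill")


def load_serializer(name="pickle"):
    """Import and return the serializer module selected by a setting."""
    if name not in ALLOWED_SERIALIZERS:
        raise ValueError(
            f"unknown serializer {name!r}; expected one of {ALLOWED_SERIALIZERS}"
        )
    # Lazy import: dill is only required if the setting asks for it.
    return importlib.import_module(name)


def roundtrip(obj, serializer_name="pickle"):
    """Serialize and deserialize obj with the chosen module."""
    serializer = load_serializer(serializer_name)
    return serializer.loads(serializer.dumps(obj))


print(roundtrip({"a": [1, 2, 3]}))
```

Switching the setting to "dill" would route the same `dumps`/`loads` calls through `dill`, which can serialize some objects (such as lambdas) that `pickle` rejects.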