ICB-DCM / pyABC

distributed, likelihood-free inference
https://pyabc.rtfd.io
BSD 3-Clause "New" or "Revised" License

PyJulia example works on single-core but not multi-core #580

Closed shenkev closed 2 years ago

shenkev commented 2 years ago

I'm following the example: https://pyabc.readthedocs.io/en/latest/examples/using_julia.html?highlight=julia

I've put the code from the example in a Python file, test.py, shown below. SIR.jl is exactly the same as in the example. Running test.py with SingleCoreSampler works, but MulticoreEvalParallelSampler causes a crash (see the log at the bottom of this issue).

Why is this?

I can successfully:

  1. Run MulticoreEvalParallelSampler using a Python model instead of a Julia one
  2. Run SingleCoreSampler on a Julia model

Note: I had compatibility issues between Julia and Python and used the workaround recommended on the PyJulia webpage:

from julia.api import Julia
jl = Julia(compiled_modules=False)

Could this be the reason?

import tempfile

import matplotlib.pyplot as plt

import pyabc
from pyabc import ABCSMC, RV, Distribution, MulticoreEvalParallelSampler, SingleCoreSampler
from julia.api import Julia
jl = Julia(compiled_modules=False)
from pyabc.external.julia import Julia

pyabc.settings.set_figure_params('pyabc')  # for beautified plots

jl = Julia(module_name="SIR", source_file="SIR.jl")

model = jl.model()
distance = jl.distance()
obs = jl.observation()

gt_par = {"p1": -4.0, "p2": -2.0}

par_limits = {
    "p1": (-5, -3),
    "p2": (-3, -1),
}
prior = Distribution(
    **{key: RV("uniform", lb, ub - lb) for key, (lb, ub) in par_limits.items()}
)

abc = ABCSMC(
    model,
    prior,
    distance,
    sampler=MulticoreEvalParallelSampler(),
)
db = tempfile.mkstemp(suffix=".db")[1]
abc.new("sqlite:///" + db, obs)
h = abc.run(max_nr_populations=10)

The full error log is attached as error.txt.

A shortened version follows:

jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_add_to_ee at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:1059
convert at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/conversions.jl:835
julia_args at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/callback.jl:18 [inlined]
_pyjlwrap_call at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/callback.jl:24
unknown function (ip: 0x7f36e6d0269c)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
pyjlwrap_call at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/callback.jl:44
unknown function (ip: 0x7f36e6cfdbd0)
jl_add_to_ee at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:1103
jl_add_to_ee at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:1125 [inlined]
_jl_compile_codeinst at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:154
jl_generate_fptr at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:350
_PyObject_MakeTpCall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:159
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:125 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3469
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
function_code_fastcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:284 [inlined]
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:411
_PyObject_FastCallDict at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:96
_ZN4llvm3orc25InProgressFullLookupState8completeESt10unique_ptrINS0_21InProgressLookupStateESt14default_deleteIS3_EE at /home/tingkeshenlocal/Projects/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm3orc16ExecutionSession19OL_applyQueryPhase1ESt10unique_ptrINS0_21InProgressLookupStateESt14default_deleteIS3_EENS_5ErrorE at /home/tingkeshenlocal/Projects/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
jl_compile_method_internal at /buildworker/worker/package_linux64/build/src/gf.c:1980
_PyObject_Call_Prepend at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:888 [inlined]
slot_tp_call at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/typeobject.c:6556
_ZN4llvm3orc16ExecutionSession6lookupENS0_10LookupKindERKSt6vectorISt4pairIPNS0_8JITDylibENS0_19JITDylibLookupFlagsEESaIS8_EENS0_15SymbolLookupSetENS0_11SymbolStateENS_15unique_functionIFvNS_8ExpectedINS_8DenseMapINS0_15SymbolStringPtrENS_18JITEvaluatedSymbolENS_12DenseMapInfoISI_EENS_6detail12DenseMapPairISI_SJ_EEEEEEEEESt8functionIFvRKNSH_IS6_NS_8DenseSetISI_SL_EENSK_IS6_EENSN_IS6_SV_EEEEEE at /home/tingkeshenlocal/Projects/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_PyObject_MakeTpCall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:159
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:125 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3469
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
function_code_fastcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:284 [inlined]
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:411
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3486
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
function_code_fastcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:284 [inlined]
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:411
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3486
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
_PyEval_EvalCodeWithName at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4298
_ZN4llvm3orc16ExecutionSession6lookupERKSt6vectorISt4pairIPNS0_8JITDylibENS0_19JITDylibLookupFlagsEESaIS7_EERKNS0_15SymbolLookupSetENS0_10LookupKindENS0_11SymbolStateESt8functionIFvRKNS_8DenseMapIS5_NS_8DenseSetINS0_15SymbolStringPtrENS_12DenseMapInfoISK_EEEENSL_IS5_EENS_6detail12DenseMapPairIS5_SN_EEEEEE at /home/tingkeshenlocal/Projects/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:436
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3500
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
function_code_fastcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:284 [inlined]
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:411
PyVectorcall_Call at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:200 [inlined]
PyObject_Call at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:228
do_call_core at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:5010 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3559
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
function_code_fastcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:284 [inlined]
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:411
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3486
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
_PyEval_EvalCodeWithName at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4298
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:436
_ZN4llvm3orc16ExecutionSession6lookupERKSt6vectorISt4pairIPNS0_8JITDylibENS0_19JITDylibLookupFlagsEESaIS7_EENS0_15SymbolStringPtrENS0_11SymbolStateE at /home/tingkeshenlocal/Projects/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm3orc16ExecutionSession6lookupENS_8ArrayRefIPNS0_8JITDylibEEENS0_15SymbolStringPtrENS0_11SymbolStateE at /home/tingkeshenlocal/Projects/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm3orc16ExecutionSession6lookupENS_8ArrayRefIPNS0_8JITDylibEEENS_9StringRefENS0_11SymbolStateE at /home/tingkeshenlocal/Projects/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
method_vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/classobject.c:60
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3515
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
function_code_fastcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:284 [inlined]
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:411
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3486
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
function_code_fastcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:284 [inlined]
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:411
jl_compile_method_internal at /buildworker/worker/package_linux64/build/src/gf.c:2246 [inlined]
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2239 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
convert at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/conversions.jl:835
julia_args at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/callback.jl:18 [inlined]
_pyjlwrap_call at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/callback.jl:24
unknown function (ip: 0x7f36e6d0269c)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
pyjlwrap_call at /home/tingkeshenlocal/.julia/packages/PyCall/7a7w0/src/callback.jl:44
unknown function (ip: 0x7f36e6cfdbd0)
_PyObject_FastCallDict at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:96 [inlined]
_PyObject_Call_Prepend at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:888 [inlined]
slot_tp_init at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/typeobject.c:6790
type_call at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/typeobject.c:994 [inlined]
_PyObject_MakeTpCall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:159
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:125 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]

PyVectorcall_Call at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:200 [inlined]
PyObject_Call at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:228
do_call_core at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:5010 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3559
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
_PyEval_EvalCodeWithName at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4298
_PyFunction_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/call.c:436
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
method_vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Objects/classobject.c:60
_PyObject_Vectorcall at /opt/conda/conda-bld/python-split_1648465063888/work/Include/cpython/abstract.h:127 [inlined]
call_function at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4963 [inlined]
_PyEval_EvalFrameDefault at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:3515
PyEval_EvalFrameEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:741 [inlined]
_PyEval_EvalCodeWithName at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4298
PyEval_EvalCodeEx at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:4327 [inlined]
PyEval_EvalCode at /opt/conda/conda-bld/python-split_1648465063888/work/Python/ceval.c:718
run_eval_code_obj at /opt/conda/conda-bld/python-split_1648465063888/work/Python/pythonrun.c:1166
run_mod at /opt/conda/conda-bld/python-split_1648465063888/work/Python/pythonrun.c:1188
pyrun_file at /opt/conda/conda-bld/python-split_1648465063888/work/Python/pythonrun.c:1085
pyrun_simple_file at /opt/conda/conda-bld/python-split_1648465063888/work/Python/pythonrun.c:439 [inlined]
PyRun_SimpleFileExFlags at /opt/conda/conda-bld/python-split_1648465063888/work/Python/pythonrun.c:472
pymain_run_file at /opt/conda/conda-bld/python-split_1648465063888/work/Modules/main.c:391 [inlined]
pymain_run_python at /opt/conda/conda-bld/python-split_1648465063888/work/Modules/main.c:616 [inlined]
Py_RunMain at /opt/conda/conda-bld/python-split_1648465063888/work/Modules/main.c:695
Py_BytesMain at /opt/conda/conda-bld/python-split_1648465063888/work/Modules/main.c:1127
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /home/tingkeshenlocal/miniconda3/envs/py38/bin/python (unknown line)
Allocations: 385777276 (Pool: 385661572; Big: 115704); GC: 319
ABC.History INFO: Done <ABCSMC id=1, duration=0:00:05.469356, end_time=2022-06-24 15:16:12>
Traceback (most recent call last):
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/sampler/multicorebase.py", line 103, in get_if_worker_healthy
    item = queue.get(True, 5)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/multiprocessing/queues.py", line 108, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tingkeshenlocal/Projects/risk-aversive-exploration/abc/temp.py", line 38, in <module>
    h = abc.run(max_nr_populations=10)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 62, in wrapped_run
    ret = run(self, *args, **kwargs)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 685, in run
    t0: int = self.initialize_components_before_run(
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 766, in initialize_components_before_run
    self._initialize_dist_eps_acc(t0)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 492, in _initialize_dist_eps_acc
    self.eps.initialize(
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/epsilon/epsilon.py", line 152, in initialize
    weighted_distances = get_weighted_distances()
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 458, in get_initial_weighted_distances
    population = _get_initial_population_with_distances()
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 449, in _get_initial_population_with_distances
    population = self._get_initial_population(t - 1)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 523, in _get_initial_population
    population = self._sample_from_prior(t)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/inference/smc.py", line 551, in _sample_from_prior
    sample = self.sampler.sample_until_n_accepted(
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/sampler/base.py", line 20, in sample_until_n_accepted
    sample = f(self, n, simulate_one, t, **kwargs)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/sampler/multicore_evaluation_parallel.py", line 142, in sample_until_n_accepted
    val = get_if_worker_healthy(processes, queue)
  File "/home/tingkeshenlocal/miniconda3/envs/py38/lib/python3.8/site-packages/pyabc/sampler/multicorebase.py", line 107, in get_if_worker_healthy
    raise ProcessError("At least one worker is dead.")
multiprocessing.context.ProcessError: At least one worker is dead.
yannikschaelte commented 2 years ago

Hi @shenkev, sorry for the delay. As described in the example notebook, you need to call all Julia functions once before the main pyABC routine, so that the necessary pre-compilation is performed; otherwise problems will occur when using multiprocessing. This can be done, e.g., by inserting a line distance(model(gt_par), model(gt_par)) prior to the main pyABC routine.

With that addition, your code runs fine for me.
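
For readers who want to reuse the warm-up step, here is a minimal sketch of it as a helper function. The name warm_up is hypothetical and not part of pyABC; it only wraps the single call distance(model(gt_par), model(gt_par)) suggested above. The commented usage assumes the names model, distance and gt_par from the script in the original post.

```python
from typing import Any, Callable, Mapping


def warm_up(
    model: Callable[[Mapping[str, float]], Any],
    distance: Callable[[Any, Any], Any],
    par: Mapping[str, float],
) -> Any:
    """Call model and distance once in the main process.

    Hypothetical helper (not a pyABC function). For Julia-backed
    callables, this one-off call is meant to trigger the necessary
    pre-compilation before a multiprocessing sampler is used.
    """
    # Same expression as the suggested fix: two simulations, one distance.
    return distance(model(par), model(par))


# Usage in the script from the original post (assumed names), placed
# after `distance = jl.distance()` and before `abc = ABCSMC(...)`:
#
#     warm_up(model, distance, gt_par)
#     abc = ABCSMC(model, prior, distance,
#                  sampler=MulticoreEvalParallelSampler())
```

The helper is deliberately independent of pyABC and Julia, so it can be checked with plain Python callables; whether it resolves the crash on a given setup still depends on the Julia and PyJulia installation.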

yannikschaelte commented 2 years ago

Closing due to non-response. Feel free to re-open if the issue persists.