matthiaskoenig / sbmlsim

sbmlsim: SBML simulation made easy
GNU Lesser General Public License v3.0

libroadrunner==2.2.0 results in segmentation faults on almost all my models/workflows #146

Open matthiaskoenig opened 2 years ago

matthiaskoenig commented 2 years ago

I am not sure what changed around mid-December, but the latest libroadrunner==2.2.0 and libroadrunner-experimental==2.2.0 result in segmentation faults and dying Linux kernels in most of my workflows/simulations.

For example, I get output such as:

*** SIGSEGV received at time=1644920070 on cpu 10 ***
PC: @     0x7fcad4f2851e  (unknown)  std::default_delete<>::operator()()
    @     0x7fcb388d93c0  1075403472  (unknown)
    @     0x7fcad4f26bde         64  std::unique_ptr<>::~unique_ptr()
    @     0x7fcad4f6c636         32  rrllvm::Jit::~Jit()
    @     0x7fcad4f80250         32  rrllvm::MCJit::~MCJit()
    @     0x7fcad4f8026c         32  rrllvm::MCJit::~MCJit()
    @     0x7fcad4efcb16         32  std::default_delete<>::operator()()
    @     0x7fcad4efc2be         64  std::unique_ptr<>::~unique_ptr()
    @     0x7fcad4f45f90        848  rrllvm::ModelResources::~ModelResources()
    @     0x7fcad4edc728         48  std::_Sp_counted_ptr<>::_M_dispose()
    @     0x7fcad4dd8c2d        128  std::_Sp_counted_base<>::_M_release()
    @     0x7fcad4dcf3fb         32  std::__shared_count<>::~__shared_count()
    @     0x7fcad4ed605c         32  std::__shared_ptr<>::~__shared_ptr()
    @     0x7fcad4ed6078         32  std::shared_ptr<>::~shared_ptr()
    @     0x7fcad4ec7f23        432  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
    @     0x7fcad4ec7f86         32  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
    @     0x7fcad4e4bc9f         32  std::default_delete<>::operator()()
    @     0x7fcad4e4810a         64  std::unique_ptr<>::~unique_ptr()
    @     0x7fcad4e46453        448  rr::RoadRunnerImpl::~RoadRunnerImpl()
    @     0x7fcad4e15dce         48  rr::RoadRunner::~RoadRunner()
    @     0x7fcad4e15dfa         32  rr::RoadRunner::~RoadRunner()
    @     0x7fcad4d7c962        112  _wrap_delete_RoadRunner
    @     0x7fcad4d498c0        144  SwigPyObject_dealloc
    @           0x532b95  (unknown)  (unknown)
    @           0x8feca0  (unknown)  (unknown)
[2022-02-15 11:14:30,802 E 2063541 2063541] logging.cc:317: *** SIGSEGV received at time=1644920070 on cpu 10 ***
[2022-02-15 11:14:30,802 E 2063541 2063541] logging.cc:317: PC: @     0x7fcad4f2851e  (unknown)  std::default_delete<>::operator()()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcb388d93c0  1075403472  (unknown)
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f26bde         64  std::unique_ptr<>::~unique_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f6c636         32  rrllvm::Jit::~Jit()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f80250         32  rrllvm::MCJit::~MCJit()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f8026c         32  rrllvm::MCJit::~MCJit()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4efcb16         32  std::default_delete<>::operator()()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4efc2be         64  std::unique_ptr<>::~unique_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f45f90        848  rrllvm::ModelResources::~ModelResources()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4edc728         48  std::_Sp_counted_ptr<>::_M_dispose()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4dd8c2d        128  std::_Sp_counted_base<>::_M_release()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4dcf3fb         32  std::__shared_count<>::~__shared_count()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ed605c         32  std::__shared_ptr<>::~__shared_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ed6078         32  std::shared_ptr<>::~shared_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ec7f23        432  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ec7f86         32  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e4bc9f         32  std::default_delete<>::operator()()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e4810a         64  std::unique_ptr<>::~unique_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e46453        448  rr::RoadRunnerImpl::~RoadRunnerImpl()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e15dce         48  rr::RoadRunner::~RoadRunner()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e15dfa         32  rr::RoadRunner::~RoadRunner()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4d7c962        112  _wrap_delete_RoadRunner
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4d498c0        144  SwigPyObject_dealloc
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @           0x532b95  (unknown)  (unknown)
[2022-02-15 11:14:30,804 E 2063541 2063541] logging.cc:317:     @           0x8feca0  (unknown)  (unknown)
Fatal Python error: Segmentation fault

I am pretty sure this is caused by code related to https://github.com/sys-bio/roadrunner/issues/925 (merged at the beginning of January), i.e. the internal parallelization. Please, please provide a roadrunner without any internal parallelization, i.e. a single Python thread on a single core! The internal parallelization will create issues in any multiprocessing on clusters. The current libroadrunner==2.2.0 is not working for me at all. This is a big issue, because it currently breaks the scripts of all my students.

See also https://github.com/sys-bio/roadrunner/issues/963
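The trace above ends in `SwigPyObject_dealloc` and `rr::RoadRunner::~RoadRunner()`, i.e. the crash happens when a `RoadRunner` object is destroyed. A minimal sketch for checking whether a given installation is affected, without taking down the calling process, is to load and delete a model in a fresh interpreter and inspect the exit status. The helper name `run_isolated` and the model path `model.xml` are hypothetical; the snippet assumes that `roadrunner.RoadRunner(path)` loads an SBML file and that the code runs on Linux, where a negative return code means the child was killed by a signal. Whether a single load-and-delete is enough to trigger the fault is also an assumption.

```python
import signal
import subprocess
import sys

# Code executed in the child interpreter. Assumption: constructing and then
# deleting a RoadRunner instance is enough to reach the failing destructor.
SNIPPET = """
import sys
import roadrunner
r = roadrunner.RoadRunner(sys.argv[1])
del r
"""


def run_isolated(code, *args):
    """Run `code` in a fresh interpreter and return (returncode, signal_name).

    signal_name is None unless the child was terminated by a signal
    (POSIX only: subprocess reports this as a negative return code).
    """
    proc = subprocess.run(
        [sys.executable, "-c", code, *args],
        capture_output=True,
        text=True,
    )
    rc = proc.returncode
    sig = None
    if rc < 0:
        try:
            sig = signal.Signals(-rc).name
        except ValueError:
            sig = f"signal {-rc}"
    return rc, sig


if __name__ == "__main__":
    # "model.xml" is a placeholder for one of the affected SBML models.
    rc, sig = run_isolated(SNIPPET, "model.xml")
    if sig is not None:
        print(f"child interpreter was killed by {sig}")
    else:
        print(f"child interpreter exited with code {rc}")
```

Running the check in a subprocess keeps the parent interpreter (or notebook kernel) alive even when the child segfaults, so the same helper can be looped over many models to see which ones are affected.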