spatial-model-editor / spatial-model-editor

Spatial bio-chemical reaction model editor and simulator
https://spatial-model-editor.github.io
MIT License

optimization dune error with threads > 1 #800

Open lkeegan opened 2 years ago

lkeegan commented 2 years ago

Describe the bug

With nthreads > 1, the first simulation on each thread works ok, but the second one crashes. nthreads=1 works.

Looks like a recurrence of a previous issue:

The problem seemed to be that the new dune sim would make use of some objects from the old dune sim (grid?) which were then de-allocated in the first dune sim destructor.

This was fixed by always resetting the simulator unique_ptr before creating a new dune sim, i.e.

    sim.reset();
    sim = std::make_unique<DuneSim>(...);

However, this doesn't work if multiple threads each hold their own DuneSim object.

Example stack trace

    #0 0x556f5fe184f1 in UG::D2::BNDP_Global(void**, double*) std_domain.cc
    #1 0x556f5fe04f4d in UG::D2::InsertBoundaryNode(UG::D2::grid*, void**) ugm.cc
    #2 0x556f5fe03d92 in UG::D2::InsertMesh(UG::D2::multigrid*, UG::D2::mesh*) ugm.cc
    #3 0x556f5fe0396a in UG::D2::CreateMultiGrid(char*, char*, char const*, int, int, std::shared_ptr<PPIF::PPIFContext>) ugm.cc
    #4 0x556f5fe15aff in UG::D2::NewCommand(int, char**, std::shared_ptr<PPIF::PPIFContext>) commands.cc
    #5 0x556f5fd8d993 in Dune::GridFactory<Dune::UGGrid<2>>::createGrid() uggridfactory.cc
    #6 0x556f5e7df15b in std::pair<std::vector<unsigned long, std::allocator<unsigned long>>, std::shared_ptr<Dune::UGGrid<2>>> sme::simulate::detail::makeHostGrid<Dune::UGGrid<2>>(sme::mesh::Mesh const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunegrid.hpp:88:32
    #7 0x556f5e7ddda1 in auto sme::simulate::makeDuneGrid<Dune::UGGrid<2>, Dune::mdgrid::DynamicSubDomainCountTraits<2, 1ul, Dune::mdgrid::AllCodims>>(sme::mesh::Mesh const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunegrid.hpp:126:34
    #8 0x556f5e7dd592 in sme::simulate::DuneImpl::DuneImpl(sme::simulate::DuneConverter const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunesim_impl.cpp:25:30
    #9 0x556f5e31fbee in sme::simulate::DuneImplIndependent<1>::DuneImplIndependent(sme::simulate::DuneConverter const&, sme::simulate::DuneOptions const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunesim_impl_independent.hpp:32:9
    #10 0x556f5e2c669f in std::_MakeUniq<sme::simulate::DuneImplIndependent<1>>::__single_object std::make_unique<sme::simulate::DuneImplIndependent<1>, sme::simulate::DuneConverter&, sme::simulate::DuneOptions const&>(sme::simulate::DuneConverter&, sme::simulate::DuneOptions const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:962:34
    #11 0x556f5e2c58d7 in sme::simulate::DuneSim::DuneSim(sme::model::Model const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, double, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, double>>> const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunesim.cpp:212:19
    #12 0x556f5e290dc7 in std::_MakeUniq<sme::simulate::DuneSim>::__single_object std::make_unique<sme::simulate::DuneSim, sme::model::Model&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>>&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, double, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, double>>>&>(sme::model::Model&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>>&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, double, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, double>>>&) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:962:34
    #13 0x556f5e294087 in sme::simulate::Simulation::Simulation(sme::model::Model&) /home/liam/SSC/spatial-model-editor/core/simulate/src/simulate.cpp:214:9
    #14 0x556f5e264f3e in sme::simulate::PagmoUDP::fitness(std::vector<double, std::allocator<double>> const&) const /home/liam/SSC/spatial-model-editor/core/simulate/src/optimize_impl.cpp:96:29
    #15 0x556f5e25573b in pagmo::detail::prob_inner<sme::simulate::PagmoUDP>::fitness(std::vector<double, std::allocator<double>> const&) const /opt/smelibs/include/pagmo/problem.hpp:525:24
    #16 0x556f60776e1f in pagmo::problem::fitness(std::vector<double, std::allocator<double>> const&) const problem.cpp
    #17 0x556f607bb1ce in pagmo::pso::evolve(pagmo::population) const pso.cpp
    #18 0x556f5e237e45 in pagmo::detail::algo_inner<pagmo::pso>::evolve(pagmo::population const&) const /opt/smelibs/include/pagmo/algorithm.hpp:222:24
    #19 0x556f6076afbc in pagmo::algorithm::evolve(pagmo::population const&) const algorithm.cpp
    #20 0x556f6081780e in pagmo::thread_island::run_evolve(pagmo::island&) const::$_0::operator()() const thread_island.cpp
    #21 0x556f60817b17 in tbb::detail::d1::function_stack_task<pagmo::thread_island::run_evolve(pagmo::island&) const::$_1>::execute(tbb::detail::d1::execution_data&) thread_island.cpp
    #22 0x556f5fe51c1a in tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter>(tbb::detail::d1::task*, tbb::detail::r1::external_waiter&) task_dispatcher.cpp
    #23 0x556f5fe4fe43 in tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task*, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&) task_dispatcher.cpp
    #24 0x556f6081759e in pagmo::thread_island::run_evolve(pagmo::island&) const thread_island.cpp
    #25 0x556f607821f1 in std::enable_if<is_invocable_r_v<void, pagmo::island::evolve(unsigned int)::$_0&>, void>::type std::__invoke_r<void, pagmo::island::evolve(unsigned int)::$_0&>(pagmo::island::evolve(unsigned int)::$_0&) island.cpp
    #26 0x556f607805f0 in std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<pagmo::island::evolve(unsigned int)::$_0, std::allocator<int>, void ()>::_M_run()::'lambda'(), void>>::_M_invoke(std::_Any_data const&) island.cpp
    #27 0x556f5d7b8685 in std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:590:9
    #28 0x556f5d7b7f4f in std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/future:571:27
    #29 0x7f6fdafbcf67 in __pthread_once_slow nptl/./nptl/pthread_once.c:116:7
    #30 0x556f60780330 in std::__future_base::_Task_state<pagmo::island::evolve(unsigned int)::$_0, std::allocator<int>, void ()>::_M_run() island.cpp
    #31 0x556f60840d8c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<pagmo::detail::task_queue::task_queue()::$_0>>>::_M_run() task_queue.cpp
    #32 0x7f6fdb2492c2  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc2c2) (BuildId: 725ef5da52ee6d881f9024d8238a989903932637)
    #33 0x7f6fdafb7b42 in start_thread nptl/./nptl/pthread_create.c:442:8
    #34 0x7f6fdb0499ff  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

To Reproduce

Start optimization with dune and threads > 1

lkeegan commented 2 years ago

    ERROR in newformat: failed creating the format
    ERROR in ConfigureCommand: cannot read BndValProblem specification
    ERROR in CreateMultiGrid: format not found
    ERROR in new: could not create multigrid
    [E][   dunesim.cpp: 227] DuneSim :: GridError [createGrid:/tmp/setup-dune/dune-grid/dune/grid/uggrid/uggridfactory.cc:458]: Calling UG::2d::ConfigureCommand failed!
    [E][   dunesim.cpp: 227] DuneSim :: GridError [createGrid:/tmp/setup-dune/dune-grid/dune/grid/uggrid/uggridfactory.cc:476]: UGGrid<2>::makeNewMultigrid failed!

lkeegan commented 2 years ago

    ERROR in CreateMultiGrid: BVP not found
    ERROR in new: could not create multigrid

lkeegan commented 2 years ago

Can reproduce in a single thread with:

    auto m{getExampleModel(Mod::ABtoC)};
    std::vector<std::string> comps{"comp"};
    std::unique_ptr<simulate::DuneSim> s;
    s = std::make_unique<simulate::DuneSim>(m, comps);
    s = std::make_unique<simulate::DuneSim>(m, comps);
    // this line segfaults:
    s = std::make_unique<simulate::DuneSim>(m, comps);

Trace:

    ==990160==The signal is caused by a READ memory access.
    #0 0x559dce0b11d0 in UG::D2::BNDS_Dispose(UG::HEAP*, void**) std_domain.cc
    #1 0x559dce09a6d8 in UG::D2::DisposeElement(UG::D2::grid*, UG::D2::element*, int) ugm.cc
    #2 0x559dce09cf58 in UG::D2::DisposeGrid(UG::D2::grid*) ugm.cc
    #3 0x559dce09bd5c in UG::D2::DisposeMultiGrid(UG::D2::multigrid*) ugm.cc
    #4 0x559dce0addb8 in UG::D2::NewCommand(int, char**, std::shared_ptr<PPIF::PPIFContext>) commands.cc
    #5 0x559dce025ba3 in Dune::GridFactory<Dune::UGGrid<2>>::createGrid() uggridfactory.cc
    #6 0x559dcc77a0bb in std::pair<std::vector<unsigned long, std::allocator<unsigned long>>, std::unique_ptr<Dune::UGGrid<2>, std::default_delete<Dune::UGGrid<2>>>> sme::simulate::detail::makeHostGrid<Dune::UGGrid<2>>(sme::mesh::Mesh const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunegrid.hpp:88:32
    #7 0x559dcc775afe in auto sme::simulate::makeDuneGrid<Dune::UGGrid<2>, Dune::mdgrid::DynamicSubDomainCountTraits<2, 1ul, Dune::mdgrid::AllCodims>>(sme::mesh::Mesh const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunegrid.hpp:126:34
    #8 0x559dcdf08287 in sme::simulate::DuneImpl::DuneImpl(sme::simulate::DuneConverter const&) /home/liam/SSC/spatial-model-editor/core/simulate/src/dunesim_impl.cpp:25:30

SoilRos commented 2 years ago

I am actually not sure if it's possible to have different instances of UGGrid at the same time. What I do know is that one grid can be traversed by multiple threads. In fact, the next release of dune-copasi will be multi-threaded in the assembly part.