Open RandomDefaultUser opened 1 year ago
Sounds interesting, I'll have a look
Hey Lenz @RandomDefaultUser,
this file is not part of the default sample data, right? If so, can you share it with me somehow, please?
# Trigger LAMMPS by performing inference on an atomic snapshot.
parameters, network, data_handler, predictor = mala.Predictor.\
load_run("be_model", path="basic")
I replaced that line with something else from the test that looked similar, resulting in:
#!/usr/bin/env python
import os
from ase.io import read
import mala
from mala.datahandling.data_repo import data_repo_path
data_path = os.path.join(data_repo_path, "Be2")
# Trigger LAMMPS by performing inference on an atomic snapshot.
parameters, network, data_handler, predictor = mala.Predictor.load_run(
"workflow_test", path=os.path.join(data_repo_path, "workflow_test")
)
parameters.targets.target_type = "LDOS"
parameters.targets.ldos_gridsize = 11
parameters.targets.ldos_gridspacing_ev = 2.5
parameters.targets.ldos_gridoffset_ev = -5
parameters.running.inference_data_grid = [18, 18, 27]
parameters.descriptors.descriptor_type = "Bispectrum"
parameters.descriptors.bispectrum_twojmax = 10
parameters.descriptors.bispectrum_cutoff = 4.67637
parameters.targets.pseudopotential_path = data_path
predicted_ldos = predictor. \
predict_from_qeout(os.path.join(data_path,
"Be_snapshot3.out"))
ldos_calculator: mala.LDOS
ldos_calculator = data_handler.target_calculator
ldos_calculator. \
read_additional_calculation_data(os.path.join(data_path,
"Be_snapshot3.out"),
"espresso-out")
ldos_calculator.read_from_array(predicted_ldos)
# total_energy_traditional = ldos_calculator.total_energy
# parameters.descriptors.use_atomic_density_energy_formula = True
ldos_calculator.read_from_array(predicted_ldos)
# Test OpenPMD.
params = mala.Parameters()
ldos_calculator = mala.LDOS.from_numpy_file(
params, os.path.join(data_path, "Be_snapshot1.out.npy")
)
ldos_calculator.read_additional_calculation_data(
os.path.join(data_path, "Be_snapshot1.out"), "espresso-out"
)
# Write and then read in via OpenPMD and make sure all the info is
# retained.
ldos_calculator.write_to_openpmd_file(
"test_openpmd.h5", ldos_calculator.local_density_of_states
)
This runs without problems for me.
A bug like this might depend on the specific setup that you are using, can you please tell me:
What's a bit weird: According to your backtrace, the error occurs very early during construction of the Series object, before any IO access is made. Apparently, the error occurs inside the C++ standard library during compilation of a regex that we use for parsing:
/home/fiedlerl/.local/lib/python3.10/site-packages/openpmd_api/openpmd_api_cxx.cpython-310-x86_64-linux-gnu.so(std::__detail::_Compiler<std::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syntax_option_type)+0x735)[0x7f51a0852805]
Something odd seems to be going on in the linker: for some reason, the openPMD shared library resolves C++ STL symbols in the Lammps shared library. When compiling a GPU-aware Lammps, this is likely to lead to ABI incompatibilities.
#0 0x00007ffe10eb0cd0 in __cxa_throw () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#1 0x00007ffe0f7e3365 in __cxa_bad_cast () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#2 0x00007ffe10edbf00 in std::__cxx11::collate<char> const& std::use_facet<std::__cxx11::collate<char> >(std::locale const&) () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#3 0x00007ffe10e7f97c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::__cxx11::regex_traits<char>::transform<char*>(char*, char*) const () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#4 0x00007ffe10e7f718 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::__cxx11::regex_traits<char>::transform_primary<char const*>(char const*, char const*) const ()
from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#5 0x00007ffe10e7f58f in std::__detail::_BracketMatcher<std::__cxx11::regex_traits<char>, false, false>::_M_apply(char, std::integral_constant<bool, false>) const::{lambda()#1}::operator()() const () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#6 0x00007ffe10e7eb1b in std::__detail::_BracketMatcher<std::__cxx11::regex_traits<char>, false, false>::_M_ready() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#7 0x00007ffe10e8216d in void std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_insert_bracket_matcher<false, false>(bool) () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#8 0x00007ffe10e7df78 in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_bracket_expression() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#9 0x00007ffe10e7a756 in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_atom() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#10 0x00007ffe10e79b0b in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#11 0x00007ffe10e79bba in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#12 0x00007ffe10e779f4 in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#13 0x00007ffe10e7a9fc in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_atom() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#14 0x00007ffe10e79b0b in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#15 0x00007ffe10e79bba in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#16 0x00007ffe10e79bba in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#17 0x00007ffe10e779f4 in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction() () from /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so
#18 0x00007ffe0a15e9dd in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syntax_option_type) ()
from /nix/store/f0vy6p4m96j21s9fg2ywd28d5d3wdini-python3.10-openPMD-api-0.15.1/lib/python3.10/site-packages/openpmd_api/openpmd_api_cxx.cpython-310-x86_64-linux-gnu.so
#19 0x00007ffe0a133f7f in openPMD::Series::parseInput(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) ()
from /nix/store/f0vy6p4m96j21s9fg2ywd28d5d3wdini-python3.10-openPMD-api-0.15.1/lib/python3.10/site-packages/openpmd_api/openpmd_api_cxx.cpython-310-x86_64-linux-gnu.so
#20 0x00007ffe0a13da44 in openPMD::Series::Series(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, openPMD::Access, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
from /nix/store/f0vy6p4m96j21s9fg2ywd28d5d3wdini-python3.10-openPMD-api-0.15.1/lib/python3.10/site-packages/openpmd_api/openpmd_api_cxx.cpython-310-x86_64-linux-gnu.so
#21 0x00007ffe0a0d03b8 in ?? () from /nix/store/f0vy6p4m96j21s9fg2ywd28d5d3wdini-python3.10-openPMD-api-0.15.1/lib/python3.10/site-packages/openpmd_api/openpmd_api_cxx.cpython-310-x86_64-linux-gnu.so
#22 0x00007ffe09fdd5b0 in ?? () from /nix/store/f0vy6p4m96j21s9fg2ywd28d5d3wdini-python3.10-openPMD-api-0.15.1/lib/python3.10/site-packages/openpmd_api/openpmd_api_cxx.cpython-310-x86_64-linux-gnu.so
#23 0x00007ffff7cf3f63 in cfunction_call () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#24 0x00007ffff7c87c84 in _PyObject_MakeTpCall () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#25 0x00007ffff7ce9d12 in method_vectorcall () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#26 0x00007ffff7c99a08 in PyVectorcall_Call () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#27 0x00007ffff7d391c2 in slot_tp_init () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#28 0x00007ffff7cea1c7 in type_call () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#29 0x00007ffe2b7f8fb7 in pybind11_meta_call () from /nix/store/15nzi2f67rg8nbxlgdws68kcjyqgnhlg-python3.10-torch-1.12.1/lib/python3.10/site-packages/torch/lib/libtorch_python.so
#30 0x00007ffff7c87c84 in _PyObject_MakeTpCall () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#31 0x00007ffff7c39f69 in _PyEval_EvalFrameDefault () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#32 0x00007ffff7db327f in _PyEval_Vector () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#33 0x00007ffff7ce9cd8 in method_vectorcall () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#34 0x00007ffff7c38344 in _PyEval_EvalFrameDefault () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#35 0x00007ffff7db327f in _PyEval_Vector () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#36 0x00007ffff7c395b9 in _PyEval_EvalFrameDefault () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#37 0x00007ffff7db327f in _PyEval_Vector () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#38 0x00007ffff7db38e8 in PyEval_EvalCode () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#39 0x00007ffff7e3aa9d in run_mod () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#40 0x00007ffff7e47572 in _PyRun_SimpleFileObject () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#41 0x00007ffff7e47b4b in _PyRun_AnyFileObject () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#42 0x00007ffff7e4bd0f in Py_RunMain () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#43 0x00007ffff7e4c535 in Py_BytesMain () from /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
#44 0x00007ffff78af24e in __libc_start_call_main () from /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libc.so.6
I honestly have no idea how this even happens. For now, a workaround is to add import openpmd_api at the start of the file, so the linker knows about openPMD from the start. I'll try to figure out how this happened.
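A minimal sketch of that workaround, assuming the crash depends only on which shared library gets loaded first. The `preload` helper is hypothetical (it is not part of MALA or openPMD-api); it only makes the import order explicit and tolerates a missing package.

```python
import importlib


def preload(module_name: str) -> bool:
    """Import `module_name` early; return True if it could be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Load openPMD first, so its shared library is already in the process
# before anything pulls in liblammps.so.
openpmd_loaded = preload("openpmd_api")

# The rest of the script would follow here, e.g.:
# import mala
```

If `openpmd_loaded` is False, the package is not installed in the active environment and the workaround has no effect.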
Thanks for the investigation! I am glad that the error is reproducible, that helps a lot. At least now we know where to look...
Hi, is it possible that some components (lammps, openPMD-api) are not built with the same compilers / stdlibs?
I see that lammps.so was built with Nix, but where did openPMD-api come from? Can you try building both with the same toolchain?
I suspect that something in the lammps build exposes or overwrites symbols of the stdlib or some other incompatibility in build toolchains is going on.
Thank you for looking at this, Axel!
I built both openPMD and Lammps with Nix and their dependencies should be compatible.
The dynamically linked dependencies are:
> ldd /nix/store/f0vy6p4m96j21s9fg2ywd28d5d3wdini-python3.10-openPMD-api-0.15.1/lib/python3.10/site-packages/openpmd_api/openpmd_api_cxx.cpython-310-x86_64-linux-gnu.so | sort
/nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib64/ld-linux-x86-64.so.2 (0x00007ffff7fc6000)
libadios2_atl.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_atl.so.2 (0x00007ffff5855000)
libadios2_core.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_core.so.2 (0x00007ffff63ae000)
libadios2_core_mpi.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_core_mpi.so.2 (0x00007ffff6b5f000)
libadios2_cxx11.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_cxx11.so.2 (0x00007ffff7434000)
libadios2_cxx11_mpi.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_cxx11_mpi.so.2 (0x00007ffff75c2000)
libadios2_dill.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_dill.so.2 (0x00007ffff5100000)
libadios2_evpath.so => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_evpath.so (0x00007ffff58cd000)
libadios2_ffs.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_ffs.so.2 (0x00007ffff5864000)
libadios2_perfstubs.so => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_perfstubs.so (0x00007ffff5b01000)
libatomic.so.1 => /nix/store/b13h86pg7lbf6vpc1vwzw6akmakyw1bs-gcc-11.3.0-lib/lib/libatomic.so.1 (0x00007ffff5153000)
libbfd-2.39.so => /nix/store/7c8vx9wngib658cfx5pnnfi370a37ppm-libbfd-2.39/lib/libbfd-2.39.so (0x00007ffff5160000)
libblosc2.so.2 => /nix/store/nagq9kg0b6m2yrxn30v15pz5sa44w3f1-blosc2-v2.4.3/lib/libblosc2.so.2 (0x00007ffff595c000)
libbz2.so.1 => /nix/store/61rpfcaxhyqfmnk5qp4z7hf20wh9zgrk-bzip2-1.0.8/lib/libbz2.so.1 (0x00007ffff5947000)
libc.so.6 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libc.so.6 (0x00007ffff6be6000)
libdl.so.2 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libdl.so.2 (0x00007ffff6bdf000)
libevent_core-2.1.so.7 => /nix/store/icmm0jx9al1dhr60fh4mmvi5sqxl6wh9-libevent-2.1.12/lib/libevent_core-2.1.so.7 (0x00007ffff5b0e000)
libevent_pthreads-2.1.so.7 => /nix/store/icmm0jx9al1dhr60fh4mmvi5sqxl6wh9-libevent-2.1.12/lib/libevent_pthreads-2.1.so.7 (0x00007ffff5b07000)
libfabric.so.1 => /nix/store/jv6kda0z8m9kw5kvs8inhdgxwasp431f-libfabric-1.15.1/lib/libfabric.so.1 (0x00007ffff5daf000)
libgcc_s.so.1 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libgcc_s.so.1 (0x00007ffff6def000)
libhdf5.so.100.1.0 => /nix/store/skqp7rnc98qyslxg8231s8yhg4p8483w-hdf5-1.10.1/lib/libhdf5.so.100.1.0 (0x00007ffff75cb000)
libhwloc.so.15 => /nix/store/jwbh8kj703ns9p7cdcsxg2kl1ggaw7va-hwloc-2.8.0-lib/lib/libhwloc.so.15 (0x00007ffff5b45000)
libibverbs.so.1 => /nix/store/bl6qfz0vqf4l9zd3hx0y29v7rvym6b8p-rdma-core-43.0/lib/libibverbs.so.1 (0x00007ffff5d6e000)
libm.so.6 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libm.so.6 (0x00007ffff6e09000)
libmpi.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libmpi.so.40 (0x00007ffff70ff000)
libnl-3.so.200 => /nix/store/i5k5d396psw59zvgmy9r6qzmsckgz2vh-libnl-3.7.0/lib/libnl-3.so.200 (0x00007ffff5c54000)
libnl-route-3.so.200 => /nix/store/i5k5d396psw59zvgmy9r6qzmsckgz2vh-libnl-3.7.0/lib/libnl-route-3.so.200 (0x00007ffff5bc1000)
libnuma.so.1 => /nix/store/94kqdwqz1qdlcv5y07hsrs0z1a5dgqpd-numactl-2.0.16/lib/libnuma.so.1 (0x00007ffff5840000)
libopen-pal.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libopen-pal.so.40 (0x00007ffff60d6000)
libopen-rte.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libopen-rte.so.40 (0x00007ffff621a000)
libpmix.so.2 => /nix/store/f80qm7xlg6q4rh9hd35rxll6vhxk3qvb-pmix-3.2.3/lib/libpmix.so.2 (0x00007ffff5c78000)
libpsm2.so.2 => /nix/store/9hj5fhj0fpfxcsiyyh36c1jz2bh6ab2p-libpsm2-11.2.229/lib/libpsm2.so.2 (0x00007ffff6342000)
libpthread.so.0 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libpthread.so.0 (0x00007ffff583b000)
librdmacm.so.1 => /nix/store/bl6qfz0vqf4l9zd3hx0y29v7rvym6b8p-rdma-core-43.0/lib/librdmacm.so.1 (0x00007ffff5d8f000)
librt.so.1 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/librt.so.1 (0x00007ffff584e000)
libstdc++.so.6 => /nix/store/b13h86pg7lbf6vpc1vwzw6akmakyw1bs-gcc-11.3.0-lib/lib/libstdc++.so.6 (0x00007ffff6ee9000)
libucm.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucm.so.0 (0x00007ffff5f2e000)
libucp.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucp.so.0 (0x00007ffff5f97000)
libucs.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucs.so.0 (0x00007ffff5ec1000)
libuct.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libuct.so.0 (0x00007ffff5f4e000)
libz.so.1 => /nix/store/fblaj5ywkgphzpp5kx41av32kls9256y-zlib-1.2.13/lib/libz.so.1 (0x00007ffff5ba3000)
linux-vdso.so.1 (0x00007ffff7fc5000)
> ldd /nix/store/maynhrzavj8xzyxrr7i22xf47jxq22g8-Lammps-8Feb2023/lib/liblammps.so | sort
/nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib64/ld-linux-x86-64.so.2 (0x00007ffff7fc6000)
libatomic.so.1 => /nix/store/b13h86pg7lbf6vpc1vwzw6akmakyw1bs-gcc-11.3.0-lib/lib/libatomic.so.1 (0x00007fffedfbc000)
libbfd-2.39.so => /nix/store/7c8vx9wngib658cfx5pnnfi370a37ppm-libbfd-2.39/lib/libbfd-2.39.so (0x00007fffedfc9000)
libc.so.6 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libc.so.6 (0x00007fffeef17000)
libcrypt.so.1 => /nix/store/9r9v2agfvn1zaifqjwyi9db67p48z0sd-libxcrypt-4.4.30/lib/libcrypt.so.1 (0x00007fffef4ec000)
libcuda.so.1 => /.singularity.d/libs/libcuda.so.1 (0x00007fffef54d000)
libcudart.so.11.0 => /nix/store/cfwcn5kvvcg2j13hvf9cv7siwvkjgvni-cudatoolkit-11.7.0-lib/lib/libcudart.so.11.0 (0x00007fffef200000)
libdl.so.2 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libdl.so.2 (0x00007fffef548000)
libevent_core-2.1.so.7 => /nix/store/icmm0jx9al1dhr60fh4mmvi5sqxl6wh9-libevent-2.1.12/lib/libevent_core-2.1.so.7 (0x00007fffee6b9000)
libevent_pthreads-2.1.so.7 => /nix/store/icmm0jx9al1dhr60fh4mmvi5sqxl6wh9-libevent-2.1.12/lib/libevent_pthreads-2.1.so.7 (0x00007fffee6b2000)
libfabric.so.1 => /nix/store/jv6kda0z8m9kw5kvs8inhdgxwasp431f-libfabric-1.15.1/lib/libfabric.so.1 (0x00007fffee93a000)
libgcc_s.so.1 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libgcc_s.so.1 (0x00007fffef527000)
libhwloc.so.15 => /nix/store/jwbh8kj703ns9p7cdcsxg2kl1ggaw7va-hwloc-2.8.0-lib/lib/libhwloc.so.15 (0x00007fffee6f0000)
libibverbs.so.1 => /nix/store/bl6qfz0vqf4l9zd3hx0y29v7rvym6b8p-rdma-core-43.0/lib/libibverbs.so.1 (0x00007fffee919000)
libm.so.6 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libm.so.6 (0x00007fffef120000)
libmpi.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libmpi.so.40 (0x00007ffff131b000)
libnl-3.so.200 => /nix/store/i5k5d396psw59zvgmy9r6qzmsckgz2vh-libnl-3.7.0/lib/libnl-3.so.200 (0x00007fffee7ff000)
libnl-route-3.so.200 => /nix/store/i5k5d396psw59zvgmy9r6qzmsckgz2vh-libnl-3.7.0/lib/libnl-route-3.so.200 (0x00007fffee76c000)
libnuma.so.1 => /nix/store/94kqdwqz1qdlcv5y07hsrs0z1a5dgqpd-numactl-2.0.16/lib/libnuma.so.1 (0x00007fffee6a4000)
libomp.so => /nix/store/srddjzm4hdvyiw0k7il4j65mimcfs4a4-openmp-11.1.0/lib/libomp.so (0x00007ffff1235000)
libopen-pal.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libopen-pal.so.40 (0x00007fffeec41000)
libopen-rte.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libopen-rte.so.40 (0x00007fffeed85000)
libpmix.so.2 => /nix/store/f80qm7xlg6q4rh9hd35rxll6vhxk3qvb-pmix-3.2.3/lib/libpmix.so.2 (0x00007fffee825000)
libpsm2.so.2 => /nix/store/9hj5fhj0fpfxcsiyyh36c1jz2bh6ab2p-libpsm2-11.2.229/lib/libpsm2.so.2 (0x00007fffeeead000)
libpthread.so.0 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libpthread.so.0 (0x00007ffff122e000)
libpython3.10.so.1.0 => /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0 (0x00007ffff164e000)
librdmacm.so.1 => /nix/store/bl6qfz0vqf4l9zd3hx0y29v7rvym6b8p-rdma-core-43.0/lib/librdmacm.so.1 (0x00007fffef4aa000)
librt.so.1 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/librt.so.1 (0x00007fffef543000)
libucm.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucm.so.0 (0x00007fffef4ca000)
libucp.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucp.so.0 (0x00007fffeeb02000)
libucs.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucs.so.0 (0x00007fffeea4c000)
libuct.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libuct.so.0 (0x00007fffeeab9000)
libz.so.1 => /nix/store/fblaj5ywkgphzpp5kx41av32kls9256y-zlib-1.2.13/lib/libz.so.1 (0x00007fffee74e000)
linux-vdso.so.1 (0x00007ffff7fc5000)
Here is the diff of both shared objects' dependencies:
> libadios2_atl.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_atl.so.2
> libadios2_core.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_core.so.2
> libadios2_core_mpi.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_core_mpi.so.2
> libadios2_cxx11.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_cxx11.so.2
> libadios2_cxx11_mpi.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_cxx11_mpi.so.2
> libadios2_dill.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_dill.so.2
> libadios2_evpath.so => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_evpath.so
> libadios2_ffs.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_ffs.so.2
> libadios2_perfstubs.so => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_perfstubs.so
3a13,14
> libblosc2.so.2 => /nix/store/nagq9kg0b6m2yrxn30v15pz5sa44w3f1-blosc2-v2.4.3/lib/libblosc2.so.2
> libbz2.so.1 => /nix/store/61rpfcaxhyqfmnk5qp4z7hf20wh9zgrk-bzip2-1.0.8/lib/libbz2.so.1
5,7d15
< libcrypt.so.1 => /nix/store/9r9v2agfvn1zaifqjwyi9db67p48z0sd-libxcrypt-4.4.30/lib/libcrypt.so.1
< libcuda.so.1 => /.singularity.d/libs/libcuda.so.1
< libcudart.so.11.0 => /nix/store/cfwcn5kvvcg2j13hvf9cv7siwvkjgvni-cudatoolkit-11.7.0-lib/lib/libcudart.so.11.0
12a21
> libhdf5.so.100.1.0 => /nix/store/skqp7rnc98qyslxg8231s8yhg4p8483w-hdf5-1.10.1/lib/libhdf5.so.100.1.0
20d28
< libomp.so => /nix/store/srddjzm4hdvyiw0k7il4j65mimcfs4a4-openmp-11.1.0/lib/libomp.so
26d33
< libpython3.10.so.1.0 => /nix/store/5axq6aw8j3vcs2m7gi440cwpcckl7ql9-python3-3.10.9/lib/libpython3.10.so.1.0
28a36
> libstdc++.so.6 => /nix/store/b13h86pg7lbf6vpc1vwzw6akmakyw1bs-gcc-11.3.0-lib/lib/libstdc++.so.6
However, Lammps is built with nvcc+gcc11.3.0 while openPMD is directly built with gcc11.3.0.
What seems weird is that Lammps does not link to libstdc++.so.6 at all, but somehow still carries its symbols.
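The dependency comparison above can also be scripted. The following is a sketch only: `parse_ldd` and `only_in_first` are hypothetical helper names, and the two sample strings are shortened, illustrative stand-ins for the real `ldd` outputs (the paths are made up).

```python
def parse_ldd(output: str) -> dict:
    """Map each library name in `ldd` output to its resolved path (or None)."""
    libs = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing load address, e.g. "(0x00007ffff7fc6000)".
        if line.endswith(")") and " (" in line:
            line = line[: line.rindex(" (")]
        if " => " in line:
            name, path = line.split(" => ", 1)
            libs[name.strip()] = path.strip()
        else:
            # Entries such as linux-vdso.so.1 have no "=>" part.
            libs[line] = None
    return libs


def only_in_first(first: dict, second: dict) -> list:
    """Library names that appear in `first` but not in `second`."""
    return sorted(set(first) - set(second))


# Illustrative samples, not the real outputs from this thread.
sample_openpmd = """
libc.so.6 => /example/lib/libc.so.6 (0x00007ffff6be6000)
libstdc++.so.6 => /example/lib/libstdc++.so.6 (0x00007ffff6ee9000)
linux-vdso.so.1 (0x00007ffff7fc5000)
"""
sample_lammps = """
libc.so.6 => /example/lib/libc.so.6 (0x00007fffeef17000)
libcudart.so.11.0 => /example/lib/libcudart.so.11.0 (0x00007fffef200000)
linux-vdso.so.1 (0x00007ffff7fc5000)
"""

openpmd_libs = parse_ldd(sample_openpmd)
lammps_libs = parse_ldd(sample_lammps)
print("only openPMD:", only_in_first(openpmd_libs, lammps_libs))
print("only Lammps:", only_in_first(lammps_libs, openpmd_libs))
```

Comparing by library name rather than by full line avoids false differences caused by the load addresses.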
You wrote: "I suspect that something in the lammps build exposes or overwrites symbols of the stdlib or some other incompatibility in build toolchains is going on."
So we should probably ask the Lammps developers if their code does anything that could be causing this?
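One way to check whether a shared object carries C++ runtime symbols itself is to list its defined dynamic symbols with GNU `nm -D --defined-only` and filter for mangled `std::` names and `__cxa_*` functions. This is a sketch under assumptions: the prefix list is a rough heuristic, the helper names are hypothetical, and the sample text is illustrative rather than taken from the real liblammps.so.

```python
import subprocess

# Rough heuristic: Itanium-ABI manglings of std:: entities and the
# C++ exception runtime entry points.
CXX_RUNTIME_PREFIXES = ("_ZSt", "_ZNSt", "_ZNKSt", "__cxa_")


def cxx_runtime_symbols(nm_output: str) -> list:
    """Return symbol names in `nm` output that look like C++ runtime symbols."""
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) != 3:  # expect "<address> <type> <name>"
            continue
        _address, _kind, name = parts
        if name.startswith(CXX_RUNTIME_PREFIXES):
            found.append(name)
    return found


def scan_library(path: str) -> list:
    """Run GNU nm on `path` and filter its defined dynamic symbols."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        capture_output=True, text=True, check=True,
    )
    return cxx_runtime_symbols(result.stdout)


# Illustrative sample, not real output.
sample_nm = """\
0000000000a1b2c0 T __cxa_throw
0000000000a1b300 W _ZNSt6localeD1Ev
0000000000001000 T lammps_open
"""
print(cxx_runtime_symbols(sample_nm))
```

Calling `scan_library` on the Lammps library path would then show whether frames such as `__cxa_throw` in the backtrace are really defined inside that file.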
I have tried looking into this once more, and I think that I have found out what caused the issue on my end. Since the symptoms on your end seem to be the same, it's likely that we're looking at the same thing here.
In the failing build environment, I had built Lammps with NVCC, but my Kokkos build used Clang (I had had issues with a gcc build and picked Clang as an alternative). So openPMD-api and Lammps were referring to two different C++ standard libraries that are ABI-incompatible but use the same symbols. Since one symbol cannot exist twice in the same application context, whichever library loads its symbols first wins. Hence the error can be suppressed by adding an early import openpmd_api.
I tested my environment from back then again and can still reproduce the issue. After setting up a new environment that builds Kokkos and openPMD-api both with the same software stack (gcc+nvcc / gcc), the script runs fine without an error.
TLDR: This is likely not a bug, but rather a wrong software environment with incompatible dependencies. @RandomDefaultUser, do you still know how you had set up your environment for this bug to occur? It would also be interesting to see if adding an import openpmd_api can suppress this issue for you as well.
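To see whether more than one C++ runtime ends up in the same Python process, one can inspect `/proc/self/maps` on Linux after all imports have run. The sketch below uses hypothetical helper names and an illustrative sample. It only detects runtimes loaded as separate shared objects; symbols embedded in another library would not show up here. The listing is ordered by address, not by load order.

```python
from pathlib import Path


def mapped_shared_objects(maps_text: str) -> list:
    """Unique shared-object paths found in /proc/<pid>/maps content."""
    seen = []
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        if ".so" in path and path not in seen:
            seen.append(path)
    return seen


def cxx_runtimes(paths: list) -> list:
    """Keep only paths that look like a C++ standard library."""
    return [p for p in paths if "libstdc++" in p or "libc++" in p]


# Illustrative sample, not real output.
sample_maps = """\
7ffff6ee9000-7ffff6f00000 r-xp 00000000 08:01 42 /opt/gcc/lib/libstdc++.so.6
7ffff6f00000-7ffff6f10000 rw-p 00017000 08:01 42 /opt/gcc/lib/libstdc++.so.6
7ffff7000000-7ffff7001000 rw-p 00000000 00:00 0
7ffff7fc5000-7ffff7fc6000 r-xp 00000000 00:00 0 [vdso]
7ffe0a000000-7ffe0a100000 r-xp 00000000 08:01 77 /opt/clang/lib/libc++.so.1
"""
print(cxx_runtimes(mapped_shared_objects(sample_maps)))

# On Linux, the same check can be applied to the running interpreter.
maps_file = Path("/proc/self/maps")
if maps_file.exists():
    print(cxx_runtimes(mapped_shared_objects(maps_file.read_text())))
```

Two different entries in the result (for example one libstdc++ and one libc++) would be consistent with the mixed-toolchain explanation above.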
When investigating a problem with the test pipeline, I stumbled upon the fact that attempting an OpenPMD write after LAMMPS has been used in any capacity will result in a crash. An MWE to reproduce this problem (assuming the model from the basic examples is present) is:
This results in
For good measure one may throw in a mala.finalize() before the OpenPMD part, which calls the lammps.finalize() function, but this does not affect the error in any way.