Cantera / cantera

Chemical kinetics, thermodynamics, and transport tool suite
https://cantera.org

Build from source on Power PC 9: linking problem and segmentation error #1627

Closed CharlelieLrt closed 9 months ago

CharlelieLrt commented 9 months ago

Problem description

Cannot build Cantera from source on the Power PC 9 architecture, using either conda packages or Cantera's submodules for the dependencies. The result is either an undefined symbol from a linked library or a segmentation fault.

Steps to reproduce

I am attempting to build Cantera from source on a Power PC 9 system. I am working in a conda environment, created with the following config:

name: ct-env
channels:
  - defaults
  - conda-forge
dependencies:
- python  # Cantera supports Python 3.8 and up
- scons  # build system
- boost-cpp  # C++ dependency
- hdf5  # optional C++ dependency
- highfive  # C++ dependency; uncomment to override Cantera default
- sundials  # uncomment to override Cantera default
- fmt  # uncomment to override Cantera default
- eigen  # uncomment to override Cantera default
- yaml-cpp  # uncomment to override Cantera default
- libgomp  # optional (OpenMP implementation when using GCC)
- cython  # needed to build Python package
- numpy  # needed to build Python package
- pip  # needed to build Python package
- wheel  # needed to build Python package
- setuptools  # needed to build Python package
- pytest  # needed for the Python test suite
- ruamel.yaml  # needed for converter scripts
- scipy  # optional (needed for some examples)
- matplotlib  # optional (needed for plots)

Cantera 3.0 is built from source using the following scons config:

f90_interface = 'n'
system_eigen = 'y'
system_fmt = 'y'
hdf_support = 'n'
system_highfive = 'y'
system_yamlcpp = 'y'
system_sundials = 'y'
system_blas_lapack = 'y'
extra_lib_dirs = '$CONDA_PREFIX/lib'

Note that this uses the conda-provided packages for yaml-cpp, fmt, eigen, sundials, blas and lapack.

After the build completes, running import cantera in Python gives an undefined symbol error related to YAML::Emitter::Write(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) from the library .conda/envs/kinetics/lib/libyaml-cpp.so.0.7. I checked, and this version of yaml-cpp (v0.7) does not define this symbol. Older versions of yaml-cpp (e.g. v5.0) do define the missing symbol.
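The "does this library define the symbol" check described above can be scripted. The following is a minimal sketch, assuming GNU binutils' nm is installed; the helper names exported_symbols and library_defines are hypothetical and not part of Cantera or conda.

```python
import shutil
import subprocess


def exported_symbols(nm_output: str) -> set:
    """Parse `nm -D --defined-only -C` output into a set of symbol names."""
    symbols = set()
    for line in nm_output.splitlines():
        # Each line has the form "<address> <type> <demangled name>";
        # the demangled name may itself contain spaces.
        parts = line.strip().split(maxsplit=2)
        if len(parts) == 3:
            symbols.add(parts[2])
    return symbols


def library_defines(lib_path: str, needle: str) -> bool:
    """Return True if any symbol exported by lib_path contains `needle`."""
    if shutil.which("nm") is None:
        raise RuntimeError("nm (binutils) not found on PATH")
    result = subprocess.run(
        ["nm", "-D", "--defined-only", "-C", lib_path],
        capture_output=True, text=True, check=True,
    )
    return any(needle in name for name in exported_symbols(result.stdout))
```

Searching for a short prefix such as "YAML::Emitter::Write(" rather than the full signature lists every overload the library exports, which makes it easier to see how the available signatures differ from the one the loader is asking for.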

I therefore decided to use cantera's Git submodule for yaml-cpp, instead of the one provided by conda. After doing so, I am getting a new undefined symbol, but this time related to the symbol fmt::v9::vformat(fmt::v9::basic_string_view<char>, fmt::v9::basic_format_args<fmt::v9::basic_format_context<fmt::v9::appender, char> >) from the library .conda/envs/kinetics/lib/libfmt.so.9. Once again I checked and fmt-cpp v9 does not define this symbol.

So I once again decided to use Cantera's submodule for fmt instead of the one provided by conda. After rebuilding Cantera, I am able to run import cantera from a Python script. However, running a Python test gives me a segmentation fault:

0x0000200815c3e10c in std::vector<std::filesystem::path::_Cmpt, std::allocator<std::filesystem::path::_Cmpt> >::~vector (this=0x38, __in_chrg=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:565
565       ~vector() _GLIBCXX_NOEXCEPT

System information

speth commented 9 months ago

I notice in your cantera.conf you have extra_lib_dirs = '$CONDA_PREFIX/lib'. But do you also have extra_inc_dirs = '$CONDA_PREFIX/include'? If not, I'm not sure where the compilation process is finding the headers, and mixing headers from one location with libraries from another would certainly explain segfaults.
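For illustration, the two options mentioned here would sit side by side in cantera.conf. This fragment only restates the values quoted in this thread; the rest of the configuration is omitted.

```python
# cantera.conf fragment (plain `key = value` assignments, as in the
# configuration posted above). Both paths point into the same conda
# environment so that headers and libraries come from one installation.
extra_inc_dirs = '$CONDA_PREFIX/include'
extra_lib_dirs = '$CONDA_PREFIX/lib'
```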

Searching for the type mentioned in the segfault, std::filesystem::path::_Cmpt leads me to the following StackOverflow post, which seems very relevant: https://stackoverflow.com/questions/56615841/passing-stdfilesystempath-to-a-function-segfaults -- it seems that G++ 8.x uses a separate library to provide std::filesystem, a quirk that has been removed in more recent versions. Our testing is limited to GCC 9 and newer, as noted in the build requirements.
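The GCC 8 quirk can be summarized in a few lines. This is only a sketch of the rule, not something Cantera's build system is known to contain, and the function name filesystem_link_flags is hypothetical.

```python
import re


def filesystem_link_flags(gcc_version: str) -> list:
    """Extra linker flags needed to use std::filesystem with a given GCC."""
    match = re.match(r"(\d+)", gcc_version.strip())
    if match is None:
        raise ValueError(f"cannot parse GCC version: {gcc_version!r}")
    major = int(match.group(1))
    if major < 8:
        raise ValueError("GCC < 8 does not ship the C++17 <filesystem> header")
    # GCC 8 keeps std::filesystem in a separate static library (libstdc++fs);
    # from GCC 9 on it is part of libstdc++ itself.
    return ["-lstdc++fs"] if major == 8 else []


print(filesystem_link_flags("8.3.1"))
print(filesystem_link_flags("11.2.0"))
```

Code compiled against the GCC 8 headers therefore expects a different std::filesystem implementation than the one built into a newer libstdc++, which is one way the two can end up mixed.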

CharlelieLrt commented 9 months ago

@speth thank you for the suggestions.

So, I made sure to include a newer version of gcc in my conda environment. I am now using gcc (Anaconda gcc) 11.2.0. In addition, I made sure to add extra_inc_dirs = '$CONDA_PREFIX/include' to cantera.conf. So I am now using Cantera's submodules for yaml-cpp and fmt, and conda packages for all other dependencies.

However, this did not solve the segmentation fault. When running a simple reactor test case, I get the same error:

Program received signal SIGSEGV, Segmentation fault.
0x0000200000d346ec in std::vector<std::filesystem::path::_Cmpt, std::allocator<std::filesystem::path::_Cmpt> >::~vector (this=0x38, __in_chrg=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:565
565       ~vector() _GLIBCXX_NOEXCEPT

Something strange is that it is still using the stl_vector.h header from my system's gcc install, instead of my conda's gcc. Actually it is not that strange, as my $CONDA_PREFIX/include does not contain the header stl_vector.h. Is there another dependency missing from my conda environment?
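The source paths gdb prints come from debug information recorded when the code was compiled, so they show which headers the compiler actually used. A minimal sketch for pulling those paths out of a saved backtrace and flagging the ones outside the conda environment follows; foreign_sources is a hypothetical helper, and /path/to/conda/env is a placeholder for the real value of $CONDA_PREFIX.

```python
import re

# Matches the "at <file>:<line>" suffix that gdb prints for a frame.
_AT_RE = re.compile(r"\bat (\S+):\d+")


def foreign_sources(backtrace: str, prefix: str) -> set:
    """Return absolute source paths in a gdb backtrace that lie outside prefix."""
    paths = set(_AT_RE.findall(backtrace))
    root = prefix.rstrip("/") + "/"
    # Relative paths (e.g. src/base/AnyMap.cpp) belong to the project itself.
    return {p for p in paths if p.startswith("/") and not p.startswith(root)}


# Two frames copied from the backtrace in this thread.
frames = (
    "#21 ~vector (this=0x7fffffffac08, __in_chrg=<optimized out>) "
    "at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:567\n"
    "#23 Cantera::AnyMap::fromYamlFile (name=..., parent_name=...) "
    "at src/base/AnyMap.cpp:1789\n"
)
print(foreign_sources(frames, "/path/to/conda/env"))
```

Any path reported here that belongs to a different compiler installation suggests that the build did not use the compiler from the conda environment.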

Here is a full backtrace if that is of any use:

#0  0x0000200000d346ec in std::vector<std::filesystem::path::_Cmpt, std::allocator<std::filesystem::path::_Cmpt> >::~vector (this=0x38, __in_chrg=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:565
#1  0x0000200000d34728 in ~path (this=0x30, __in_chrg=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:565
#2  ~_Cmpt (this=0x30, __in_chrg=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/fs_path.h:643
#3  _Destroy<std::filesystem::path::_Cmpt> (__pointer=0x30) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:98
#4  __destroy<std::filesystem::path::_Cmpt*> (__last=<optimized out>, __first=0x30) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:108
#5  _Destroy<std::filesystem::path::_Cmpt*> (__last=<optimized out>, __first=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:137
#6  _Destroy<std::filesystem::path::_Cmpt*, std::filesystem::path::_Cmpt> (__last=0x9d1, __first=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:206
#7  std::vector<std::filesystem::path::_Cmpt, std::allocator<std::filesystem::path::_Cmpt> >::~vector (this=0x106e4410, __in_chrg=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:567
#8  0x0000200000d34728 in ~path (this=0x106e4408, __in_chrg=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:565
#9  ~_Cmpt (this=0x106e4408, __in_chrg=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/fs_path.h:643
#10 _Destroy<std::filesystem::path::_Cmpt> (__pointer=0x106e4408) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:98
#11 __destroy<std::filesystem::path::_Cmpt*> (__last=<optimized out>, __first=0x106e4408) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:108
#12 _Destroy<std::filesystem::path::_Cmpt*> (__last=<optimized out>, __first=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:137
#13 _Destroy<std::filesystem::path::_Cmpt*, std::filesystem::path::_Cmpt> (__last=0x3, __first=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:206
#14 std::vector<std::filesystem::path::_Cmpt, std::allocator<std::filesystem::path::_Cmpt> >::~vector (this=0x10c7ba38, __in_chrg=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:567
#15 0x0000200000d30724 in ~path (this=0x10c7ba30, __in_chrg=<optimized out>) at src/base/AnyMap.cpp:1789
#16 ~_Cmpt (this=0x10c7ba30, __in_chrg=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/fs_path.h:643
#17 _Destroy<std::filesystem::path::_Cmpt> (__pointer=0x10c7ba30) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:98
#18 __destroy<std::filesystem::path::_Cmpt*> (__last=<optimized out>, __first=0x10c7ba30) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:108
#19 _Destroy<std::filesystem::path::_Cmpt*> (__last=<optimized out>, __first=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:137
#20 _Destroy<std::filesystem::path::_Cmpt*, std::filesystem::path::_Cmpt> (__last=0x0, __first=<optimized out>)
    at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_construct.h:206
#21 ~vector (this=0x7fffffffac08, __in_chrg=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/stl_vector.h:567
#22 ~path (this=0x7fffffffac00, __in_chrg=<optimized out>) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/fs_path.h:208
#23 Cantera::AnyMap::fromYamlFile (name=..., parent_name=...) at src/base/AnyMap.cpp:1789
#24 0x0000200000d8e740 in Cantera::newSolution (infile=..., name=..., transport=..., adjacent=...) at src/base/Solution.cpp:200
#25 0x000020000098d2c0 in __pyx_pf_7cantera_12solutionbase_13_SolutionBase_8_init_yaml (__pyx_v_transport=0x105bedb8 <_PyRuntime+22304>, __pyx_v_source=<optimized out>, 
    __pyx_v_adjacent=<optimized out>, __pyx_v_name=<optimized out>, __pyx_v_infile=<optimized out>, __pyx_v_self=<optimized out>)
    at build/python/cantera/solutionbase.cpp:10134
#26 __pyx_pw_7cantera_12solutionbase_13_SolutionBase_9_init_yaml (__pyx_v_self=<optimized out>, __pyx_args=<optimized out>, __pyx_nargs=<optimized out>, 
    __pyx_kwds=<optimized out>) at build/python/cantera/solutionbase.cpp:9778
#27 0x000020000096f814 in __Pyx_CyFunction_Vectorcall_FASTCALL_KEYWORDS (func=<optimized out>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>)
    at build/python/cantera/solutionbase.cpp:25478
#28 0x0000200000972fac in __Pyx_PyObject_FastCallDict (func=0x200815522810, args=0x7fffffffb7b0, _nargs=<optimized out>, kwargs=0x0)
    at build/python/cantera/solutionbase.cpp:23497
#29 0x0000200000984fe0 in __pyx_pf_7cantera_12solutionbase_13_SolutionBase_2_cinit (__pyx_v_kwargs=0x2008155d5d80, __pyx_v_reactions=<optimized out>, 
    __pyx_v_kinetics=<optimized out>, __pyx_v_species=<optimized out>, __pyx_v_thermo=<optimized out>, __pyx_v_yaml=<optimized out>, __pyx_v_origin=<optimized out>, 
    __pyx_v_adjacent=0x105c7cb0 <_PyRuntime+58904>, __pyx_v_name=0x105bedb8 <_PyRuntime+22304>, __pyx_v_infile=0x2000004d6430, __pyx_v_self=0x2008155796f0)
    at build/python/cantera/solutionbase.cpp:8955
#30 __pyx_pw_7cantera_12solutionbase_13_SolutionBase_3_cinit (__pyx_v_self=0x2008155796f0, __pyx_args=<optimized out>, __pyx_nargs=<optimized out>, __pyx_kwds=<optimized out>)
    at build/python/cantera/solutionbase.cpp:8540
#31 0x000020000096f814 in __Pyx_CyFunction_Vectorcall_FASTCALL_KEYWORDS (func=<optimized out>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>)
    at build/python/cantera/solutionbase.cpp:25478
#32 0x00000000100ae96c in method_vectorcall ()
#33 0x00000000100a94d8 in _PyVectorcall_Call ()
#34 0x000020000096afd4 in __Pyx_PyObject_Call (func=0x2000005df800, arg=0x105c7cb0 <_PyRuntime+58904>, kw=0x200000532d80) at build/python/cantera/solutionbase.cpp:22748
#35 0x0000200000987888 in __pyx_pf_7cantera_12solutionbase_13_SolutionBase___cinit__ (__pyx_v_kwargs=0x20000053f500, __pyx_v_init=<optimized out>, 
    __pyx_v_reactions=<optimized out>, __pyx_v_kinetics=<optimized out>, __pyx_v_species=<optimized out>, __pyx_v_thermo=<optimized out>, __pyx_v_yaml=<optimized out>, 
    __pyx_v_origin=<optimized out>, __pyx_v_adjacent=<optimized out>, __pyx_v_name=<optimized out>, __pyx_v_infile=<optimized out>, __pyx_v_self=0x2008155796f0)
    at build/python/cantera/solutionbase.cpp:8094
#36 __pyx_pw_7cantera_12solutionbase_13_SolutionBase_1__cinit__ (__pyx_kwds=<optimized out>, __pyx_args=<optimized out>, __pyx_v_self=0x2008155796f0)
    at build/python/cantera/solutionbase.cpp:7965
#37 __pyx_tp_new_7cantera_12solutionbase__SolutionBase (t=<optimized out>, a=<optimized out>, k=<optimized out>) at build/python/cantera/solutionbase.cpp:19490
#38 0x00002000009a1bdc in __pyx_tp_new_7cantera_6thermo_ThermoPhase (t=<optimized out>, a=<optimized out>, k=<optimized out>) at build/python/cantera/thermo.cpp:43048
#39 0x000000001014c100 in type_call ()
#40 0x00000000100a9b64 in _PyObject_MakeTpCall ()
#41 0x00000000100aa6f0 in PyObject_Vectorcall ()
#42 0x0000000010023e5c in _PyEval_EvalFrameDefault ()
#43 0x0000000010212f64 in _PyEval_Vector ()
#44 0x0000000010213058 in PyEval_EvalCode ()
#45 0x0000000010279f20 in run_eval_code_obj ()
#46 0x000000001027a40c in run_mod ()
#47 0x000000001027a5d8 in pyrun_file ()
#48 0x000000001027e9cc in _PyRun_SimpleFileObject ()
#49 0x000000001027f278 in _PyRun_AnyFileObject ()
#50 0x00000000102ae43c in Py_RunMain ()
#51 0x00000000102aec68 in pymain_main ()
#52 0x00000000102aee44 in Py_BytesMain ()
#53 0x000000001001ea28 in main ()

speth commented 9 months ago

How sure are you that it's actually using Anaconda g++ binary (not just gcc - there is a separate gxx package)? I don't think a nonstandard path like /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/ should be on the search path if you are using that. Is /usr/tce part of some module system? Can you unload the GCC 8.3 module?
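One way to answer this question is to check where gcc and g++ resolve on PATH relative to $CONDA_PREFIX. The sketch below uses only the Python standard library; is_under and check_compilers are hypothetical helper names. It inspects PATH only, so a build configured with an explicit compiler path would need to be checked separately.

```python
import os
import shutil


def is_under(path: str, prefix: str) -> bool:
    """True if path lies inside the directory tree rooted at prefix."""
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


def check_compilers(names=("gcc", "g++")):
    """Report where each compiler on PATH resolves relative to $CONDA_PREFIX."""
    prefix = os.environ.get("CONDA_PREFIX")
    report = {}
    for name in names:
        found = shutil.which(name)
        if found is None:
            report[name] = "not found on PATH"
        elif prefix and is_under(found, prefix):
            report[name] = f"conda: {found}"
        else:
            report[name] = f"outside conda env: {found}"
    return report


if __name__ == "__main__":
    for name, status in check_compilers().items():
        print(f"{name}: {status}")
```

If gcc resolves inside the environment but g++ does not, the C++ sources are being compiled by a different toolchain than expected, which is the situation this comment asks about.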

CharlelieLrt commented 9 months ago

I did indeed have a module system that was loading gcc 8.3.1 by default, so I made sure to purge all loaded modules. In addition, the gxx package was missing from my conda environment, so I added it. These two changes seem to have fixed the problem, and I am able to run my cases.

Thanks for your help, I am closing the issue.