ginkgo-project / ginkgo

Numerical linear algebra software package
https://ginkgo-project.github.io/
BSD 3-Clause "New" or "Revised" License

segfault in jacobi_kernels test with Ginkgo 1.3.0 built with GCC 10.2.0 #732

Closed boegel closed 3 years ago

boegel commented 3 years ago

I'm seeing failing tests when building Ginkgo 1.3.0 with GCC 10.2.0 using EasyBuild, on various systems.

I've briefly discussed this with @upsj, who asked me to open an issue with all details included.

Failing test, when building with CMake 3.18.4 + make on top of CUDA 11.1.1, then running the tests with make test:

$ make test
...
The following tests FAILED:
    152 - omp/test/preconditioner/jacobi_kernels (SEGFAULT)
Errors while running CTest
make: *** [test] Error 8

A very similar problem happens on another system (RHEL 8.2, AMD Rome, AMD EPYC 7552), without CUDA, when using CMake 3.18.4 + Ninja 1.10.1, same compiler options:

$ ninja test
...
The following tests FAILED:
    116 - omp/test/preconditioner/jacobi_kernels (SEGFAULT)
    118 - omp/test/solver/bicg_kernels (Failed)
Errors while running CTest

Some hopefully useful info collected with GDB:

$ gdb omp/test/preconditioner/jacobi_kernels
(gdb) run
...
[==========] Running 39 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 39 tests from Jacobi
[ RUN      ] Jacobi.OmpFindNaturalBlocksEquivalentToRef
[       OK ] Jacobi.OmpFindNaturalBlocksEquivalentToRef (54 ms)
[ RUN      ] Jacobi.OmpExecutesSupervariableAgglomerationEquivalentToRef
[       OK ] Jacobi.OmpExecutesSupervariableAgglomerationEquivalentToRef (12 ms)
[ RUN      ] Jacobi.OmpFindNaturalBlocksInLargeMatrixEquivalentToRef

Thread 13 "jacobi_kernels" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x155551f5e700 (LWP 1022726)]
0x00001555554ead13 in void gko::kernels::omp::jacobi::generate<double, int>(std::shared_ptr<gko::OmpExecutor const>, gko::matrix::Csr<double, int> const*, unsigned long, unsigned int, gko::detail::remove_complex_impl<double>::type, gko::preconditioner::block_interleaved_storage_scheme<int> const&, gko::Array<gko::detail::remove_complex_impl<double>::type>&, gko::Array<gko::precision_reduction>&, gko::Array<int> const&, gko::Array<double>&) [clone ._omp_fn.0] () from /tmp/slurm.50117758.0/tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/omp/libginkgo_omp.so.1.3.0
(gdb) bt
#0  0x00001555554ead13 in void gko::kernels::omp::jacobi::generate<double, int>(std::shared_ptr<gko::OmpExecutor const>, gko::matrix::Csr<double, int> const*, unsigned long, unsigned int, gko::detail::remove_complex_impl<double>::type, gko::preconditioner::block_interleaved_storage_scheme<int> const&, gko::Array<gko::detail::remove_complex_impl<double>::type>&, gko::Array<gko::precision_reduction>&, gko::Array<int> const&, gko::Array<double>&) [clone ._omp_fn.0] () from /tmp/slurm.50117758.0/tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/omp/libginkgo_omp.so.1.3.0
#1  0x00001555539e3036 in gomp_thread_start (xdata=<optimized out>) at ../../../libgomp/team.c:123
#2  0x00001555537b12de in start_thread () from /lib64/libpthread.so.0
#3  0x0000155553b04e83 in clone () from /lib64/libc.so.6

upsj commented 3 years ago

Hi Kenneth, first of all, thanks for the detailed error description! I had hoped that the CMake and Ninja build information would shed some light on the issue, but I am still unable to reproduce it on either Intel Xeon Gold 6230 (Cascade Lake) or AMD EPYC 7742. I don't see any difference from the build commands output by Ninja, so this is starting to sound more like tiny differences in the compiler ecosystem. Would it be possible for you to check whether the issue still occurs if you build static libraries

cmake -GNinja -DBUILD_SHARED_LIBS=OFF ...
ninja omp/test/preconditioner/jacobi_kernels

and send me the generated binary (maybe even with debug symbols -g enabled)? Then hopefully I can narrow down where the differences come from, or at least check whether I can reproduce the issue with your binary.

boegel commented 3 years ago

@upsj I tried doing a static build, but I'm running into some trouble there: linking errors like "cannot find -lgcc_s", caused by missing static libraries in the GCC installation (which I'm not sure is easy to fix).

I do have a little bit of perhaps useful information though.

first=0x15551c000cf0, last=0x15551c000df0, __alloc=...) at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/bits/alloc_traits.h:728
#14 0x00001555533704be in std::vector<gko::Array, gko::ExecutorAllocator<gko::Array > >::~vector (this=0x155551b1fb40, __in_chrg=<optimized out>)
    at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/bits/stl_vector.h:680
warning: (Internal error: pc 0x15555338145b in read in psymtab, but not in symtab.)
warning: (Internal error: pc 0x155553381102 in read in psymtab, but not in symtab.)
#15 0x000015555338145c in gko::kernels::omp::jacobi::_ZN3gko7kernels3omp6jacobi8generateIdiEEvSt10shared_ptrIKNS_11OmpExecutorEEPKNS_6matrix3CsrIT_T0_EEmjNS_6detail19remove_complex_implISA_E4typeERKNS_14preconditioner32block_interleaved_storage_schemeISB_EERNS_5ArrayISI_EERNSO_INS_19precision_reductionEEERKNSO_ISB_EERNSO_ISA_EE._omp_fn.0(void) ()
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/ginkgo-1.3.0/omp/preconditioner/jacobi_kernels.cpp:376
#16 0x0000155555385036 in gomp_thread_start (xdata=<optimized out>) at ../../../libgomp/team.c:123
#17 0x00001555523622de in start_thread () from /lib64/libpthread.so.0
#18 0x0000155552675e83 in clone () from /lib64/libc.so.6


And with `OMP_NUM_THREADS=2`, here's the GDB backtrace with the `-O0 -g` build:

...
[ RUN      ] Jacobi.OmpConjTransposedPreconditionerEquivalentToRefWithMPW
[       OK ] Jacobi.OmpConjTransposedPreconditionerEquivalentToRefWithMPW (31 ms)
[ RUN      ] Jacobi.OmpApplyEquivalentToRefWithBlockSize32
corrupted double-linked list

Thread 1 "jacobi_kernels" received signal SIGABRT, Aborted.
0x00001555525b170f in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00001555525b170f in raise () from /lib64/libc.so.6
#1  0x000015555259bb25 in abort () from /lib64/libc.so.6
#2  0x00001555525f4897 in __libc_message () from /lib64/libc.so.6
#3  0x00001555525fafdc in malloc_printerr () from /lib64/libc.so.6
#4  0x00001555525fb81c in unlink_chunk.isra () from /lib64/libc.so.6
#5  0x00001555525fb953 in malloc_consolidate () from /lib64/libc.so.6
#6  0x00001555525fdd58 in _int_malloc () from /lib64/libc.so.6
#7  0x00001555525ff662 in malloc () from /lib64/libc.so.6
#8  0x0000155552d77755 in operator new (sz=2048) at ../../../../libstdc++-v3/libsupc++/new_op.cc:50
#9  0x00000000004b5786 in __gnu_cxx::new_allocator<gko::matrix_data<double, int>::nonzero_type>::allocate (this=0x7fffffff1dc0, n=128)
    at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/ext/new_allocator.h:115
#10 0x00000000004b046c in std::allocator_traits<std::allocator<gko::matrix_data<double, int>::nonzero_type> >::allocate (a=..., n=128)
    at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/bits/alloc_traits.h:460
#11 0x00000000004a86c8 in std::_Vector_base<gko::matrix_data<double, int>::nonzero_type, std::allocator<gko::matrix_data<double, int>::nonzero_type> >::_M_allocate (this=0x7fffffff1dc0, __n=128)
    at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/bits/stl_vector.h:346
#12 0x00000000004a4150 in std::vector<gko::matrix_data<double, int>::nonzero_type, std::allocator<gko::matrix_data<double, int>::nonzero_type> >::_M_realloc_insert<unsigned long&, unsigned long&, double> (this=0x7fffffff1dc0, __position=...)
    at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/bits/vector.tcc:440
#13 0x000000000049bcfe in std::vector<gko::matrix_data<double, int>::nonzero_type, std::allocator<gko::matrix_data<double, int>::nonzero_type> >::emplace_back<unsigned long&, unsigned long&, double> (this=0x7fffffff1dc0)
    at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/bits/vector.tcc:121
#14 0x0000000000496490 in gko::test::generate_random_matrix<gko::matrix::Csr<double, int>, std::uniform_int_distribution, std::normal_distribution, std::discard_block_engine<std::subtract_with_carry_engine<unsigned long, 48ul, 5ul, 12ul>, 389ul, 11ul>&>(unsigned long, unsigned long, std::uniform_int_distribution&&, std::normal_distribution&&, std::discard_block_engine<std::subtract_with_carry_engine<unsigned long, 48ul, 5ul, 12ul>, 389ul, 11ul>&, std::shared_ptr)::{lambda(unsigned long)#1}::operator()(unsigned long) const (this=0x7fffffff1d30, col=52)
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/ginkgo-1.3.0/core/test/utils/matrix_generator.hpp:118
#15 0x000000000049bd54 in std::for_each<__gnu_cxx::__normal_iterator<unsigned long*, std::vector<unsigned long, std::allocator > >, gko::test::generate_random_matrix<gko::matrix::Csr<double, int>, std::uniform_int_distribution, std::normal_distribution, std::discard_block_engine<std::subtract_with_carry_engine<unsigned long, 48ul, 5ul, 12ul>, 389ul, 11ul>&>(unsigned long, unsigned long, std::uniform_int_distribution&&, std::normal_distribution&&, std::discard_block_engine<std::subtract_with_carry_engine<unsigned long, 48ul, 5ul, 12ul>, 389ul, 11ul>&, std::shared_ptr)::{lambda(unsigned long)#1}>(__gnu_cxx::__normal_iterator<unsigned long*, std::vector<unsigned long, std::allocator > >, __gnu_cxx::__normal_iterator<unsigned long*, std::vector<unsigned long, std::allocator > >, gko::test::generate_random_matrix<gko::matrix::Csr<double, int>, std::uniform_int_distribution, std::normal_distribution, std::discard_block_engine<std::subtract_with_carry_engine<unsigned long, 48ul, 5ul, 12ul>, 389ul, 11ul>&>(unsigned long, unsigned long, std::uniform_int_distribution&&, std::normal_distribution&&, std::discard_block_engine<std::subtract_with_carry_engine<unsigned long, 48ul, 5ul, 12ul>, 389ul, 11ul>&, std::shared_ptr)::{lambda(unsigned long)#1}) (first=..., last=..., __f=...)
    at /arcanine/scratch/gent/vo/000/gvo00002/vsc40023/easybuild_tests/RHEL8/zen2-ib/software/GCCcore/10.2.0/include/c++/10.2.0/bits/stl_algo.h:3839
#16 0x00000000004966d7 in gko::test::generate_random_matrix<gko::matrix::Csr<double, int>, std::uniform_int_distribution, std::normal_distribution, std::discard_block_engine<std::subtract_with_carry_engine<unsigned long, 48ul, 5ul, 12ul>, 389ul, 11ul>&> (num_rows=128, num_cols=128, nonzero_dist=..., value_dist=..., engine=..., exec=...)
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/ginkgo-1.3.0/core/test/utils/matrix_generator.hpp:116
#17 0x000000000047bf3c in (anonymous namespace)::Jacobi::initialize_data (this=0x56bf60, block_pointers=..., block_precisions=..., condition_numbers=..., max_block_size=32, min_nnz=100, max_nnz=111, num_rhs=1, accuracy=0.10000000000000001)
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/ginkgo-1.3.0/omp/test/preconditioner/jacobi_kernels.cpp:84
#18 0x0000000000480908 in (anonymous namespace)::Jacobi_OmpApplyEquivalentToRefWithBlockSize32_Test::TestBody (this=0x56bf60)
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/ginkgo-1.3.0/omp/test/preconditioner/jacobi_kernels.cpp:357
#19 0x00000000004f71b4 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x56bf60, method=&virtual testing::Test::TestBody(), location=0x513163 "the test body")
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:2437
#20 0x00000000004f1941 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x56bf60, method=&virtual testing::Test::TestBody(), location=0x513163 "the test body")
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:2473
#21 0x00000000004d3c56 in testing::Test::Run (this=0x56bf60) at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:2511
#22 0x00000000004d450c in testing::TestInfo::Run (this=0x56d750) at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:2687
#23 0x00000000004d4b5d in testing::TestCase::Run (this=0x56c530) at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:2805
#24 0x00000000004de874 in testing::internal::UnitTestImpl::RunAllTests (this=0x56c2a0)
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:5131
#25 0x00000000004f830d in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x56c2a0,
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x4de5ce <testing::internal::UnitTestImpl::RunAllTests()>,
    location=0x513b58 "auxiliary test code (environments or event listeners)") at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:2437
#26 0x00000000004f2837 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x56c2a0,
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x4de5ce <testing::internal::UnitTestImpl::RunAllTests()>,
    location=0x513b58 "auxiliary test code (environments or event listeners)") at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:2473
#27 0x00000000004dd548 in testing::UnitTest::Run (this=0x559f30 <testing::UnitTest::GetInstance()::instance>)
    at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest.cc:4740
#28 0x00000000004ccac6 in RUN_ALL_TESTS () at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/include/gtest/gtest.h:2333
#29 0x00000000004cca54 in main (argc=1, argv=0x7fffffff2ad8) at /tmp/vsc40023/easybuild_build/Ginkgo/1.3.0/GCC-10.2.0/easybuild_obj/third_party/gtest/src/googletest/src/gtest_main.cc:36

upsj commented 3 years ago

That is great information! The valgrind warnings should be false positives, but it looks like we are having issues with concurrent allocations, though I am not 100% sure why that would lead to a crash. Are your gcc and glibc built from source or packaged? Any special configuration options? With that information, I should hopefully be able to reproduce your build environment completely.

Oh, and could you maybe check if the same issue pops up with our develop branch? I somehow suspect this issue might still be present in our current code base.

upsj commented 3 years ago

Now that I think about it some more, as long as we use the same version of glibc, pthreads, libgomp etc., it should be enough if you send me the shared libraries and test executable.

boegel commented 3 years ago

@upsj See attached tarball, compiled for CentOS 7 and Intel Cascade Lake:

$ tar xfvz ginkgo_issue732_cascadelake_jacobi_kernels.tar.gz
$ export LD_LIBRARY_PATH=$PWD/ginkgo_issue732_cascadelake_jacobi_kernels
$ ginkgo_issue732_cascadelake_jacobi_kernels/jacobi_kernels
[==========] Running 39 tests from 1 test case.
...
[ RUN      ] Jacobi.OmpPreconditionerEquivalentToRefWithMPW
Segmentation fault

ginkgo_issue732_cascadelake_jacobi_kernels.tar.gz

upsj commented 3 years ago

@boegel Thanks, I am able to reproduce the issue now. Would it be possible to send me Debug binaries without -march=native? It looks like something is randomly breaking internal data structures, and -march=native doesn't play nice with older valgrind.

boegel commented 3 years ago

> Are your gcc and glibc built from source or packaged? Any special configuration options?

GCC 10.2 is built from source, and it is not exactly a standard build. The full build procedure is defined by the Python "script" in https://github.com/easybuilders/easybuild-easyblocks/blob/develop/easybuild/easyblocks/g/gcc.py; it also involves building with ISL, CLooG, and OpenMP offload support.

> Oh, and could you maybe check if the same issue pops up with our develop branch? I somehow suspect this issue might still be present in our current code base.

Yes, will do.

> Would it be possible to send me Debug binaries without -march=native?

Yes, no problem, I'll look into that.

upsj commented 3 years ago

Oh, in that case, maybe it is easier if I just use EasyBuild to compile gcc and see if I run into the same issue.

boegel commented 3 years ago

@upsj That should be... easy, yes. ;)

You should be able to reproduce the problem using this easyconfig file:

easyblock = 'CMakeNinja'

name = 'Ginkgo'
version = '1.3.0'

homepage = 'https://ginkgo-project.github.io'
description = """Ginkgo is a high-performance linear algebra library for manycore systems, with a focus on
 sparse solution of linear systems."""

toolchain = {'name': 'GCC', 'version': '10.2.0'}
toolchainopts = {'debug': True, 'noopt': True}

source_urls = ['https://github.com/ginkgo-project/ginkgo/archive/refs/tags']
sources = ['v%(version)s.tar.gz']
checksums = ['1b0e907b4046cdf7cef16d1730c12ba812b38f2764f49f74f454239a27f63596']

builddependencies = [
    ('CMake', '3.18.4'),
    ('Ninja', '1.10.1'),
]

buildopts = " && ninja test"

sanity_check_paths = {
    'files': [],
    'dirs': [],
}

moduleclass = 'lib'

The sanity_check_paths is empty, which is not correct, but since the tests fail it'll never get to the point where it'll complain about it being empty.

Short instructions:

# install EasyBuild (feel free to adjust as needed with `--user`, `--prefix` or installing in a virtualenv)
pip3 install easybuild
# install Ginkgo and all dependencies (incl. GCC 10.2.0)
eb Ginkgo.eb --robot

(where Ginkgo.eb is a local easyconfig file with the contents shown above)

If you need help, you know where to find me (EasyBuild Slack).

boegel commented 3 years ago

The tarball with the debug binary/libraries (built with -g -O0) is too big for GitHub, so I've uploaded it here: https://users.ugent.be/~kehoste/ginkgo_issue732_cascadelake_jacobi_kernels_debug.tar.gz .

Same issue with current develop branch (commit 94e2361), and actually a couple more failing tests:

The following tests FAILED:
     63 - omp/test/matrix/csr_kernels (SEGFAULT)
     70 - omp/test/preconditioner/jacobi_kernels (SEGFAULT)
     72 - omp/test/reorder/rcm_kernels (SEGFAULT)
    102 - core/test/base/executor (Failed)

upsj commented 3 years ago

I am zeroing in closer and closer on this bug, but it seems to be something really fundamental and weird: The reference counters in std::shared_ptr become zero too early, leading to a use-after-free in OmpExecutor or std::shared_ptr's _M_refcount.

A minimal reproducer is

int main()
{
    auto omp = gko::OmpExecutor::create();
    gko::kernels::omp::jacobi::generate(omp);
}

where generate in omp/preconditioner/jacobi_kernels.cpp is reduced to

void generate(std::shared_ptr<const OmpExecutor> exec)
{
#pragma omp parallel for
    for (size_type g = 0; g < 100; g++) {
        auto other = exec;
    }
}

Note that this behavior only occurs with OMP_NUM_THREADS larger than 1 and using GCC 10.2 built from source with EasyBuild.

I will try to reduce the input some more, but it is starting to sound more and more like a compiler bug.
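
For context on why this points at the toolchain: copying one std::shared_ptr from several threads is expected to be safe, because the reference count in the control block is supposed to be updated atomically. The sketch below illustrates that expected invariant with std::thread instead of OpenMP. The function name and parameters are hypothetical and not part of Ginkgo; it is only meant to show what a healthy build should do, not to reproduce the failure.

```cpp
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper (not part of Ginkgo): copies one shared_ptr from
// several threads at once and returns the reference count after all
// threads have been joined.
long use_count_after_concurrent_copies(int num_threads, int copies_per_thread)
{
    auto ptr = std::make_shared<int>(4);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        // ptr is captured by reference, so the only additional owners are
        // the short-lived copies made inside the loop.
        workers.emplace_back([&ptr, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                auto other = ptr;  // count + 1 here, count - 1 at end of scope
                (void)other;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    // Every temporary copy is gone, so only ptr itself should remain.
    return ptr.use_count();
}
```

On a correctly working toolchain, calling use_count_after_concurrent_copies(4, 1000) should return 1. A counter that reaches zero while ptr is still alive would free the control block early, which matches the use-after-free described above.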

boegel commented 3 years ago

FWIW: The problem is still there in GCC 10.2.1 (first release candidate for GCC 10.3), so if you can confirm this as a compiler bug, it may be worth opening a bug report with GCC?

boegel commented 3 years ago

Another small bit of info: I've built GCC 10.2.0 without OpenMP offload support (withnvptx = False), and the segfault problem stays...

upsj commented 3 years ago

Just to have everything documented here: I have reduced the issue to a reproducer that is independent of Ginkgo.

testlib.cpp

#include <memory>

void foo(std::shared_ptr<int> f)
{
#pragma omp parallel for
    for (size_t g = 0; g < 100; g++) {
        auto other = f;
    }
}

tester.cpp

#include <memory>

void foo(std::shared_ptr<int> f);

int main() {
        foo(std::make_shared<int>(4));
}

Compilation

g++ -g -o tester.cpp.o -c tester.cpp
g++ -g -fPIC -fopenmp -o testlib.cpp.o -c testlib.cpp
g++ -fPIC -g -shared -o libtestlibd.so testlib.cpp.o -lgomp -lpthread
g++ -g tester.cpp.o -o tester -Wl,-rpath,`pwd` libtestlibd.so

The execution sporadically crashes with various memory-related errors (pure virtual function called, corrupted double-linked list, segfault, ...) on GCC 10.2 built using EasyBuild.
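
Because the crash is sporadic, it can help to turn the reproducer into an explicit check. The following is a sketch of one possible variant of foo, assuming the same compile flags as above; the function name is hypothetical. It compares the reference count before and after the parallel loop instead of waiting for a crash. On an affected build it may still crash, or it may report a mismatch.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical variant of foo(): returns true if the reference count of f
// is the same before and after the parallel loop.
bool refcount_survives_parallel_copies(std::shared_ptr<int> f)
{
    const long before = f.use_count();
#pragma omp parallel for
    for (std::size_t g = 0; g < 100; g++) {
        auto other = f;  // each iteration copies and then destroys a shared_ptr
        (void)other;
    }
    // The parallel region ends with an implicit barrier, so every copy made
    // in the loop has been destroyed by the time this line runs.
    return f.use_count() == before;
}
```

Without -fopenmp the pragma is ignored and the loop runs serially, so the check should trivially pass; the interesting case is a build with -fopenmp and OMP_NUM_THREADS larger than 1.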