EnzymeAD / Enzyme

High-performance automatic differentiation of LLVM and MLIR.
https://enzyme.mit.edu

`std::vector::push_back()` causes segmentation fault in Enzyme #1910

Closed ipcamit closed 1 week ago

ipcamit commented 3 months ago

When I have the following code in my function, Enzyme segfaults (stack trace below):

        std::vector<int> species_j_indices;
        for (int i = 0; i < number_of_neighbors; i++) {
            if (species[neighbor_lists[i]] == zj) {
                species_j_indices.push_back(i);
                n_neighbors_zj++;
            }
        }

Whereas the following workaround, which does not use `std::vector::push_back()`, works fine:

        std::vector<int> species_j_indices(MAX_NEIGHBORS, -1);
        int i_neigh_j = 0;
        for (int i = 0; i < number_of_neighbors; i++) {
            if (species[neighbor_lists[i]] == zj) {
                species_j_indices[i_neigh_j] = i;
                n_neighbors_zj++;
                i_neigh_j++;
            }
        }
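
For anyone who wants to check outside of Enzyme that the two variants select the same neighbors, here is a minimal sketch. It is not the actual reproducer: the function names `collect_with_push_back` and `collect_preallocated`, the `std::vector<int>` parameter types, and the value of `MAX_NEIGHBORS` are assumptions made for illustration, and the `n_neighbors_zj` counter is replaced by the size of the returned vector.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bound: the snippets above use MAX_NEIGHBORS without showing its value.
constexpr int MAX_NEIGHBORS = 64;

// Variant 1: grows the vector with push_back (the pattern that crashed under LTO).
std::vector<int> collect_with_push_back(const std::vector<int>& species,
                                        const std::vector<int>& neighbor_lists,
                                        int zj) {
    std::vector<int> species_j_indices;
    const int number_of_neighbors = static_cast<int>(neighbor_lists.size());
    for (int i = 0; i < number_of_neighbors; i++) {
        if (species[neighbor_lists[i]] == zj) {
            species_j_indices.push_back(i);
        }
    }
    return species_j_indices;
}

// Variant 2: the workaround, writing into a preallocated buffer padded with -1.
// The buffer is truncated to the used prefix so both variants can be compared.
std::vector<int> collect_preallocated(const std::vector<int>& species,
                                      const std::vector<int>& neighbor_lists,
                                      int zj) {
    std::vector<int> species_j_indices(MAX_NEIGHBORS, -1);
    int i_neigh_j = 0;
    const int number_of_neighbors = static_cast<int>(neighbor_lists.size());
    for (int i = 0; i < number_of_neighbors; i++) {
        if (species[neighbor_lists[i]] == zj) {
            // The fixed-size buffer silently assumes at most MAX_NEIGHBORS matches.
            assert(i_neigh_j < MAX_NEIGHBORS);
            species_j_indices[i_neigh_j] = i;
            i_neigh_j++;
        }
    }
    species_j_indices.resize(i_neigh_j);
    return species_j_indices;
}
```

Calling both functions with the same `species`, `neighbor_lists`, and `zj` should return identical index vectors. The main trade-off of the workaround is the hard upper bound on the number of matching neighbors, which the `assert` makes explicit.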

Also, I see a lot of activity on C++ sugar for Enzyme, but the branch seems to have been deleted. Is there somewhere I can start? Is there any way I can help (documentation, examples, etc.)?

Stack trace of the segfault:

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /opt/llvm_13/clang_13_prebuilt/bin/ld.lld -z relro --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -shared -o bin/libdescriptor.so /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/opt/llvm_13/clang_13_prebuilt/bin/../lib -L/lib -L/usr/lib -plugin-opt=mcpu=x86-64 -plugin-opt=O3 obj/Xi.o obj/Bispectrum.o obj/Descriptors.o obj/helper.o obj/SOAP.o obj/SymmetryFunctions.o obj/maths/spherical_harmonics.o obj/maths/gamma.o obj/maths/clebsh_gordon.o obj/maths/bessel_functions.o obj/maths/radial_basis_functions.o obj/maths/precomputed_bessel_zeros.o obj/maths/gl_quad.o --lto-legacy-pass-manager -mllvm=-load=/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so -mllvm=-enzyme-loose-types -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o /lib/x86_64-linux-gnu/crtn.o
1.      Running pass 'Enzyme Pass' on module 'ld-temp.o'.
 #0 0x0000000002576d63 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x2576d63)
 #1 0x0000000002574d4e llvm::sys::RunSignalHandlers() (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x2574d4e)
 #2 0x000000000257734f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007fc1d8efe420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x0000000004d4d7d3 isGuaranteedNotToBeUndefOrPoison(llvm::Value const*, llvm::AssumptionCache*, llvm::Instruction const*, llvm::DominatorTree const*, unsigned int, bool) ValueTracking.cpp:0:0
 #5 0x0000000004d4d612 isGuaranteedNotToBeUndefOrPoison(llvm::Value const*, llvm::AssumptionCache*, llvm::Instruction const*, llvm::DominatorTree const*, unsigned int, bool) ValueTracking.cpp:0:0
 #6 0x000000000454ee86 runImpl(llvm::Function&, llvm::LazyValueInfo*, llvm::DominatorTree*, llvm::SimplifyQuery const&) CorrelatedValuePropagation.cpp:0:0
 #7 0x000000000454c954 llvm::CorrelatedValuePropagationPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x454c954)
 #8 0x00007fc1d856bd49 PreProcessCache::optimizeIntermediate(llvm::Function*) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xa6bd49)
 #9 0x00007fc1d84ae5ad EnzymeLogic::CreatePrimalAndGradient(RequestContext, ReverseCacheKey const&&, TypeAnalysis&, AugmentedReturn const*, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0x9ae5ad)
#10 0x00007fc1d8602847 GradientUtils::GetOrCreateShadowFunction(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Function*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb02847)
#11 0x00007fc1d86003d8 GradientUtils::GetOrCreateShadowConstant(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Constant*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb003d8)
#12 0x00007fc1d860048c GradientUtils::GetOrCreateShadowConstant(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Constant*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb0048c)
#13 0x00007fc1d860000a GradientUtils::GetOrCreateShadowConstant(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Constant*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb0000a)
#14 0x00007fc1d8600185 GradientUtils::GetOrCreateShadowConstant(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Constant*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb00185)
#15 0x00007fc1d8600e4c GradientUtils::GetOrCreateShadowConstant(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Constant*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb00e4c)
#16 0x00007fc1d860048c GradientUtils::GetOrCreateShadowConstant(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Constant*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb0048c)
#17 0x00007fc1d860048c GradientUtils::GetOrCreateShadowConstant(RequestContext, EnzymeLogic&, llvm::TargetLibraryInfo&, TypeAnalysis&, llvm::Constant*, DerivativeMode, unsigned int, bool) (/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so+0xb0048c)
#18 0x00007fc1d843290d (anonymous namespace)::EnzymeBase::lowerEnzymeCalls(llvm::Function&, std::set<llvm::Function*, std::less<llvm::Function*>, std::allocator<llvm::Function*> >&) Enzyme.cpp:0:0
#19 0x00007fc1d842d744 (anonymous namespace)::EnzymeBase::run(llvm::Module&) Enzyme.cpp:0:0
#20 0x00007fc1d842ce21 (anonymous namespace)::EnzymeOldPM::runOnModule(llvm::Module&) Enzyme.cpp:0:0
#21 0x000000000507a204 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x507a204)
#22 0x0000000003c48845 llvm::lto::opt(llvm::lto::Config const&, llvm::TargetMachine*, unsigned int, llvm::Module&, bool, llvm::ModuleSummaryIndex*, llvm::ModuleSummaryIndex const*, std::vector<unsigned char, std::allocator<unsigned char> > const&) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x3c48845)
#23 0x0000000003c497ad llvm::lto::backend(llvm::lto::Config const&, std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x3c497ad)
#24 0x0000000003c3cc09 llvm::lto::LTO::runRegularLTO(std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)>) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x3c3cc09)
#25 0x0000000003c3c492 llvm::lto::LTO::run(std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)>, std::function<std::function<std::unique_ptr<llvm::lto::NativeObjectStream, std::default_delete<llvm::lto::NativeObjectStream> > (unsigned int)> (unsigned int, llvm::StringRef)>) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x3c3c492)
#26 0x0000000002700d54 lld::elf::BitcodeCompiler::compile() (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x2700d54)
#27 0x000000000266e816 void lld::elf::LinkerDriver::compileBitcodeFiles<llvm::object::ELFType<(llvm::support::endianness)1, true> >() (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x266e816)
#28 0x000000000265a07d void lld::elf::LinkerDriver::link<llvm::object::ELFType<(llvm::support::endianness)1, true> >(llvm::opt::InputArgList&) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x265a07d)
#29 0x000000000264c818 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x264c818)
#30 0x0000000002649ffb lld::elf::link(llvm::ArrayRef<char const*>, bool, llvm::raw_ostream&, llvm::raw_ostream&) (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x2649ffb)
#31 0x00000000024d12c1 lldMain(int, char const**, llvm::raw_ostream&, llvm::raw_ostream&, bool) lld.cpp:0:0
#32 0x00000000024d0b94 main (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x24d0b94)
#33 0x00007fc1d8972083 __libc_start_main /build/glibc-e2p3jK/glibc-2.31/csu/../csu/libc-start.c:342:3
#34 0x00000000024d071e _start (/opt/llvm_13/clang_13_prebuilt/bin/ld.lld+0x24d071e)
clang-13: error: unable to execute command: Segmentation fault (core dumped)
clang-13: error: linker command failed due to signal (use -v to see invocation)
make: *** [Makefile:27: bin/libdescriptor.so] Error 254

wsmoses commented 3 months ago

Mind posting a full code file that fails (or, better, a link on enzyme.mit.edu/explorer)?

This should be a quick fix, but ideally we can test whether it now works (and add it to the test suite).

ipcamit commented 3 months ago

Here is the link to a minimal example: https://fwd.gymni.ch/sj7D3L

I am not sure how to enable LTO in the Enzyme explorer, but it fails when I run:

clang++  -flto -std=c++17 -O3 -shared -fpic -fuse-ld=lld -flto  Descriptors.cpp  -Wl,--lto-legacy-pass-manager -Wl,-mllvm=-load=/opt/enzyme/enzyme/build/Enzyme/LLDEnzyme-13.so -Wl,-mllvm=-enzyme-loose-types

If I run in compiler mode, it works fine. I have also included the version that compiles fine.

wsmoses commented 1 week ago

@ipcamit I can't reproduce this locally on current main.

Can you check if it still fails for you?

ipcamit commented 1 week ago

With the latest main branch (commit #37aa378) this problem no longer exists. Great! I will close this issue.