Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Segmentation fault when compiling function template with deep recursion #15560

Open Quuxplusone opened 11 years ago

Quuxplusone commented 11 years ago
Bugzilla Link PR15560
Status NEW
Importance P normal
Reported by Juan Arrieta (Juan.Arrieta@jpl.nasa.gov)
Reported on 2013-03-21 11:17:29 -0700
Last modified on 2013-03-21 16:26:46 -0700
Version 3.2
Hardware Macintosh MacOS X
CC dgregor@apple.com, fang@csl.cornell.edu, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
Fixed by commit(s)
Attachments polynomials-34EC0D.sh (676 bytes, text/plain)
polynomials.cpp (4247 bytes, text/plain)
Blocks
Blocked by
See also

This program uses a template-based definition of some common polynomials, and them attempts to evaluate them at large orders. While it may seem artificial to use a 200th-order polynomial, it is quite common in cases like geopotential evaluation using spherical harmonics.

There are other means to evaluate the geopotential function, but this program aims to test whether the compiler optimizations made available by means of templates can expedite the computations and reduce roundoff errors.

This program uses the constexpr specifier (which marks a function as a compile-time target). For this reason, only a compiler with C++11 support will be able to compile.

The program runs extremely slowly when compiled with g++ with N=100. It runs great when compiled with clang++ up to N~170. Compilation crashes in clang++ at N=192. The last successful compilation looks as follows:

   clang++ polynomials.cpp -std=c++11 -O3 -DORDER=191   

Attempting the next compilation:

   clang++ polynomials.cpp -std=c++11 -O3 -DORDER=192

yields the following result:

clang: error: unable to execute command: Segmentation fault: 11 clang: error: clang frontend command failed due to signal (use -v to see invocation) clang version 3.2 (tags/RELEASE_32/final) Target: x86_64-apple-darwin11.4.2 Thread model: posix clang: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script. clang: note: diagnostic msg:


PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: Preprocessed source(s) and associated run script(s) are located at: clang: note: diagnostic msg: /var/folders/zz/zyxvpxvq6csfxvn_n0001dkm000bcn/T/polynomials-zXGdTS.cpp clang: note: diagnostic msg: /var/folders/zz/zyxvpxvq6csfxvn_n0001dkm000bcn/T/polynomials-zXGdTS.sh clang: note: diagnostic msg:


clang++ version information: clang version 3.2 (tags/RELEASE_32/final) Target: x86_64-apple-darwin11.4.2 Thread model: posix

g++ version information: Using built-in specs. COLLECT_GCC=/usr/local/bin/g++ COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-apple-darwin11.4.2/4.7.2/lto-wrapper Target: x86_64-apple-darwin11.4.2 Configured with: ./configure Thread model: posix gcc version 4.7.2 (GCC)

System information: OS X 10.7.4 Processor: 2.7 GHz Intel Core i7 Memory: 16GB 1600 MHz DDR3

Quuxplusone commented 11 years ago

Attached polynomials.cpp (4247 bytes, text/plain): Original source code

Quuxplusone commented 11 years ago

Attached polynomials-34EC0D.sh (676 bytes, text/plain): CLANG-generated script

Quuxplusone commented 11 years ago

Some observations:

1) The crash is because the recursive template instantiation is blowing out the stack. (This happens more for constexpr functions because they need to be instantiated more eagerly, in case they are needed to evaluate a constant expression.) Try setting a higher stack ulimit. On my system, 8MB is not enough, but 16MB is.

2) The definition of LegendreDot is invalid, because it's marked as constexpr but calls pow, which is not (required to be) a constexpr function.

3) You've marked a bunch of functions as 'constexpr', but they won't be called at compile time (by either g++ or clang), because you're not calling them in a context where constant evaluation is performed. Removing these redundant keywords will also reduce the stack usage and make the crash go away.

4) Each of your algorithms is O(fib(N)).

Quuxplusone commented 11 years ago

Thank you, Richard.

It is very helpful to know how to increase the stack size (I had looked for such instruction before, but I may have overlooked it).

Thank you also for pointing out the correct use of constexpr. I will remove such calls, increase the stack size, and try again.

Finally, there are many (more efficient) ways to compute polynomials for large N, but I just wanted to "stress" the compiler to see what I could get away with while still retaining the optimization and (possibly) reduce the round-off error.

Cheers,

Juan.