Open Quuxplusone opened 11 years ago
Attached polynomials.cpp
(4247 bytes, text/plain): Original source code
Attached polynomials-34EC0D.sh
(676 bytes, text/plain): CLANG-generated script
Some observations:
1) The crash is because the recursive template instantiation is blowing out the stack. (This happens more for constexpr functions because they need to be instantiated more eagerly, in case they are needed to evaluate a constant expression.) Try setting a higher stack ulimit. On my system, 8MB is not enough, but 16MB is.
2) The definition of LegendreDot is invalid, because it's marked as constexpr but calls pow, which is not (required to be) a constexpr function.
3) You've marked a bunch of functions as 'constexpr', but they won't be called at compile time (by either g++ or clang), because you're not calling them in a context where constant evaluation is performed. Removing these redundant keywords will also reduce the stack usage and make the crash go away.
4) Each of your algorithms is O(fib(N)).
Thank you, Richard.
It is very helpful to know how to increase the stack size (I had looked for such instruction before, but I may have overlooked it).
Thank you also for pointing out the correct use of constexpr. I will remove such calls, increase the stack size, and try again.
Finally, there are many (more efficient) ways to compute polynomials for large N, but I just wanted to "stress" the compiler to see what I could get away with while still retaining the optimization and (possibly) reduce the round-off error.
Cheers,
Juan.
polynomials-34EC0D.sh
(676 bytes, text/plain)polynomials.cpp
(4247 bytes, text/plain)This program uses a template-based definition of some common polynomials, and them attempts to evaluate them at large orders. While it may seem artificial to use a 200th-order polynomial, it is quite common in cases like geopotential evaluation using spherical harmonics.
There are other means to evaluate the geopotential function, but this program aims to test whether the compiler optimizations made available by means of templates can expedite the computations and reduce roundoff errors.
This program uses the
constexpr
specifier (which marks a function as a compile-time target). For this reason, only a compiler with C++11 support will be able to compile.The program runs extremely slowly when compiled with g++ with N=100. It runs great when compiled with clang++ up to N~170. Compilation crashes in clang++ at N=192. The last successful compilation looks as follows:
Attempting the next compilation:
yields the following result:
clang: error: unable to execute command: Segmentation fault: 11 clang: error: clang frontend command failed due to signal (use -v to see invocation) clang version 3.2 (tags/RELEASE_32/final) Target: x86_64-apple-darwin11.4.2 Thread model: posix clang: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script. clang: note: diagnostic msg:
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: Preprocessed source(s) and associated run script(s) are located at: clang: note: diagnostic msg: /var/folders/zz/zyxvpxvq6csfxvn_n0001dkm000bcn/T/polynomials-zXGdTS.cpp clang: note: diagnostic msg: /var/folders/zz/zyxvpxvq6csfxvn_n0001dkm000bcn/T/polynomials-zXGdTS.sh clang: note: diagnostic msg:
clang++ version information: clang version 3.2 (tags/RELEASE_32/final) Target: x86_64-apple-darwin11.4.2 Thread model: posix
g++ version information: Using built-in specs. COLLECT_GCC=/usr/local/bin/g++ COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-apple-darwin11.4.2/4.7.2/lto-wrapper Target: x86_64-apple-darwin11.4.2 Configured with: ./configure Thread model: posix gcc version 4.7.2 (GCC)
System information: OS X 10.7.4 Processor: 2.7 GHz Intel Core i7 Memory: 16GB 1600 MHz DDR3