llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.07k stars 11.99k forks source link

only evaluate values of non-type template arguments once #12925

Open llvmbot opened 12 years ago

llvmbot commented 12 years ago
Bugzilla Link 12553
Version trunk
OS MacOS X
Attachments metaprogramming torture test to compute prime numbers
Reporter LLVM Bugzilla Contributor
CC @AlisdairM,@asl,@DougGregor,@zygoloid

Extended Description

The attached program is one of my recreational C++11 programs, it tries to compute a large array of prime numbers entirely at compile time, and has been optimized to produce the largest number possible in the shortest compile-time, minimizing compiler resources.

As currently configured, it compiles on gcc 4.7 in approximately one second, but with my current version of Clang in just over 4 minutes. Configuration is essentially one constant towards the end of the file that determines the maximum range of numbers to search. The limiting factor with GCC is to avoid swap-file thrashing, which allows for growing by a factor of another 6 powers of 2.

The Clang version is clearly CPU bound, using only around 50Mb RAM in the attached configuration and not coming close to swapping to page file.

Clang configuration: clang version 3.1 (trunk 153754) Target: x86_64-apple-darwin11.3.0 Thread model: posix

Running on OSX Lion.

e84ee981-20b7-441e-8de0-adb2190144d6 commented 4 years ago

Tried this again today, and the gcc performance advantage is still present, but much lower - although I had to reduce the test bound from 2 << 14 to 2 << 10 to sit within the default recursion limits in the current compiler. I think this ticket can be closed as resolved, as we are no longer looking at multiple orders of magnitude difference.

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 12 years ago

The problem is that count_primes(16384) is being evaluated a large number of times. We are apparently not storing the evaluated value for a non-type template argument, and are instead re-evaluating it each time we need it.

(240x slower doesn't sound so bad when we're doing 16384x as much work...)

With this patch (and the include of and main removed) clang -fsyntax-only takes 47% less time to handle this code than g++ -fsyntax-only on my machine:

@@ -227,11 +227,12 @@

 template<unsigned N>
 struct PrimeArrayReversed {
-   PrimeArrayReverseImpl< typename GeneratePrimePack<count_primes(N)>::type> const VALUE;
+   static constexpr unsigned Primes = count_primes(N);
+   PrimeArrayReverseImpl< typename GeneratePrimePack<Primes>::type> const VALUE;

    constexpr PrimeArrayReversed() : VALUE{} {}

-   constexpr unsigned operator[](unsigned index) { return VALUE.d_data[count_primes(N)-index]; }
+   constexpr unsigned operator[](unsigned index) { return VALUE.d_data[Primes-index]; }
 };

 #define UNREVERSE_ARRAY
@@ -248,14 +249,14 @@

 template<unsigned N>
 struct PrimeArray {
-   PrimeArrayImpl< N, typename GenerateLogPack<count_primes(N)>::type > const VALUE;
+   PrimeArrayImpl< N, typename GenerateLogPack<PrimeArrayReversed<N>::Primes>::type > const VALUE;

    constexpr PrimeArray() : VALUE(PrimeArrayReversed<N>{}) {}
 };
 #else
 template<unsigned N>
 struct PrimeArray {
-   PrimeArrayReverseImpl<typename GeneratePrimePack<count_primes(N)>::type> const VALUE;
+   PrimeArrayReverseImpl<typename GeneratePrimePack<PrimeArrayReversed<N>::Primes>::type> const VALUE;

    constexpr PrimeArray() : VALUE{} {}
 };
llvmbot commented 1 year ago

@llvm/issue-subscribers-clang-frontend