llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.18k stars 12.04k forks source link

[consteval] The mixed use of 'consteval' and 'constexpr' takes longer time to compile #62947

Open ChuanqiXu9 opened 1 year ago

ChuanqiXu9 commented 1 year ago

The reproducer:

// Fib.cpp
namespace Fibonacci
{
    constexpr unsigned long Recursive(unsigned long n)
    {
        if (n == 0)
            return 0;
        if (n == 1)
            return 1;
        return Recursive(n - 2) + Recursive(n - 1);
    }

    template<unsigned long N>
    struct Number{};

    struct DefaultStrategy
    {
        constexpr unsigned long operator()(unsigned long n, auto... other) const
        {
            return (n + ... + other);
        }
    };

    template<unsigned long N, typename Strategy = DefaultStrategy>
    constexpr unsigned long Cache = Compute(Number<N>{}, Strategy{});
}

namespace Fibonacci
{
    constexpr unsigned long Compute(Number<30ul>, auto strategy)
    {
        return strategy(Recursive(30ul));
    }

    template constexpr unsigned long Cache<30ul>;
}

Let's run:

time clang++ -std=c++20 -fconstexpr-steps=4294967295 -c  Fib.cpp

in my computer it shows:

real    0m8.070s
user    0m7.968s
sys 0m0.011s

But if I change Recursive into consteval, the result will be:

$ time clang++ -std=c++20 -fconstexpr-steps=4294967295 -c  Fib.cpp

real    0m16.145s
user    0m15.948s
sys 0m0.012s

it shows now it takes double times to compile the source.

llvmbot commented 1 year ago

@llvm/issue-subscribers-c-20

llvmbot commented 1 year ago

@llvm/issue-subscribers-clang-frontend

shafik commented 1 year ago

CC folks who might know where the main culprit is or may know how the determine it @erichkeane @zygoloid @tbaederr

erichkeane commented 1 year ago

I don't have much familiarity with the consteval code, @AaronBallman and @Fznamznon are the two who have been spending the most time doing that lately.

kaimfrai commented 1 year ago

I don't know if this is helpful, but I ran the example with -ftime-trace and put the .json into speedscope, and interestingly it's not that something is really slower but rather it's simply computed twice.

The constexpr case has

EvaluateAsInitializer {"detail":"Fibonacci::Cache"}
> Frontend
> ExecuteCompiler

The consteval case has

EvaluateAsConstantExpr {"detail":"<Fib.cpp:39:19, col:33>"}
> InstantiateFunction {"detail":"Fibonacci::Compute<Fibonacci::DefaultStrategy>"}
> Frontend
> ExecuteCompiler

as well as

 EvaluateAsConstantExpr {"detail":"<Fib.cpp:39:19, col:33>"}
> Frontend
> ExecuteCompiler

both taking roughly the same time as the single constexpr case. So really, only EvaluateAsConstantExpr is performed twice which doubles the compile time.

cor3ntin commented 1 year ago

Seems similar to #61425

AaronBallman commented 1 year ago

Agreed, this seems to be a duplicate of #61425.

zygoloid commented 1 year ago

I don't think this is a duplicate. #61425 would cover the repeated calls to Recursive not being cached like GCC does, but I don't think that's what this bug is about -- I think this bug is about an immediate invocation being evaluated twice. And that's a real bug -- an immediate invocation will eventually be able to have side effects, such as emitting messages at compile time or generating AST nodes, once more consteval primitives are added to the standard library, so we really need to only evaluate them once.

Presumably the issue here is that the constant evaluator is stepping into ConstantExprs rather than picking up and using their stored value.

AaronBallman commented 1 year ago

Oh, thank you for observing that! I've reopened the issue.

cor3ntin commented 1 year ago

@zygoloid Thanks!

RemoveNestedImmediateInvocation uses a TreeTransform. I think maybe this should be replaced with a recursive visitor of some sort? Why are we using a transform when all we want to do is to remove entries from ReferenceToConsteval? There TreeTranform does appear to get rid of all ConstantExpr nodes, so i suspect this is the source of the bug.

zygoloid commented 1 year ago

IIUC the TreeTransform is only used in cases like

consteval int id(int n) { return n; }
int x = id(id(2));

where we'd otherwise have two nested ConstantExpr nodes and want only one (because the inner one isn't a full-expression).

The way I think this was supposed to work is that we build the inner call and its ConstantExpr wrapper, then build the outer call and its ConstantExpr wrapper, then at the end of the full-expression if we have more than one ConstantExpr we do a TreeTransform to strip out all the nested ones, and finally compute and store the constant value for any that remain.