llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.4k stars 11.74k forks source link

Exhaustive stack usage #43598

Open apolukhin opened 4 years ago

apolukhin commented 4 years ago
Bugzilla Link 44253
Version trunk
OS Linux
CC @apolukhin,@davidbolvansky,@dwblaikie,@efriedma-quic,@francisvm,@zygoloid,@rjmccall

Extended Description

Consider the following example:

struct test {
    test(test&&);

    void Extend(test v);
    int data_[4096];
};

test test_func(test t) {
    t.Extend(static_cast<test&&>(t));
    t.Extend(static_cast<test&&>(t));
    return t;
}

With -std=c++11 -O2 flags Clang uses 32768 bytes of stack, GCC uses 16392 bytes.

Note that adding additional t.Extend(static_cast<test&&>(t)); results in stack usage growth for Clang, while GCC keeps using only 16392 bytes.

Godbolt playground: https://godbolt.org/z/MdaNfa

francisvm commented 4 years ago

Sorry about the delay. I had a quick patch but it seems like Erik addressed this in https://reviews.llvm.org/D74094. I see that this has been reverted due to a ppc failure. I locally re-applied the patch and I see the function using only 16424 bytes.

davidbolvansky commented 4 years ago

Seems like Francis has a patch?

apolukhin commented 4 years ago

The issue affects users of the stackful coroutines by adding very significant memory overhead (each of the thousands coroutines becomes multiple times bigger).

We'd appreciate a fix for 10.0.

rjmccall commented 4 years ago

That should be a fairly localized change that doesn't need a solution to the more general issue with temporaries. Did you have a patch for it, or did you just try adding lifetime markers to the IR?

francisvm commented 4 years ago

Adding lifetime markers for the temp from EmitAnyExprToTemp(E) in EmitCallArg brings the stack size to 16424.

rjmccall commented 4 years ago

Yeah, I think it's a known issue that we don't emit them for temporaries.

efriedma-quic commented 4 years ago

Yes, lifetime markers are missing.

nickdesaulniers commented 1 year ago

I've revived https://reviews.llvm.org/D74094; let's see what issues with that approach may persist, if any.

nickdesaulniers commented 1 year ago

relanded as https://reviews.llvm.org/rGe698695fbbf62e6676f8907665187f2d2c4d814b in clang-18.

nickdesaulniers commented 1 year ago

Reverted: https://reviews.llvm.org/D159346

https://reviews.llvm.org/D74094#4633785

nickdesaulniers commented 1 year ago

The latest C23 draft § 6.2.4 Storage durations of objects

8 A non-lvalue expression with structure or union type, where the structure or union contains a member with array type (including, recursively, members of all contained structures and unions) refers to an object with automatic storage duration and temporary lifetime. 39) Its lifetime begins when the expression is evaluated and its initial value is the value of the expression. Its lifetime ends when the evaluation of the containing full expression ends. Any attempt to modify an object with temporary lifetime results in undefined behavior. An object with temporary lifetime behaves as if it were declared with the type of its value for the purposes of effective type. Such an object need not have a unique address.

Speaking more with @AaronBallman , Aaron found https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1285.htm introduced the current wording into c11. https://wiki.sei.cmu.edu/confluence/display/c/EXP35-C.+Do+not+modify+objects+with+temporary+lifetime talks more about this.

I'm kind of hung up on the specific wording: where the structure or union contains a member with array type. Why would that matter if the aggregate contains an array or not?

EDIT: sorry, I should file a new bug about C. https://web.archive.org/web/20231010182056/https://yarchive.net/comp/struct_return.html talks more about this in regards to arrays within the returned struct.

cramertj commented 1 month ago

The issue affects users of the stackful coroutines by adding very significant memory overhead (each of the thousands coroutines becomes multiple times bigger).

I noticed this today when using coroutines. This function:

Coro<Status> SimplestCoroWithAwait(CoroContext& cx) {
  co_return co_await SimplestCoro(cx);
}

allocates 88 bytes for storage of its stack, but this one:

Coro<Status> SimplestCoroWithTwoAwaits(CoroContext& cx) {
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_await SimplestCoro(cx);
  co_return co_await SimplestCoro(cx);
}

allocates 408 bytes for its coroutine stack.

In 2019 a similar issue was fixed in the Rust compiler resulting in significant space savings: https://github.com/rust-lang/rust/pull/60187

I'm interested in helping contribute to a fix here. Is https://reviews.llvm.org/D74094 the latest discussion on this?

AaronBallman commented 3 weeks ago

I'm interested in helping contribute to a fix here. Is https://reviews.llvm.org/D74094 the latest discussion on this?

That's the latest discussion I'm aware of.