trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/
Other
1.2k stars 563 forks source link

Belos: Clang 14 (with gpus) doesn't like taking a reference to cout in a class variable #10535

Closed jjellio closed 2 years ago

jjellio commented 2 years ago

Bug Report

@trilinos/belos

Description

I've been trying a build of Trilinos that uses Clang 14.x with a gpu target.

This type of error is replicated in many of the solvers. e.g., https://github.com/trilinos/Trilinos/blob/0bd35b1d0298b79ab6e18619e61fa364c84a6267/packages/belos/src/BelosBlockGmresSolMgr.hpp#L340

packages/belos/src/BelosBlockGmresSolMgr.hpp:340:37: error: dynamic initialization is not supported for __device__, __constant__, __shared__, and __managed__ variables.
    static constexpr std::ostream * outputStream_default_ = &std::cout;
                                    ^                       ~~~~~~~~~~

I'm not sure what the best approach is to mitigate this. @brian-kelley any thoughts on a good replacement?

I don't think Sandia has a testbed with Clang 14.x to reproduce this on.

brian-kelley commented 2 years ago

@jjellio I think I've seen this before. I believe the issue is that the pointer is constexpr, but the value it's initialized to is a symbol address that's dependent on linking or shared library loading (probably what is meant by 'dynamic initialization' in the err msg). So maybe just replacing constexpr with const would fix it?

jjellio commented 2 years ago

No, that won't fix it.

I think, technically, it wants cout to be explicitly instantiated (similar to what you have to do if you want a const std::string in your class).

I'm not sure what the right pattern is to do this. Belos' solvers want a default so they can optionally use a user provided Ostream from ParameterList. It could work to remove the outputStream_default_ and have the constructor(s) take cout directly. The other way may be to change the explicit instantiation code to add the initialization of outputStream_default_ - I believe this is what the error is after - but it's not clear to me that is legal for GPU code. E.g., a class variable that is from the STL. std:cout is likely not const either, so this all seems a bit hacky

brian-kelley commented 2 years ago

Oh I get it, so the constexpr was really being used to 'inline' the definition of a static member?

Also, I now agree that it shouldn't need to be const, since it's only used as the default for Teuchos::RCP<std::ostream> outputStream_; (non-const).

In that case, I do think that replacing every usage of outputStream_default_ with &std::cout is a reasonable fix. I don't think that would ever change in the future, and even if it needed to it wouldn't be that hard to change the 3 occurences per file. Also lets you remove this hacky looking thing for windows:

#if defined(_WIN32) && defined(__clang__)
    static constexpr std::ostream * outputStream_default_ =
       __builtin_constant_p(reinterpret_cast<const std::ostream*>(&std::cout));
#else
iyamazaki commented 2 years ago

We've seen the same errors with ROCm 5.0 on Crusher, @lucbv.

jjellio commented 2 years ago

I can push a patch that seems to work. I've removed the default, and replaced the construct lines with std::cout (And changed the Teuchos::rcp, to a rcpFromRef).

Did you all work around it? If so, it'd be nice to know if you have a better solution.

lucbv commented 2 years ago

We probably just want to temporarily remove the constexpr while clang/llvm/amd fixes the bug reported here: https://reviews.llvm.org/D119615

lucbv commented 2 years ago

My fix is to remove constexpr everywhere but I agree with @jjellio that using const will not work either so I just leave it as static. The bug reported above makes constexpr imply __constant__ when compiling with a device-clang compiler which is a problem on its own but probably not sufficient to solve this particular issue.

jjellio commented 2 years ago

I tried moving them to const, but I think the compiler error is that it doesn't want inline dynamic initialization of a class variable. So, even if you dropped to const, you still really need to add a line in the ETI generated code that explicitly sets the variable to be std::cout (I think). I didn't test that, because hunting down the ETI generator is a pain and it isn't clear that would work if ETI is off.

The patch I've posted does seem to work, but I'll need help in digesting whether the Pull Request testing is working or not (if this is causing actual errors, we have to think about this more).