sarats closed this issue 1 year ago
FWIW, this issue wasn't present with rocm 5.4.0.
Can you add a reference to the version of the code that you are trying to compile (branch/master/...)? Is ftn (with these modules) able to compile simple hello-world programs?
This branch: https://github.com/E3SM-Project/E3SM/tree/sarats/machines/frontier. E3SM tests build and run with just one module change (rocm 5.4.0).
The linking issue is present with a simple standalone program as well. Will report to OLCF/HPE.
Hi, did you get a resolution to this issue from OLCF/HPE?
@PaulMullowney FYI, two workarounds from Cray/HPE. I used the first method.
AMD started building with GCC 12.2.0, which brings in a GLIBCXX symbol that isn't in CCE's default GCC toolchain.
1. `-Wl,--allow-shlib-undefined`. This will let the link go through (and is the default lld behavior for cc/CC), and then at runtime the newer version will get picked up because of some default gcc runtime paths on Frontier.
2. `GCC_X86_64=/opt/cray/pe/gcc/12.2.0/snos`
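For readers who want to try the two workarounds, here is a minimal sketch of how they might be applied from a shell. It only assembles and prints the commands rather than running them. The file names `hello.f90` and `hello` are hypothetical, and treating `GCC_X86_64` as an environment variable exported before the build is an assumption, since the thread gives only the assignment itself.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands instead of executing them.
# hello.f90 and hello are hypothetical file names.

# Workaround 1: pass --allow-shlib-undefined through to the linker so that
# undefined symbols inside shared libraries do not fail the link.
link_flags="-Wl,--allow-shlib-undefined"
link_cmd="ftn hello.f90 -o hello ${link_flags}"
echo "workaround 1: ${link_cmd}"

# Workaround 2 (assumption: applied as an exported environment variable
# before compiling and linking).
gcc_x86_64="/opt/cray/pe/gcc/12.2.0/snos"
export_cmd="export GCC_X86_64=${gcc_x86_64}"
echo "workaround 2: ${export_cmd}"
```

Use one workaround at a time so it is clear which one resolved the link error.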
@sarats Thanks for the pointers! The 2nd solution was easy to test and worked for me. Implementing the first will require a little more work.
Evidently, the new ROCM module would try to link in libhsa-runtime64.so.1, which has an undefined reference to `std::condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30`.
What's the best way to handle this? Add -l stdc++ to FFLAGS or something else? What's a good way to just pass this for Scorpio build in CIME?
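One way to see which libstdc++ lacks the symbol version named above is to search the library file for the version string. The helper below is a rough sketch: `has_symbol_version` is a hypothetical name, the default library path is an assumption, and the check is a plain substring search, so it reports that the string occurs in the file, not that the symbol is defined there.

```shell
#!/usr/bin/env bash
# has_symbol_version FILE VERSION
# Returns 0 if VERSION (e.g. GLIBCXX_3.4.30) occurs anywhere in FILE.
# grep -a reads the binary as text, -F matches literally, -q prints nothing.
# This is a substring match, so treat the result as a quick hint only.
has_symbol_version() {
    local lib="$1" version="$2"
    grep -aqF -- "$version" "$lib"
}

# Example use. The default path is an assumption; set LIBSTDCXX to the
# libstdc++.so.6 that your toolchain actually links against.
lib="${LIBSTDCXX:-/usr/lib64/libstdc++.so.6}"
if [ -r "$lib" ]; then
    if has_symbol_version "$lib" "GLIBCXX_3.4.30"; then
        echo "$lib contains GLIBCXX_3.4.30"
    else
        echo "$lib does not contain GLIBCXX_3.4.30"
    fi
fi
```

Running it against both the default toolchain's libstdc++ and the one under the GCC 12.2.0 install should show which of them carries the newer version string.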