Closed ptheywood closed 9 months ago
Is it just complaining that it's templated, but the template doesn't affect the prototype (type of args/rval)?
No, it's a GCC ABI compatibility thing for C++17 on some platforms, of which aarch64 is the first one we've compiled for (i.e. not x86_64 or ppc64le).
An ABI incompatibility between C++14 and C++17 has been fixed. On some targets a class with a zero-sized subobject would be passed incorrectly when compiled as C++17 or C++20. See the C++ notes below for more details.
The ABI of passing and returning certain C++ classes by value changed on several targets in GCC 10, including AArch64, ARM, PowerPC ELFv2, S/390 and Itanium. These changes affect classes with a zero-sized subobject (an empty base class, or data member with the [[no_unique_address]] attribute) where all other non-static data members have the same type (this is called a "homogeneous aggregate" in some ABI specifications, or if there is only one such member, a "single element"). In -std=c++17 and -std=c++20 modes, classes with an empty base class were not considered to have a single element or to be a homogeneous aggregate, and so could be passed differently (in the wrong registers or at the wrong stack address). This could make code compiled with -std=c++17 and -std=c++14 ABI incompatible. This has been corrected and the empty bases are ignored in those ABI decisions, so functions compiled with -std=c++14 and -std=c++17 are now ABI compatible again. Example: struct empty {}; struct S : empty { float f; }; void f(S);. Similarly, in classes containing non-static data members with empty class types using the C++20 [[no_unique_address]] attribute, those members weren't ignored in the ABI argument passing decisions as they should be. Both of these ABI changes are now diagnosed with -Wpsabi.
#pragma GCC diagnostic ignored "-Wpsabi"
Fails to suppress this diagnostic note with GCC 11.4.1-2 (and 12.2.0) on aarch64, when placed around the function declaration, the template specialisation or the call site.
e.g.
// Suppress GCC >= 10.1 diagnostic due to ABI change in C++17 mode on aarch64 for parameter passing of std::pair<double, double>
#if defined(__GNUC__) && (( __GNUC__ == 10 && defined(__GNUC_MINOR__) && __GNUC_MINOR__ >= 1) || __GNUC__ > 10)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpsabi"
#endif
template<typename InT>
std::pair<double, double> HostAgentAPI::meanStandardDeviation(const std::string& variable) const {
    std::pair<double, double> rtn;
    meanStandardDeviation_async<InT>(variable, rtn, this->api.stream, this->api.streamId);
    gpuErrchk(cudaStreamSynchronize(this->api.stream));  // Redundant, meanStandardDeviation_async() is not truly async
    return rtn;
}
#if defined(__GNUC__) && (( __GNUC__ == 10 && defined(__GNUC_MINOR__) && __GNUC_MINOR__ >= 1) || __GNUC__ > 10)
#pragma GCC diagnostic pop
#endif
I'd rather not globally enable -Wno-psabi on aarch64 with GCC > 10.1 via CMake, as that might suppress actually important warnings in the future, but that seems like it might be the only option (or to remember to do this manually when building on aarch64).
As this is needed either in a header or in the per-model source file, I'm not sure I can scope -Wno-psabi to only apply to a single file via CMake as a compromise either.
Warnings/notes encountered on ARM (GH200), with CUDA 12.3 (RTC perf bad still) and GCC 11.4.1 on Rocky 9 for 2.0.0-rc.1
2 Warnings building examples and test suite (i.e. not python) with seatbelts and without MPI.
During compilation of pyflamegpu
This is a note rather than a warning, so don't need to change code, just ignore the note about a conformance change?
Probably best to do this with a tight scope though, as generally enabling -Wno-psabi feels like it could mask actual problems in the future. I.e. (with extra bits to only do this on GCC)