FLAMEGPU / FLAMEGPU2

FLAME GPU 2 is a GPU accelerated agent based modelling framework for CUDA C++ and Python
https://flamegpu.com
MIT License
106 stars 22 forks source link

Warnigns to fix (aarch64) #1176

Closed ptheywood closed 9 months ago

ptheywood commented 10 months ago

Warnigns/notes encountered on ARM (GH200), with CUDA 12.3 (RTC perf bad still) and GCC 11.4.1 on Rocky 9 for 2.0.0-rc.1

2 Warnings building examples and test suite (i.e. not python) with seatbelts and without MPI.

/users/pheywood/flamegpu/FLAMEGPU2/include/flamegpu/runtime/agent/HostAgentAPI.cuh: In instantiation of ‘std::pair<double, double> flamegpu::HostAgentAPI::meanStandardDeviation(const string&) const [with InT = float; std::string = std::__cxx11::basic_string<char>]’:
/users/pheywood/flamegpu/FLAMEGPU2/examples/cpp/diffusion/src/main.cu:41:75:   required from here
/users/pheywood/flamegpu/FLAMEGPU2/include/flamegpu/runtime/agent/HostAgentAPI.cuh:561:1: note: parameter passing for argument of type ‘std::pair<double, double>’ when C++17 is enabled changed to match C++14 in GCC 10.1
  561 | std::pair<double, double> HostAgentAPI::meanStandardDeviation(const std::string& variable) const {
      | ^~~~~~~~~~~~
/users/pheywood/flamegpu/FLAMEGPU2/include/flamegpu/runtime/agent/HostAgentAPI.cuh: In instantiation of ‘std::pair<double, double> flamegpu::HostAgentAPI::meanStandardDeviation(const string&) const [with InT = float; std::string = std::__cxx11::basic_string<char>]’:
/users/pheywood/flamegpu/FLAMEGPU2/tests/test_cases/runtime/agent/host_reduction/test_mean_standarddeviation.cu:6:74:   required from here
/users/pheywood/flamegpu/FLAMEGPU2/include/flamegpu/runtime/agent/HostAgentAPI.cuh:561:1: note: parameter passing for argument of type ‘std::pair<double, double>’ when C++17 is enabled changed to match C++14 in GCC 10.1
  561 | std::pair<double, double> HostAgentAPI::meanStandardDeviation(const std::string& variable) const {
      | ^~~~~~~~~~~~

During compilation of pyflamegpu

/users/pheywood/flamegpu/FLAMEGPU2/include/flamegpu/runtime/agent/HostAgentAPI.cuh: In instantiation of ‘std::pair<double, double> flamegpu::HostAgentAPI::meanStandardDeviation(const string&) const [with InT = float; std::string = std::__cxx11::basic_string<char>]’:
/users/pheywood/flamegpu/FLAMEGPU2/build-gh-123/swig/python/pyflamegpu/flamegpuPYTHON_wrap.cxx:184247:82:   required from here
/users/pheywood/flamegpu/FLAMEGPU2/include/flamegpu/runtime/agent/HostAgentAPI.cuh:561:1: note: parameter passing for argument of type ‘std::pair<double, double>’ when C++17 is enabled changed to match C++14 in GCC 10.1
  561 | std::pair<double, double> HostAgentAPI::meanStandardDeviation(const std::string& variable) const {

This is a note rather than a warning, so don't need to change code, just ignore the note about a conformance change?

Probably best to do this with a tight scope though, as generally enabling -Wno-psabi feels like it could mask actual problems in the future.

I.e. (with extra bits to only do this on gcc)

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpsabi"
...
#pragma GCC diagnostic pop
Robadob commented 10 months ago

Is it just complaining that it's templated, but the template doesn't affect the prototype (type of args/rval)?

ptheywood commented 10 months ago

No, its a GCC abi compatibiltiy thing for C++17 on some platforms, of which aarch64 is the first one we've compiled for (i.e. not x86_64 or ppc64le).

An ABI incompatibility between C++14 and C++17 has been fixed. On some targets a class with a zero-sized subobject would be passed incorrectly when compiled as C++17 or C++20. See the C++ notes below for more details.

The ABI of passing and returning certain C++ classes by value changed on several targets in GCC 10, including AArch64, ARM, PowerPC ELFv2, S/390 and Itanium. These changes affect classes with a zero-sized subobject (an empty base class, or data member with the [[no_unique_address]] attribute) where all other non-static data members have the same type (this is called a "homogeneous aggregate" in some ABI specifications, or if there is only one such member, a "single element"). In -std=c++17 and -std=c++20 modes, classes with an empty base class were not considered to have a single element or to be a homogeneous aggregate, and so could be passed differently (in the wrong registers or at the wrong stack address). This could make code compiled with -std=c++17 and -std=c++14 ABI incompatible. This has been corrected and the empty bases are ignored in those ABI decisions, so functions compiled with -std=c++14 and -std=c++17 are now ABI compatible again. Example: struct empty {}; struct S : empty { float f; }; void f(S);. Similarly, in classes containing non-static data members with empty class types using the C++20 [[no_unique_address]] attribute, those members weren't ignored in the ABI argument passing decisions as they should be. Both of these ABI changes are now diagnosed with -Wpsabi.

source

ptheywood commented 9 months ago
#pragma GCC diagnostic ignored "-Wpsabi"

Fails to suppress this diagnostic note with GCC 11.4.1-2 (and 12.2.0) on aarch64, when placed around the function declaration, the templatd specialisation or the calling site.

e.g.

// Suppress GCC >= 10.1 diagnostic due to ABI change in C++17 mode on aarch64 for parameter passing of std::pair<double, double>
#if defined(__GNUC__) && (( __GNUC__ == 10 && defined(__GNUC_MINOR__) && __GNUC_MINOR__ >= 1) || __GNUC__ > 10)
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wpsabi"
#endif
template<typename InT>
std::pair<double, double> HostAgentAPI::meanStandardDeviation(const std::string& variable) const {
    std::pair<double, double> rtn;
    meanStandardDeviation_async<InT>(variable, rtn, this->api.stream, this->api.streamId);
    gpuErrchk(cudaStreamSynchronize(this->api.stream));  // Redundant, meanStandardDeviation_async() is not truly async
    return rtn;
}
#if defined(__GNUC__) && (( __GNUC__ == 10 && defined(__GNUC_MINOR__) && __GNUC_MINOR__ >= 1) || __GNUC__ > 10)
    #pragma GCC diagnostic pop
#endif

I'd rather not globally enable -Wno-psabi on aarch with gcc > 10.1 via cmake, as that might suppress actually important warnings in the future, but that seems like it might be the only option (or to remember to do this manually when building on aarch).

As this is is either needed in a header, or in the per-model source file not sure I can scope -Wno-psabi to only apply to a single file via CMake as a compromise either.