Compilation error on PowerPC64 due to unsupported real kind

sayerhs commented 4 years ago

OpenFAST fails to build on IBM PowerPC64 architecture using GNU GCC 7.4.0 (ORNL Summit) due to unsupported real kind used for QuKi. The compiler error message is

    REAL(DbKi), INTENT(IN)       :: DblNum
            1
Error: Kind -2 not supported for type REAL at (1)

    REAL(QuKi), INTENT(IN)     :: x             ! input
            1
Error: Kind -2 not supported for type REAL at (1)

I can workaround for now by using the following patch. However, this is not a robust fix. Given the coupling using C++ API, I don't believe -DDOUBLE_PRECISION:BOOL=OFF is a viable option.

diff --git a/modules/nwtc-library/src/SingPrec.f90 b/modules/nwtc-library/src/SingPrec.f90
index c8b4bca3..5864d8ff 100644
--- a/modules/nwtc-library/src/SingPrec.f90
+++ b/modules/nwtc-library/src/SingPrec.f90
@@ -35,7 +35,11 @@ INTEGER, PARAMETER              :: B2Ki     = SELECTED_INT_KIND(  4 )
 INTEGER, PARAMETER              :: B4Ki     = SELECTED_INT_KIND(  9 )           !< Kind for four-byte whole numbers
 INTEGER, PARAMETER              :: B8Ki     = SELECTED_INT_KIND( 18 )           !< Kind for eight-byte whole numbers

+#ifdef __GFC_REAL_16__
 INTEGER, PARAMETER              :: QuKi     = SELECTED_REAL_KIND( 20, 500 )     !< Kind for 16-byte, floating-point numbers
+#else
+INTEGER, PARAMETER              :: QuKi     = SELECTED_REAL_KIND( 20, 120 )     !< Kind for 16-byte, floating-point numbers
+#endif
 INTEGER, PARAMETER              :: R8Ki     = SELECTED_REAL_KIND( 14, 300 )     !< Kind for eight-byte floating-point numbers
 INTEGER, PARAMETER              :: SiKi     = SELECTED_REAL_KIND(  6,  30 )     !< Kind for four-byte, floating-point numbers

OpenFAST commit: 5aacf65c

5aacf65c Merge pull request #447 from andrew-platt/f/vc

System information:

sayerhs@login1:~$ lsb_release -d
Description:    Red Hat Enterprise Linux Server release 7.6 (Maipo)
sayerhs@login1:~$ uname -a
Linux login1 4.14.0-115.21.2.el7a.ppc64le #1 SMP Thu May 7 22:22:31 UTC 2020 ppc64le ppc64le ppc64le GNU/Linux
sayerhs@login1:~$ gfortran --version
GNU Fortran (GCC) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

CMake configure command

cmake 
  -DCMAKE_INSTALL_PREFIX:PATH=/ccs/proj/cfd142/exawind/exawind-2020-07/install/gcc/openfast 
  -DCMAKE_BUILD_TYPE=RELEASE 
  -DBUILD_SHARED_LIBS:BOOL=OFF 
  -DFPE_TRAP_ENABLED:BOOL=ON 
  -DUSE_DLL_INTERFACE:BOOL=ON 
  -DBUILD_OPENFAST_CPP_API:BOOL=ON 
  -DYAML_ROOT:PATH=/autofs/nccs-svm1_proj/cfd142/exawind/exawind-2020-07/spack/opt/spack/linux-rhel7-power9le/gcc-7.4.0/yaml-cpp-0.6.2-6gmppkg63e3mubizsarxetpsqavgoqex 
  -DHDF5_ROOT:PATH=/autofs/nccs-svm1_proj/cfd142/exawind/exawind-2020-07/spack/opt/spack/linux-rhel7-power9le/gcc-7.4.0/hdf5-1.10.4-5jqpjwnfoupcjas2fnjbh54t3wkiorq7 
  -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=ON 
  -DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=ON ..

andrew-platt commented 4 years ago

Do you have any configuration information on how gfortran was built on this system (does it include these flags: --disable-libquadmath or --disable-libquadmath-support)?

Or would it be possible to upgrade to gfortran 8.x? From my brief google search, it looks like this was an issue with 7.x on PPC64 and PPC64el. Some info in this thread: https://lists.debian.org/debian-powerpc/2018/04/msg00005.html

sayerhs commented 4 years ago

@andrew-platt As luck would have it, the system is down for maintenance today, so I'll have wait to check on the gfortran build settings when it comes back online in the next day or so.

Regarding using a newer GCC/gfortran version, that might be tricky given that we are building for hybrid architecture and we need to make sure that host compiler, the MPI libraries, and the NVIDIA CUDA versions play nice when building the nalu-wind side of things and to make sure that we use the same compiler+MPI stack for the OpenFAST C++ API to avoid library incompatibilities. Also if 8.x or higher have been built with --disable-libquadmath then we will probably face the same issue. But I'll look into this when the system comes back online.

If we are forced to stick to the well-tested GCC 7.4.0 stack which extensively tested/benchmarked in FY20-Q2 milestone, what would be the impact of the hack I put in to get things to compile? Are there code pathways on the land-based turbine modules that use QuKi and cause issues? Alternately, what would be the cleanest way to disable QuKi kind in OpenFAST?

andrew-platt commented 4 years ago

I was guessing there there might be some additional considerations on changing compilers.

I can't see any reason disabling QuKi would cause issues for your use case. Perhaps we should add a DISABLE_QUAD option to the cmake configuration? I'm a little concerned we might have unintended consequences on other platforms with just checking __GFC_REAL_16__.

sayerhs commented 4 years ago

Yeah, __GFC_REAL_16__ was specific to the gfortran compiler I was dealing with, and probably not the right way to address it. With DISABLE_QUAD you'll have to update all interfaces which provide the r16 variants and their implementations.

sayerhs commented 4 years ago

@andrew-platt I ran the following test program against all available GCC versions on ORNL Summit. It appears that none of the versions available support quadmath. Recompiling our own GCC is not going to work because we want to use the Spectrum-MPI provided by the system.

program test
INTEGER, PARAMETER :: quki = SELECTED_REAL_KIND(20, 500)

write(*,*) "SELECTED_REAL_KIND(20, 500)", quki
end program

$ for ver in 4.8.5 5.4.0 6.4.0 7.4.0 8.1.0 8.1.1 9.1.0 ; do  
    module unload gcc 2>/dev/null
    module load gcc/${ver} 2> /dev/null 
    echo "==> Using GCC version ${ver}"

    gfortran test.F90 && ./a.out
done

==> Using GCC version 4.8.5
GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-37)
Copyright (C) 2015 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING

 SELECTED_REAL_KIND(20, 500)          -2
==> Using GCC version 5.4.0
GNU Fortran (GCC) 5.4.0
Copyright (C) 2015 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING

 SELECTED_REAL_KIND(20, 500)          -2
==> Using GCC version 6.4.0
GNU Fortran (GCC) 6.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 SELECTED_REAL_KIND(20, 500)          -2
==> Using GCC version 7.4.0
GNU Fortran (GCC) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 SELECTED_REAL_KIND(20, 500)          -2
==> Using GCC version 8.1.0
GNU Fortran (GCC) 8.1.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 SELECTED_REAL_KIND(20, 500)          -2
==> Using GCC version 8.1.1
GNU Fortran (GCC) 8.1.1 20180519
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 SELECTED_REAL_KIND(20, 500)          -2
==> Using GCC version 9.1.0
GNU Fortran (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 SELECTED_REAL_KIND(20, 500)          -2

sayerhs commented 4 years ago

Also here is the output of gfortran -v

Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/autofs/nccs-svm1_sw/summit/gcc/7.4.0/bin/../libexec/gcc/powerpc64le-none-linux-gnu/7.4.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: powerpc64le-none-linux-gnu
Configured with: /sw/summit/gcc/7.4.0/src/gcc-7.4.0/configure --prefix=/sw/summit/gcc/7.4.0 --build=powerpc64le-none-linux-gnu --host=powerpc64le-none-linux-gnu --target=powerpc64le-none-linux-gnu --enable-offload-targets=nvptx-none --with-cuda-driver=/sw/summit/cuda/9.2.148 --disable-multilib CFLAGS=-m64
Thread model: posix
gcc version 7.4.0 (GCC)

andrew-platt commented 4 years ago

@sayerhs Thanks for the update! I'll look into the best way to add it as a CMake flag. For now, I suspect your method should work without introducing any numerical issues.

Did you have to disable any module procedure interfaces within the nwtc_library to get it to compile?

sayerhs commented 4 years ago

@andrew-platt With the hack I show above, I didn't have to disable any module procedure interfaces to get it to compile. However, this wasn't my first idea... my first idea was to disable QuKi and then I'd to #ifdef out a whole bunch of stuff in NWTC_Library and NWTC_IO that provide r16 interfaces. When it started feeling like whack-a-mole, I switched to the hack I posted above.

andrew-platt commented 4 years ago

I'm noticing the same thing. A fair number of the interface routines are specified as SiKi, R8Ki, and QuKi. For those I can simply #ifndef around the QuKi versions in the interface.

However there are several routines defined for ReKi and DbKi which becomes an interface issue when ReKi and DbKi get set the same. Those would need to be split into 3 routines as above.

I'm really curious what QuKi is when you set it above. Is it a 12 byte?

sayerhs commented 4 years ago

The following program outputs 16 on Summit

program test
INTEGER, PARAMETER :: quki = SELECTED_REAL_KIND(20, 120)
real(kind=quki) :: r16

write(*, *) sizeof(r16)

end program

sayerhs commented 4 years ago

And this outputs 16 and 5, so there is no weird padding issues going on

program test
INTEGER, PARAMETER :: quki = SELECTED_REAL_KIND(20, 120)
real(kind=quki) :: r16, a16(5)

write(*, *) sizeof(r16), sizeof(a16) / sizeof(r16)

end program

sayerhs commented 4 years ago

One more test and output

program test
INTEGER, PARAMETER :: quki = SELECTED_REAL_KIND(20, 120)
INTEGER, PARAMETER :: r8ki = selected_real_kind(14, 300)
real(kind=r8ki) :: r8, a8(5)
real(kind=quki) :: r16, a16(5)

write(*, *) sizeof(r8), sizeof(a8) / sizeof(r8)
write(*, *) sizeof(r16), sizeof(a16) / sizeof(r16)

end program
                    8                    5
                   16                    5

I am using gfortran 7.4.0 on Summit by the way.

bjonkman commented 4 years ago

Couple of ideas:

You could replace the SELECTED_REAL_KIND statements with the numbers 4, 8, and 16. I don't know of any modern Fortran compiler that doesn't use the number of bytes in the kind specification. There was one old compiler that used REAL(1) and REAL(2) for 4- and 8- byte numbers, but that was so long ago that it's not relevant anymore. I don't know if the current Fortran standard actually specifies how these are supposed to be set now (I left my copy of the 2003 standard at NREL), but it seems like compilers have settled on the real(4), real(8), and real(16) definitions. That doesn't help for those old compilers that don't support quad precision, though.
I'd be in favor of getting rid of QuKi completely, unless someone actually has a need for that kind of precision. You could set DbKi = R8Ki so that it is 8 bytes regardless of single/double precision flags, and make all the interfaces versions of SiKi and R8Ki only (no ReKi and DbKi versions).

Thoughts?

andrew-platt commented 4 years ago

I really like idea #2. I've started a branch where I was optionally removing the QuKi, but will pivot to completely eliminating them. I might have a PR for review in a few days for this.

On #1, are there potentially some chip architectures that would support a 12 byte (or other 'weird' size) real?

sayerhs commented 4 years ago

What about using real64 and real128 from ISO_FORTRAN_ENV, is that possible within OpenFAST?

$ cat test.F90 && gfortran test.F90 && ./a.out
program test
  use, intrinsic :: iso_fortran_env , only: real32, real64, real128
  real(real32) :: r4
  real(real64) :: r8
  real(real128) :: r16

  write(*,*) real32, real64, real128
  write(*, *) precision(r4), range(r4)
  write(*, *) precision(r8), range(r8)
  write(*, *) precision(r16), range(r16)

end program

           4           8          16
           6          37
          15         307
          31         291

Also @andrew-platt, I am doubtful that any architecture uses a 12-byte real. That would require padding with cache lines which are 64 or 128 wide.

bjonkman commented 4 years ago

Those named constants look like they are part of the Fortran 2008 standard. The OpenFAST documentation says to use Fortran 2003... are there any objections to changing that guidance? People used to complain to me when things didn't work with their very old compiler that they couldn't update. Otherwise, I think that's a nice option.

sayerhs commented 4 years ago

Thanks @bjonkman . I had overlooked about the Fortran2003 requirement for OpenFAST. I tested it on the oldest compiler version available on that machine (GCC 4.8.5) and it does recognize those constants.

$ cat test.F90 && gfortran test.F90 && ./a.out && gfortran --version

program test
  use, intrinsic :: iso_fortran_env , only: real32, real64, real128
  real(real32) :: r4
  real(real64) :: r8
  real(real128) :: r16

  write(*,*) real32, real64, real128
  write(*, *) precision(r4), range(r4)
  write(*, *) precision(r8), range(r8)
  write(*, *) precision(r16), range(r16)

end program

           4           8          16
           6          37
          15         307
          31         291
GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-37)
Copyright (C) 2015 Free Software Foundation, Inc.

sayerhs commented 4 years ago

@bjonkman Also it seems we are already using some Fortran2008 features (e.g., compiler_version in #507), so we are likely safe?

rafmudaf commented 4 years ago

Yes, I inadvertently violated that. No objections to use 2008 standard from my end.

bjonkman commented 4 years ago

As long as you're okay with the workarounds for older compilers (like the issue with compiler_version fixed in https://github.com/OpenFAST/openfast/pull/454, and probably broken again in https://github.com/OpenFAST/openfast/pull/507), I don't have a problem with it.

sayerhs commented 4 years ago

@bjonkman @rafmudaf @andrew-platt Please see #512 for a set of proposed changes that I've implemented for my summit build. This should fix the regression from #507 also.

sayerhs commented 4 years ago

Fixed in #512

OpenFAST / openfast

Compilation error on PowerPC64 due to unsupported real kind #502