weidai11 / cryptopp

free C++ class library of cryptographic schemes
https://cryptopp.com

Add Clang 7.0 PowerPC support #742

Closed noloader closed 5 years ago

noloader commented 5 years ago

We were able to build LLVM Clang for PowerPC on GCC112 at the GCC Compile Farm. LLVM Clang behaves differently than the Clang front-end for XLC, and requires non-trivial fixes.

This report will track the changes for LLVM Clang.

noloader commented 5 years ago

Support for Clang is mostly in. I think we got most of the macros worked out to detect all of the compilers. "All of the compilers" is complicated because Clang masquerades as GCC and XLC (in addition to LLVM) but does not consume the intrinsics used by the compilers it pretends to be. The compiler profiles are:

  1. IBM XLC 13.0 and earlier (pure IBM)
  2. IBM XLC 13.1 and above (Clang front-end), without -qxlcompatmacros
  3. IBM XLC 13.1 and above (Clang front-end) with -qxlcompatmacros
  4. LLVM Clang
  5. GNU GCC

Case (2) above is the problem because it is the dominant use case moving forward. Users don't know they need to do something special, like adding -qxlcompatmacros, for the common case. The engineering failure is that XLC does not identify itself as XLC through the traditional __xlC__ macro. Instead, it identifies itself as Clang and GCC but then fails to consume the XLC or GCC intrinsics.

Case (3) above is comical. The compiler identifies itself as Clang, GCC and XLC all at once. And again, it fails to consume the XLC or GCC intrinsics. Derp, epic failure.
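
One way to keep the five profiles apart is to test the IBM macros before __clang__, and __clang__ before __GNUC__, since the Clang-based compilers also define __GNUC__. The following is a minimal sketch based only on the macro dumps in this report. The names Profile, Classify, CurrentProfile and ProfileName are hypothetical and are not part of Crypto++.

```cpp
// Hypothetical helpers, not Crypto++ code. The macro combinations come
// from the compiler dumps in this report.
enum Profile {
    PROFILE_XLC_CLASSIC,       // IBM XLC 13.0 and earlier (pure IBM)
    PROFILE_XLC_CLANG,         // IBM XLC 13.1+, without -qxlcompatmacros
    PROFILE_XLC_CLANG_COMPAT,  // IBM XLC 13.1+, with -qxlcompatmacros
    PROFILE_LLVM_CLANG,
    PROFILE_GNU_GCC,
    PROFILE_UNKNOWN
};

// Order matters: IBM first, then Clang, then GCC, because the
// Clang-based compilers define __GNUC__ as well.
inline Profile Classify(bool hasClang, bool hasIbmxl, bool hasXlC, bool hasGnuc)
{
    if (hasClang && hasIbmxl)
        return hasXlC ? PROFILE_XLC_CLANG_COMPAT : PROFILE_XLC_CLANG;
    if (hasClang)
        return PROFILE_LLVM_CLANG;
    if (hasXlC)
        return PROFILE_XLC_CLASSIC;
    if (hasGnuc)
        return PROFILE_GNU_GCC;
    return PROFILE_UNKNOWN;
}

// Feed the macros of the compiler building this translation unit.
inline Profile CurrentProfile()
{
    bool hasClang = false, hasIbmxl = false, hasXlC = false, hasGnuc = false;
#if defined(__clang__)
    hasClang = true;
#endif
#if defined(__ibmxl__)
    hasIbmxl = true;
#endif
#if defined(__xlC__)
    hasXlC = true;
#endif
#if defined(__GNUC__)
    hasGnuc = true;
#endif
    return Classify(hasClang, hasIbmxl, hasXlC, hasGnuc);
}

inline const char* ProfileName(Profile p)
{
    switch (p) {
    case PROFILE_XLC_CLASSIC:      return "IBM XLC 13.0 and earlier";
    case PROFILE_XLC_CLANG:        return "IBM XLC 13.1+ (Clang front-end)";
    case PROFILE_XLC_CLANG_COMPAT: return "IBM XLC 13.1+ with -qxlcompatmacros";
    case PROFILE_LLVM_CLANG:       return "LLVM Clang";
    case PROFILE_GNU_GCC:          return "GNU GCC";
    default:                       return "unknown";
    }
}
```

Calling ProfileName(CurrentProfile()) from a test program reports which branch the current compiler lands in. Keeping Classify separate from the preprocessor tests lets each macro combination from the dumps be checked on any machine.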


Here's the mess IBM has made of the code:

#if defined(__xlc__) && (__xlc__ < 0x0d01)
# define __early_xlc__ 1
#endif
#if defined(__xlC__) && (__xlC__ < 0x0d01)
# define __early_xlC__ 1
#endif

inline uint32x4_p VecLoad(const byte src[16])
{
#if defined(_ARCH_PWR7)
#  if defined(__early_xlc__) || defined(__early_xlC__)
    // IBM XLC 13.0 and earlier
    return (uint32x4_p)vec_xlw4(0, (byte*)src);
#  elif defined(__xlc__) || defined(__xlC__) || defined(__clang__)
    // IBM XLC 13.1 and above, and LLVM Clang
    return (uint32x4_p)vec_xl(0, (byte*)src);
#  else
    // GNU GCC
    return (uint32x4_p)vec_vsx_ld(0, (byte*)src);
#  endif
#else
    // Pre-POWER7: fall back to the Altivec load
    return VecLoad_ALTIVEC(src);
#endif
}

And:

template <class T1, class T2>
inline T1 VecEncrypt(const T1 state, const T2 key)
{
#if defined(__ibmxl__) || (defined(_AIX) && defined(__xlC__))
    // IBM XLC
    return (T1)__vcipher((uint8x16_p)state, (uint8x16_p)key);
#elif defined(__clang__)
    // LLVM Clang
    return (T1)__builtin_altivec_crypto_vcipher((uint64x2_p)state, (uint64x2_p)key);
#else
    // GNU GCC
    return (T1)__builtin_crypto_vcipher((uint64x2_p)state, (uint64x2_p)key);
#endif
}

For completeness here are the macros from the compilers.

LLVM Clang:

$ $HOME/llvm/bin/clang++ -dM -E TestPrograms/test_cxx.cxx | grep -i -E 'ibm|llvm|clang|xlc|gnuc|crypto'
#define __clang__ 1
#define __clang_major__ 7
#define __clang_minor__ 0
#define __clang_patchlevel__ 0
#define __CLANG_STDINT_H
#define __clang_version__ "7.0.0 (tags/RELEASE_700/final)"
#define __CRYPTO__ 1
#define __GNUC__ 4
#define __GNUC_GNU_INLINE__ 1
#define __GNUC_MINOR__ 2
#define __GNUC_PATCHLEVEL__ 1
#define __GNUC_PREREQ(maj,min) ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#define __GNUC_VA_LIST 1
#define _G_va_list __gnuc_va_list
#define _IO_va_list __gnuc_va_list
#define __llvm__ 1
#define __VERSION__ "4.2.1 Compatible Clang 7.0.0 (tags/RELEASE_700/final)"

IBM XLC 13.1 and above (Clang front-end), without -qxlcompatmacros:

$ xlC -dM -E TestPrograms/test_cxx.cxx | grep -i -E 'ibm|llvm|clang|xlc|gnuc|crypto'
#define __clang__ 1
#define __clang_major__ 4
#define __clang_minor__ 0
#define __clang_patchlevel__ 1
#define __clang_version__ "4.0.1 (tags/RELEASE_401/final)"
#define __CRYPTO__ 1
#define __GNUC__ 4
#define __GNUC_GNU_INLINE__ 1
#define __GNUC_MINOR__ 2
#define __GNUC_PATCHLEVEL__ 1
#define __GNUC_PREREQ(maj,min) ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#define __GNUC_VA_LIST 1
#define __IBMCPP_COMPLEX_INIT 1
#define __ibmxl__ 1
#define __ibmxl_modification__ 6
#define __ibmxl_ptf_fix_level__ 1
#define __ibmxl_release__ 1
#define __ibmxl_version__ 13
#define __ibmxl_vrm__ 0x0D010600u
#define __llvm__ 1
#define __VERSION__ "IBM XL C/C++ for Linux, Version 13.1.6.1"
#define __XLC_BUILTIN_VAARG__ 1

IBM XLC 13.1 and above (Clang front-end) with -qxlcompatmacros:

$ xlC -qxlcompatmacros -dM -E TestPrograms/test_cxx.cxx | grep -i -E 'ibm|llvm|clang|xlc|gnuc|crypto'
#define __clang__ 1
#define __clang_major__ 4
#define __clang_minor__ 0
#define __clang_patchlevel__ 1
#define __clang_version__ "4.0.1 (tags/RELEASE_401/final)"
#define __CRYPTO__ 1
#define __GNUC__ 4
#define __GNUC_GNU_INLINE__ 1
#define __GNUC_MINOR__ 2
#define __GNUC_PATCHLEVEL__ 1
#define __GNUC_PREREQ(maj,min) ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#define __GNUC_VA_LIST 1
#define __IBMCPP__ 1316
#define __IBMCPP_COMPLEX_INIT 1
#define __ibmxl__ 1
#define __ibmxl_modification__ 6
#define __ibmxl_ptf_fix_level__ 1
#define __ibmxl_release__ 1
#define __ibmxl_version__ 13
#define __ibmxl_vrm__ 0x0D010600u
#define __llvm__ 1
#define __VERSION__ "IBM XL C/C++ for Linux, Version 13.1.6.1"
#define __xlC__ 0x0d01
#define __XLC_BUILTIN_VAARG__ 1
#define __xlC_ver__ 0x00000601
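
The version macros in the two XLC 13.1 dumps appear to pack the compiler level into hex fields. __ibmxl_vrm__ 0x0D010600u lines up with __ibmxl_version__ 13, __ibmxl_release__ 1 and __ibmxl_modification__ 6, and __xlC__ 0x0d01 lines up with version 13, release 1. That layout is what lets the (__xlC__ < 0x0d01) test pick out the pure IBM compiler, which reports 0x0c01. A small sketch of the decoding, inferred from these dumps only; XlLevel, DecodeIbmxlVrm and IsEarlyXlc are hypothetical names:

```cpp
#include <stdint.h>

// Hypothetical helpers, inferred from the macro dumps in this report.
struct XlLevel { uint32_t version, release, modification; };

// Split a __ibmxl_vrm__ value such as 0x0D010600u into its fields.
// The low byte is left uninterpreted here.
inline XlLevel DecodeIbmxlVrm(uint32_t vrm)
{
    XlLevel level;
    level.version      = (vrm >> 24) & 0xFF;
    level.release      = (vrm >> 16) & 0xFF;
    level.modification = (vrm >> 8) & 0xFF;
    return level;
}

// Mirrors the (__xlC__ < 0x0d01) check: true for XLC 13.0 and earlier.
inline bool IsEarlyXlc(uint32_t xlC)
{
    return xlC < 0x0d01;
}
```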

IBM XLC 13.0 and earlier (pure IBM):

$ xlC -qshowmacros -E TestPrograms/test_cxx.cxx | grep -i -E 'ibm|llvm|clang|xlc|gnuc|crypto'
#define _IBMR2 1
#define __IBMCPP_COMPLEX_INIT 1
#define __IBMCPP_EXTERN_TEMPLATE 1
#define __IBMCPP__ 1210
#define __IBMC_COMPLEX_INIT 1
#define __IBM_ALIGNOF__ 1
#define __IBM_ATTRIBUTES 1
#define __IBM_CHAR16_T__ 1
#define __IBM_CHAR32_T__ 1
#define __IBM_COMPUTED_GOTO 1
#define __IBM_EXTENSION_KEYWORD 1
#define __IBM_GCC_ASM 1
#define __IBM_INCLUDE_NEXT 1
#define __IBM_LABEL_VALUE 1
#define __IBM_LOCAL_LABEL 1
#define __IBM_MACRO_WITH_VA_ARGS 1
#define __IBM_UTF_LITERAL 1
#define __IBM__ALIGN 1
#define __IBM__ALIGNOF__ 1
#define __IBM__RESTRICT__ 1
#define __IBM__TYPEOF__ 1
#define __XLC121__ 1
#define __XLC13__ 1
#define __xlC__ 0x0c01
#define __xlC_ver__ 0x00000000

GNU GCC:

$ g++ -dM -E TestPrograms/test_cxx.cxx | grep -i -E 'ibm|llvm|clang|xlc|gnuc|crypto'
#define __CRYPTO__ 1
#define __GNUC__ 4
#define __GNUC_GNU_INLINE__ 1
#define __GNUC_MINOR__ 8
#define __GNUC_PATCHLEVEL__ 5
#define __GNUC_PREREQ(maj,min) ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#define __GNUC_RH_RELEASE__ 16
#define __GNUC_VA_LIST

noloader commented 5 years ago

The POWER8 gear is partially working ...

SHA256 fails its self test. HMACs and SCrypt fail due to the SHA256 failure. I spent about a day trying to find a workaround, but none has materialized.

AES/GCM also fails its self tests. AES/GCM is cloudier because I can't tell if AES is failing, GCM is failing, or the combo is failing. (The problem appears to be GCM, because cryptest.exe tv aes tests the non-authenticated encryption modes and is error free.)

BLAKE2 is also failing but it is straight C++. I rewrote the BLAKE2b and BLAKE2s classes at Commit a65d55a3fd0b but it did not fix the issue. (BLAKE2b and BLAKE2s needed the rewrite, so this issue was merely a trigger).

The tests fail at -O3, -O2 and -O0.

noloader commented 5 years ago

The Clang cut-in is complete.

The failed self-tests were due to LLVM Issue 39704, "Clang 7.0 claims unaligned load with POWER7 vec_xl".

Closing this issue. Special thanks to Nemanja Ivanovic from the LLVM team. He had the 39704 patch ready within a day or so. And even better, it overlays on the 7.0 release tarballs without problems.