wolfSSL / wolfssl

The wolfSSL library is a small, fast, portable implementation of TLS/SSL for embedded devices to the cloud. wolfSSL supports up to TLS 1.3 and DTLS 1.3!
https://www.wolfssl.com
GNU General Public License v2.0

[Bug]: ARMv7a alignment exception in wc_Chacha_Process when compiled with ARMASM #7181

Closed suihkulokki closed 7 months ago

suihkulokki commented 7 months ago

Version

5.6.6

Description

I am evaluating wolfSSL for commercial use. Tests have been successful, but performance is low. Attempts to enable the ARM assembler for the target platform (ARMv7 Cortex-A9, i.MX6) have been less successful: the ARM-optimized version crashes with an alignment exception from the kernel when running the unit tests on the target system:

./tests/unit.test
starting unit tests...
 Begin API Tests
     1: test_fileAccess                                     : skipped
     2: test_ForceZero                                      : passed (  0.00063)
     3: test_wolfCrypt_Init                                 : passed (  0.00065)
     4: test_wc_SetMutexCb                                  : passed (  0.00000)
     5: test_wc_LockMutex_ex                                : passed (  0.00001)
     6: test_wc_InitMd5                                     : passed (  0.00023)
     7: test_wc_Md5Update                                   : passed (  0.00024)
     8: test_wc_Md5Final                                    : passed (  0.00001)
     9: test_wc_InitSha                                     : passed (  0.00009)
    10: test_wc_ShaUpdate                                   : passed (  0.00002)
    11: test_wc_ShaFinal                                    : passed (  0.00002)
    12: test_wc_InitSha256                                  : passed (  0.00001)
    13: test_wc_Sha256Update                                : passed (  0.00018)
    14: test_wc_Sha256Final                                 : passed (  0.00002)
    15: test_wc_Sha256FinalRaw                              : skipped
    16: test_wc_Sha256GetFlags                              : passed (  0.00001)
    17: test_wc_Sha256Free                                  : passed (  0.00000)
    18: test_wc_Sha256GetHash                               : passed (  0.00002)
    19: test_wc_Sha256Copy                                  : passed (  0.00016)
    20: test_wc_InitSha224                                  : passed (  0.00001)
    21: test_wc_Sha224Update                                : passed (  0.00010)
    22: test_wc_Sha224Final                                 : passed (  0.00002)
    23: test_wc_Sha224SetFlags                              : passed (  0.00001)
    24: test_wc_Sha224GetFlags                              : passed (  0.00001)
    25: test_wc_Sha224Free                                  : passed (  0.00000)
    26: test_wc_Sha224GetHash                               : passed (  0.00001)
    27: test_wc_Sha224Copy                                  : passed (  0.00001)
    28: test_wc_InitSha512                                  : passed (  0.00001)
    29: test_wc_Sha512Update                                : passed (  0.00027)
    30: test_wc_Sha512Final                                 : passed (  0.00009)
    31: test_wc_Sha512GetFlags                              : passed (  0.00001)
    32: test_wc_Sha512FinalRaw                              : skipped
    33: test_wc_Sha512Free                                  : passed (  0.00000)
    34: test_wc_Sha512GetHash                               : passed (  0.00002)
    35: test_wc_Sha512Copy                                  : passed (  0.00001)
    36: test_wc_InitSha512_224                              : passed (  0.00000)
    37: test_wc_Sha512_224Update                            : passed (  0.00002)
    38: test_wc_Sha512_224Final                             : passed (  0.00003)
    39: test_wc_Sha512_224GetFlags                          : passed (  0.00001)
    40: test_wc_Sha512_224FinalRaw                          : skipped
    41: test_wc_Sha512_224Free                              : passed (  0.00000)
    42: test_wc_Sha512_224GetHash                           : passed (  0.00001)
    43: test_wc_Sha512_224Copy                              : passed (  0.00001)
    44: test_wc_InitSha512_256                              : passed (  0.00000)
    45: test_wc_Sha512_256Update                            : passed (  0.00002)
    46: test_wc_Sha512_256Final                             : passed (  0.00011)
    47: test_wc_Sha512_256GetFlags                          : passed (  0.00001)
    48: test_wc_Sha512_256FinalRaw                          : skipped
    49: test_wc_Sha512_256Free                              : passed (  0.00000)
    50: test_wc_Sha512_256GetHash                           : passed (  0.00001)
    51: test_wc_Sha512_256Copy                              : passed (  0.00001)
    52: test_wc_InitSha384                                  : passed (  0.00000)
    53: test_wc_Sha384Update                                : passed (  0.00002)
    54: test_wc_Sha384Final                                 : passed (  0.00002)
    55: test_wc_Sha384GetFlags                              : passed (  0.00000)
    56: test_wc_Sha384FinalRaw                              : skipped
    57: test_wc_Sha384Free                                  : passed (  0.00000)
    58: test_wc_Sha384GetHash                               : passed (  0.00002)
    59: test_wc_Sha384Copy                                  : passed (  0.00001)
    60: test_wc_InitBlake2b                                 : passed (  0.00046)
    61: test_wc_InitBlake2b_WithKey                         : passed (  0.00008)
    62: test_wc_InitBlake2s_WithKey                         : passed (  0.00001)
    63: test_wc_InitRipeMd                                  : passed (  0.00000)
    64: test_wc_RipeMdUpdate                                : passed (  0.00002)
    65: test_wc_RipeMdFinal                                 : passed (  0.00001)
    66: test_wc_InitSha3                                    : passed (  0.00001)
    67: testing_wc_Sha3_Update                              : passed (  0.00002)
    68: test_wc_Sha3_224_Final                              : passed (  0.00024)
    69: test_wc_Sha3_256_Final                              : passed (  0.00008)
    70: test_wc_Sha3_384_Final                              : passed (  0.00014)
    71: test_wc_Sha3_512_Final                              : passed (  0.00012)
    72: test_wc_Sha3_224_Copy                               : passed (  0.00015)
    73: test_wc_Sha3_256_Copy                               : passed (  0.00022)
    74: test_wc_Sha3_384_Copy                               : passed (  0.00011)
    75: test_wc_Sha3_512_Copy                               : passed (  0.00004)
    76: test_wc_Sha3_GetFlags                               : passed (  0.00001)
    77: test_wc_InitShake256                                : passed (  0.00000)
    78: testing_wc_Shake256_Update                          : passed (  0.00001)
    79: test_wc_Shake256_Final                              : passed (  0.00003)
    80: test_wc_Shake256_Copy                               : passed (  0.00008)
    81: test_wc_Shake256Hash                                : passed (  0.00019)
    82: test_wc_InitSm3Free                                 : skipped
    83: test_wc_Sm3UpdateFinal                              : skipped
    84: test_wc_Sm3GetHash                                  : skipped
    85: test_wc_Sm3Copy                                     : skipped
    86: test_wc_Sm3FinalRaw                                 : skipped
    87: test_wc_Sm3GetSetFlags                              : skipped
    88: test_wc_Sm3Hash                                     : skipped
    89: test_wc_HashInit                                    : passed (  0.00013)
    90: test_wc_HashSetFlags                                : passed (  0.00002)
    91: test_wc_HashGetFlags                                : passed (  0.00001)
    92: test_wc_Md5HmacSetKey                               : passed (  0.00004)
    93: test_wc_Md5HmacUpdate                               : passed (  0.00002)
    94: test_wc_Md5HmacFinal                                : passed (  0.00002)
    95: test_wc_ShaHmacSetKey                               : passed (  0.00005)
    96: test_wc_ShaHmacUpdate                               : passed (  0.00010)
    97: test_wc_ShaHmacFinal                                : passed (  0.00005)
    98: test_wc_Sha224HmacSetKey                            : passed (  0.00004)
    99: test_wc_Sha224HmacUpdate                            : passed (  0.00002)
   100: test_wc_Sha224HmacFinal                             : passed (  0.00003)
   101: test_wc_Sha256HmacSetKey                            : passed (  0.00004)
   102: test_wc_Sha256HmacUpdate                            : passed (  0.00004)
   103: test_wc_Sha256HmacFinal                             : passed (  0.00003)
   104: test_wc_Sha384HmacSetKey                            : passed (  0.00004)
   105: test_wc_Sha384HmacUpdate                            : passed (  0.00003)
   106: test_wc_Sha384HmacFinal                             : passed (  0.00006)
   107: test_wc_InitCmac                                    : passed (  0.00003)
   108: test_wc_CmacUpdate                                  : passed (  0.00001)
   109: test_wc_CmacFinal                                   : passed (  0.00003)
   110: test_wc_AesCmacGenerate                             : passed (  0.00021)
   111: test_wc_AesGcmStream                                : skipped
   112: test_wc_Des3_SetIV                                  : passed (  0.00026)
   113: test_wc_Des3_SetKey                                 : passed (  0.00052)
   114: test_wc_Des3_CbcEncryptDecrypt                      : passed (  0.00035)
   115: test_wc_Des3_CbcEncryptDecryptWithKey               : passed (  0.00138)
   116: test_wc_Des3_EcbEncrypt                             : passed (  0.00019)
   117: test_wc_Chacha_SetKey                               : passed (  0.00001)
   118: test_wc_Chacha_Process                              :Bus error (core dumped)
# 
[ 5749.689750] Alignment trap: not handling instruction ec928b04 at [<b6d6ce2c>]
[ 5749.697267] 8<--- cut here ---
[ 5749.700433] Unhandled fault: alignment exception (0x001) at 0x00799a3b
[ 5749.707189] [00799a3b] *pgd=1901f831, *pte=22ff859f, *ppte=22ff8e7e
#

Reproduction steps

  1. Buildroot environment with the gcc 10.3.0 toolchain
  2. Modify configure.ac to select armv7a rather than arm8-32bit by default
  3. ./configure --target=arm-buildroot-linux-gnueabihf --host=arm-buildroot-linux-gnueabihf --build=x86_64-pc-linux-gnu --prefix=/usr --exec-prefix=/usr --sysconfdir=/etc --localstatedir=/var --program-prefix="" --disable-dependency-tracking --enable-ipv6 --disable-static --enable-shared --enable-distro --enable-ecc --enable-distro --enable-opensslextra --enable-examples --disable-sslv3 --enable-asm --enable-armasm
  4. Run unit.test on target platform (i.mx6)

Relevant log output

Program received signal SIGBUS, Bus error.

0xb6d71e30 in wc_Chacha_encrypt_64 (over=0xbefc3938 "E@\360Z\237\037\262\226\327sn{ \216<\226\353O\341\203F\210\322`OE\tR\355C-A\273\342\240\266\352uf\322\245\321\347\342\rB\257,S\327\222\261\304?\352\201~\232\322u\256Ticexpand 32-byte k", bytes=26, c=0xbefc3a1c "", m=0x6d9a3b "Everybody gets Friday off.", input=0x10001) at wolfcrypt/src/port/arm/armv8-chacha.c:2227
2227        __asm__ __volatile__ (
(gdb) 
(gdb) bt
#0  0xb6d71e30 in wc_Chacha_encrypt_64 (
    over=0xbefc3938 "E@\360Z\237\037\262\226\327sn{ \216<\226\353O\341\203F\210\322`OE\tR\355C-A\273\342\240\266\352uf\322\245\321\347\342\rB\257,S\327\222\261\304?\352\201~\232\322u\256Ticexpand 32-byte k", bytes=26, c=0xbefc3a1c "", m=0x6d9a3b "Everybody gets Friday off.", input=0x10001) at wolfcrypt/src/port/arm/armv8-chacha.c:2227
#1  wc_Chacha_encrypt_bytes (bytes=26, c=0xbefc3a1c "", m=0x6d9a3b "Everybody gets Friday off.", ctx=<optimized out>) at wolfcrypt/src/port/arm/armv8-chacha.c:2855
#2  wc_Chacha_Process (ctx=0x10001, ctx@entry=0xbefc38ec, output=<optimized out>, output@entry=0xbefc3a14 "", input=<optimized out>, 
    input@entry=0x6d9a3b "Everybody gets Friday off.", msglen=<optimized out>, msglen@entry=26) at wolfcrypt/src/port/arm/armv8-chacha.c:2893
#3  0x00631da4 in test_wc_Chacha_Process () at tests/api.c:17311
#4  0x00673ffc in ApiTest () at tests/api.c:70370
#5  0x00429248 in unit_test (argc=1, argv=0xbefc3d24) at tests/unit.c:219
#6  0xb6b36730 in ?? () from /lib/libc.so.6
#7  0xb6b36838 in __libc_start_main () from /lib/libc.so.6
#8  0x00428e4c in _start () at ../sysdeps/arm/start.S:105
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) disas $pc
...
   0xb6d71e20 <+2484>:  beq 0xb6d71eac <wc_Chacha_Process+2624>
   0xb6d71e24 <+2488>:  cmp r6, #16
   0xb6d71e28 <+2492>:  blt 0xb6d71e4c <wc_Chacha_Process+2528>
   0xb6d71e2c <+2496>:  vldmia  r2, {d8-d9}
=> 0xb6d71e30 <+2500>:  add r2, r2, #16
   0xb6d71e34 <+2504>:  veor    q4, q4, q0
   0xb6d71e38 <+2508>:  vstmia  r1, {d8-d9}
   0xb6d71e3c <+2512>:  add r1, r1, #16
   0xb6d71e40 <+2516>:  subs    r6, r6, #16
   0xb6d71e44 <+2520>:  vorr    q0, q1, q1
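
For illustration only: the instruction just before the reported pc is `vldmia r2, {d8-d9}`, and the message pointer in the backtrace (`m=0x6d9a3b`) is not word-aligned. VLDM requires a word-aligned address, which would be consistent with the alignment trap above. The following is a minimal sketch in portable C of an XOR-with-keystream step that tolerates unaligned buffers. The function name `xor_keystream` is hypothetical and this is not the change made in wolfSSL; it only shows the general technique of moving words through `memcpy` instead of dereferencing possibly unaligned pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR `len` bytes of `in` with keystream `ks`
 * into `out`. None of the pointers needs any particular alignment. */
static void xor_keystream(uint8_t *out, const uint8_t *in,
                          const uint8_t *ks, size_t len)
{
    size_t i = 0;

    assert(out != NULL && in != NULL && ks != NULL);

    /* Word-at-a-time path: each word is copied through a local with
     * memcpy, so the compiler cannot assume the pointers are aligned. */
    for (; i + sizeof(uint32_t) <= len; i += sizeof(uint32_t)) {
        uint32_t a, b;
        memcpy(&a, in + i, sizeof a);
        memcpy(&b, ks + i, sizeof b);
        a ^= b;
        memcpy(out + i, &a, sizeof a);
    }

    /* Byte tail for lengths that are not a multiple of four. */
    for (; i < len; i++) {
        out[i] = (uint8_t)(in[i] ^ ks[i]);
    }
}
```

Calling it with an odd input address and a 26-byte length, as in the failing test, exercises both the word path and the byte tail.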
---
SparkiDev commented 7 months ago

Hi @suihkulokki

I've put up a pull request that fixes the inline assembly code for ChaCha20: #7182

Let us know if this fixes your issue.

Also, if you see any more of these issues, please let us know and we will fix them as well.

Thanks, Sean

-- Sean Parkinson, wolfSSL Senior Software Engineer

suihkulokki commented 7 months ago

Hi @SparkiDev

Thank you for your quick response. I can confirm that with that patch I no longer get unaligned access errors when compiling with the ARM assembler.

SparkiDev commented 7 months ago

Hi @suihkulokki

Glad we could fix this quickly for you! The pull request has been merged and the fix will go out in the next release.

Please raise a new issue if you have anything more like this.

Sean

20083017 commented 2 weeks ago

@SparkiDev I have a question here: the aes-gcm-256 assembly code is not available on armv7. Is this a compiling problem?

SparkiDev commented 2 weeks ago

Hi @20083017,

When compiling for ARM, you need to enable the assembly: --enable-armasm. If the host is armv7l, the NEON code will be compiled and the AES-GCM assembly will be compiled in.

Most ARM CPUs don't support the SHA-512/SHA-3/SM3/SM4 hardware crypto instructions, so these are called out explicitly in the configuration output and must be explicitly requested.

Sean
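
One way to check what a given build actually targets is to print a few compile-time macros. The sketch below is a hedged example, not part of wolfSSL: `describe_arm_target` is a hypothetical helper, `__ARM_ARCH` and `__ARM_NEON` are the ACLE macros predefined by the compiler, and `WOLFSSL_ARMASM` is assumed here to be the define that `--enable-armasm` adds to the build. In a real wolfSSL build the generated `<wolfssl/options.h>` would need to be included first for that last macro to be visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write a one-line summary of the compile-time
 * ARM target into buf. Returns 1 if the compiler reports NEON support,
 * 0 otherwise. */
static int describe_arm_target(char *buf, size_t n)
{
    int arch = 0;
    int neon = 0;
    int armasm = 0;

    assert(buf != NULL && n > 0);

#ifdef __ARM_ARCH            /* ACLE: architecture version, e.g. 7 */
    arch = __ARM_ARCH;
#endif
#ifdef __ARM_NEON            /* ACLE: Advanced SIMD is available */
    neon = 1;
#endif
#ifdef WOLFSSL_ARMASM        /* assumption: set by --enable-armasm */
    armasm = 1;
#endif

    snprintf(buf, n, "arm_arch=%d neon=%d wolfssl_armasm=%d",
             arch, neon, armasm);
    return neon;
}
```

Compiling this with the same toolchain and CFLAGS as the library and printing the resulting line shows whether the compiler was targeting NEON at all, which helps separate a configuration problem from a missing-feature problem.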