Mbed-TLS / mbedtls

An open source, portable, easy to use, readable and flexible TLS library, and reference implementation of the PSA Cryptography API. Releases are on a varying cadence, typically around 3 - 6 months between releases.
https://www.trustedfirmware.org/projects/mbed-tls/

Code size comparison for Mbed TLS libraries built by Arm GCC and Arm Clang #7895

Closed lpy4105 closed 1 year ago

lpy4105 commented 1 year ago

There is a gap in code size between the libraries built by Arm GCC and Arm Clang. We need to measure the gap and understand where it comes from.

The toolchain used:

Profile: TF-M Medium

lpy4105 commented 1 year ago

Commit: development (3d0c8255aabf1072547e1187b6e55c8acdb26ad1)

Build command:

For library/libmbedcrypto.a, the gaps are:

Filename armgcc armclang Delta Delta%
TOTALS 67208 71837 4629 6.89%
psa_crypto_driver_wrappers.o 1474 2002 528 35.82%
bignum.o 5893 6417 524 8.89%
psa_crypto.o 16844 17358 514 3.05%
ecp.o 6024 6504 480 7.97%
asn1write.o 1398 1668 270 19.31%
bignum_core.o 2172 2378 206 9.48%
md.o 1018 1210 192 18.86%
asn1parse.o 1076 1240 164 15.24%
ecdh.o 840 984 144 17.14%
bignum_mod_raw.o 648 772 124 19.14%
hmac_drbg.o 820 942 122 14.88%
pk.o 915 1037 122 13.33%
pk_wrap.o 443 559 116 26.19%
ctr_drbg.o 1176 1288 112 9.52%
sha256.o 1164 1276 112 9.62%
bignum_mod.o 928 1028 100 10.78%
ecdsa.o 1648 1748 100 6.07%
constant_time.o 492 590 98 19.92%
psa_crypto_hash.o 480 574 94 19.58%
ccm.o 1524 1612 88 5.77%
entropy.o 652 738 86 13.19%
psa_crypto_slot_management.o 2132 2218 86 4.03%
psa_crypto_aead.o 784 864 80 10.20%
cipher_wrap.o 484 552 68 14.05%
platform_util.o 92 158 66 71.74%
hkdf.o 385 449 64 16.62%
cipher.o 1096 1159 63 5.75%
pkwrite.o 836 892 56 6.70%
psa_crypto_ecp.o 1596 1638 42 2.63%
memory_buffer_alloc.o 748 788 40 5.35%
platform.o 56 94 38 67.86%
psa_util.o 204 236 32 15.69%
psa_crypto_client.o 124 146 22 17.74%
pkparse.o 1188 1208 20 1.68%
psa_crypto_rsa.o 108 124 16 14.81%
psa_crypto_cipher.o 96 108 12 12.50%
psa_crypto_mac.o 1020 1016 -4 -0.39%
ecp_curves.o 1112 1076 -36 -3.24%
error.o 4562 4448 -114 -2.50%
aes.o 2956 2738 -218 -7.37%
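
For anyone who wants to reproduce a table like the one above, here is a minimal sketch in Python. It assumes the per-object numbers come from the `text` column of GNU `size -t <archive>` (Berkeley format); the thread does not say which column was used, so treat that as an assumption. The function names and the sample strings are made up for illustration. The `text` values in the samples are taken from the table, and the other columns are placeholders.

```python
# Sample Berkeley-format `size` output. Only the `text` values come from
# the table in this thread; data/bss are placeholder zeros (assumption).
GCC_SIZE_OUTPUT = """\
   text    data     bss     dec     hex filename
   2956       0       0    2956     b8c aes.o (ex libmbedcrypto.a)
     92       0       0      92      5c platform_util.o (ex libmbedcrypto.a)
"""

CLANG_SIZE_OUTPUT = """\
   text    data     bss     dec     hex filename
   2738       0       0    2738     ab2 aes.o (ex libmbedcrypto.a)
    158       0       0     158      9e platform_util.o (ex libmbedcrypto.a)
"""


def parse_size_output(text):
    """Map object name -> `text` column from Berkeley-format `size` output."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have at least six fields and start with a number;
        # this skips the header row and blank lines.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        sizes[fields[5]] = int(fields[0])
    return sizes


def compare(gcc, clang):
    """Return (name, gcc, clang, delta, delta%) rows, largest delta first."""
    rows = []
    for name in gcc.keys() & clang.keys():
        delta = clang[name] - gcc[name]
        # Percentage is relative to the GCC size, as in the table above.
        pct = 100.0 * delta / gcc[name] if gcc[name] else 0.0
        rows.append((name, gcc[name], clang[name], delta, pct))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


if __name__ == "__main__":
    gcc = parse_size_output(GCC_SIZE_OUTPUT)
    clang = parse_size_output(CLANG_SIZE_OUTPUT)
    print("Filename armgcc armclang Delta Delta%")
    for name, g, c, delta, pct in compare(gcc, clang):
        print(f"{name} {g} {c} {delta} {pct:.2f}%")
```

In practice the two sample strings would be replaced by the captured output of each toolchain's `size` run on its own build of the archive.
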
lpy4105 commented 1 year ago

The actual gap is not that big. armclang always generates exception index table entries (one `.ARM.exidx.xxx` section per function, 8 bytes each), and `size` counts those bytes as part of `.text`.

It seems we cannot stop armclang from generating these entries, so I added the `-funwind-tables` option for armgcc to make it generate them too, which gives a "comparable" result. The code size comparison then becomes:

Note: With -Oz, armclang does function outlining while armgcc doesn't(?).

Filename armgcc armclang Delta Delta%
TOTALS 72684 71837 -847 -1.17%
bignum.o 6277 6417 140 2.23%
asn1write.o 1558 1668 110 7.06%
ecdh.o 936 984 48 5.13%
hkdf.o 409 449 40 9.78%
sha256.o 1240 1276 36 2.90%
psa_crypto_hash.o 540 574 34 6.30%
psa_crypto_driver_wrappers.o 1970 2002 32 1.62%
pk_wrap.o 531 559 28 5.27%
hmac_drbg.o 916 942 26 2.84%
entropy.o 716 738 22 3.07%
pkwrite.o 876 892 16 1.83%
asn1parse.o 1228 1240 12 0.98%
ecp.o 6492 6504 12 0.18%
constant_time.o 580 590 10 1.72%
psa_util.o 228 236 8 3.51%
bignum_mod_raw.o 768 772 4 0.52%
cipher_wrap.o 548 552 4 0.73%
psa_crypto_cipher.o 104 108 4 3.85%
platform_util.o 156 158 2 1.28%
ctr_drbg.o 1288 1288 0 0.00%
psa_crypto_rsa.o 124 124 0 0.00%
platform.o 96 94 -2 -2.08%
psa_crypto_client.o 148 146 -2 -1.35%
bignum_mod.o 1032 1028 -4 -0.39%
md.o 1214 1210 -4 -0.33%
psa_crypto_aead.o 872 864 -8 -0.92%
bignum_core.o 2396 2378 -18 -0.75%
psa_crypto_slot_management.o 2236 2218 -18 -0.81%
memory_buffer_alloc.o 812 788 -24 -2.96%
ecdsa.o 1776 1748 -28 -1.58%
psa_crypto_ecp.o 1668 1638 -30 -1.80%
pk.o 1071 1037 -34 -3.17%
pkparse.o 1244 1208 -36 -2.89%
ccm.o 1652 1612 -40 -2.42%
ecp_curves.o 1128 1076 -52 -4.61%
cipher.o 1216 1159 -57 -4.69%
psa_crypto_mac.o 1092 1016 -76 -6.96%
error.o 4586 4448 -138 -3.01%
aes.o 2996 2738 -258 -8.61%
psa_crypto.o 17964 17358 -606 -3.37%
lpy4105 commented 1 year ago

After removing the size of the `.ARM.exidx.xxx` sections from the armclang result, we get:

filename armgcc armclang change change%
(TOTALS) 67208 65717 -1491 -2.22%
asn1write.o 1398 1476 78 5.58%
bignum.o 5893 5961 68 1.15%
ecdh.o 840 888 48 5.71%
sha256.o 1164 1212 48 4.12%
hkdf.o 385 425 40 10.39%
pk_wrap.o 443 471 28 6.32%
entropy.o 652 666 14 2.15%
constant_time.o 492 502 10 2.03%
hmac_drbg.o 820 830 10 1.22%
md.o 1018 1026 8 0.79%
pkwrite.o 836 844 8 0.96%
psa_crypto_driver_wrappers.o 1474 1482 8 0.54%
asn1parse.o 1076 1080 4 0.37%
cipher_wrap.o 484 488 4 0.83%
psa_crypto_cipher.o 96 100 4 4.17%
platform_util.o 92 94 2 2.17%
platform.o 56 54 -2 -3.57%
psa_crypto_rsa.o 108 108 0 0.00%
psa_crypto_client.o 124 122 -2 -1.61%
psa_crypto_aead.o 784 776 -8 -1.02%
bignum_mod.o 928 916 -12 -1.29%
bignum_mod_raw.o 648 636 -12 -1.85%
ctr_drbg.o 1176 1160 -16 -1.36%
psa_crypto_hash.o 480 526 46 9.58%
psa_crypto_slot_management.o 2132 2114 -18 -0.84%
memory_buffer_alloc.o 748 724 -24 -3.21%
bignum_core.o 2172 2138 -34 -1.57%
ecdsa.o 1648 1612 -36 -2.18%
psa_crypto_ecp.o 1596 1558 -38 -2.38%
pk.o 915 885 -30 -3.28%
ecp_curves.o 1112 1060 -52 -4.68%
ccm.o 1524 1468 -56 -3.67%
cipher.o 1096 1039 -57 -5.20%
pkparse.o 1188 1128 -60 -5.05%
psa_util.o 204 212 8 3.92%
psa_crypto_mac.o 1020 944 -76 -7.45%
ecp.o 6024 5920 -104 -1.73%
aes.o 2956 2698 -258 -8.73%
error.o 4562 4424 -138 -3.02%
psa_crypto.o 16844 15950 -894 -5.31%
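
The adjustment used for the table above can be sketched as follows. This is a minimal Python example, assuming the `.ARM.exidx.*` sizes are read from `size -A <object>` (SysV format), where each line is `section size addr`; the thread does not say how the sizes were actually extracted. The function names, the object name `example.o`, and the section names `func_a` and `func_b` are hypothetical. As a sanity check on the tables here, the armclang total drops from 71837 to 65717, a difference of 6120 bytes, which at 8 bytes per entry would correspond to 765 entries.

```python
# Hypothetical SysV-format `size -A` output for one object file.
# Section names and sizes are invented for illustration only.
SYSV_SAMPLE = """\
example.o  :
section                  size   addr
.text.func_a               20      0
.ARM.exidx.text.func_a      8      0
.text.func_b               10      0
.ARM.exidx.text.func_b      8      0
Total                      46
"""


def exidx_bytes(sysv_output):
    """Sum the sizes of all .ARM.exidx* sections in `size -A` output."""
    total = 0
    for line in sysv_output.splitlines():
        fields = line.split()
        if (
            len(fields) >= 2
            and fields[0].startswith(".ARM.exidx")
            and fields[1].isdigit()
        ):
            total += int(fields[1])
    return total


def text_without_exidx(berkeley_text, sysv_output):
    """Subtract exception index bytes from a Berkeley `text` figure.

    Assumes, as described in this thread, that the Berkeley `text`
    column includes the .ARM.exidx* sections.
    """
    return berkeley_text - exidx_bytes(sysv_output)


if __name__ == "__main__":
    print("exidx bytes:", exidx_bytes(SYSV_SAMPLE))
    print("adjusted text:", text_without_exidx(46, SYSV_SAMPLE))
```

Running this per object and feeding the adjusted numbers into the comparison gives figures that are no longer skewed by the unwind entries.
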

It seems that armclang -Oz produces smaller code than arm-none-eabi-gcc -Os (I also tried arm-none-eabi-gcc -Oz, but it gave no improvement).