OP-TEE / optee_os


Optee PKCS11 TA Performance #5918

Closed: embetrix closed this issue 1 year ago

embetrix commented 1 year ago

Hello, I use OP-TEE on an stm32mp157f-dk2 board. The version is 3.16.0-stm32mp, with the corresponding BSP.

All my changes are committed and built using a Yocto meta-layer: https://github.com/embetrix/meta-stm32mp15x

The OP-TEE build config is described here: https://github.com/embetrix/meta-stm32mp15x/blob/kirkstone/recipes-security/optee/optee-os-stm32mp_3.16.0.bb#L33

I enabled the PKCS11 TA, which is not enabled by default, and gave it a try:

EC Prime256 Keypair generation:

# time pkcs11-tool --keypairgen --key-type EC:prime256v1 --label "testkeyEC" --id 1  --login --usage-sign  --module /usr/lib/libckteec.so.0
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
  label:      testkeyEC
  ID:         01
  Usage:      sign
  Access:     sensitive, always sensitive, never extractable, local
Public Key Object; EC  EC_POINT 256 bits
  EC_POINT:   044104d7506303c183c36445ef2d5161a5cfe1effaeb12a7b41ef458bc27811d2ddd915518917cd385ec3572032483a6a2efbeb539f585be9d443754862716fabc609d
  EC_PARAMS:  06082a8648ce3d030107
  label:      testkeyEC
  ID:         01
  Usage:      verify
  Access:     local
real    1m 4.92s
user    0m 0.01s
sys     0m 31.37s

RSA 2048 Keypair generation:

# time pkcs11-tool --keypairgen --key-type RSA:2048 --label "testkeyRSA" --id 2  --login --usage-sign  --module /usr/lib/libckteec.so.0 
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA 
  label:      testkeyRSA
  ID:         02
  Usage:      sign
  Access:     sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
  label:      testkeyRSA
  ID:         02
  Usage:      verify
  Access:     local
real    0m 43.02s
user    0m 0.00s
sys     0m 20.82s

This takes far too long for any real-world application :-( It is also strange that the EC prime256v1 operation takes longer than RSA 2048!

For comparison, I tried the official OP-TEE build using the https://github.com/OP-TEE/manifest/blob/master/stm32mp1.xml manifest.

I got much better times!

EC Prime256 Keypair generation:

# time pkcs11-tool --keypairgen --key-type EC:prime256v1 --label "testkeyEC" --id 1  --login --usage-sign  --module /usr/lib/libckteec.so.0
D/TC:? 0 tee_ta_init_session_with_context:624 Re-open TA fd02c9da-306c-48c7-a49c-bbd827ae86ee
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
  label:      testkeyEC
  ID:         01
  Usage:      sign
  Access:     sensitive, always sensitive, never extractable, local
Public Key Object; EC  EC_POINT 256 bits
  EC_POINT:   0441045172428126d0dd3db11d2aaaaf7f7ad5fb4dddc0ad932f12145c6d42306c5a6212d71d9ab5378400c7bced1d31060b881bac7e6ebf66d88e238327920ec2f477
  EC_PARAMS:  06082a8648ce3d030107
  label:      testkeyEC
  ID:         01
  Usage:      verify
  Access:     local
D/TC:? 0 tee_ta_close_session:529 csess 0x2ffce880 id 1
D/TC:? 0 tee_ta_close_session:548 Destroy session
real    0m 4.14s
user    0m 0.00s
sys     0m 3.96s

RSA 2048 Keypair generation:

# time pkcs11-tool --keypairgen --key-type RSA:2048 --label "testkeyRSA" --id 2 --login --usage-sign  --module /usr/lib/libckteec.so.0
D/TC:? 0 tee_ta_init_session_with_context:624 Re-open TA fd02c9da-306c-48c7-a49c-bbd827ae86ee
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA 
  label:      testkeyRSA
  ID:         02
  Usage:      sign
  Access:     sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
  label:      testkeyRSA
  ID:         02
  Usage:      verify
  Access:     local
D/TC:? 0 tee_ta_close_session:529 csess 0x2ffce880 id 1
D/TC:? 0 tee_ta_close_session:548 Destroy session
real    0m 15.59s
user    0m 0.00s
sys     0m 15.43s
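
To put the two sets of measurements side by side, here is a minimal sketch that converts the `real` values reported above into seconds and prints the ratio between the ST BSP build and the upstream manifest build. The helper names `to_seconds` and `speedup` are made up for this illustration; only the four timings come from the runs above.

```shell
#!/bin/sh
# Convert a `time` value such as "1m 4.92s" into seconds.
# (to_seconds is a hypothetical helper, not part of any tool used above.)
to_seconds() {
    echo "$1" | awk '{
        m = $1; s = $2
        sub(/m$/, "", m)   # strip the trailing "m" from the minutes field
        sub(/s$/, "", s)   # strip the trailing "s" from the seconds field
        printf "%.2f\n", m * 60 + s
    }'
}

# Ratio of two durations (first / second), rounded to one decimal place.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# "real" times from the runs above: ST BSP build vs. upstream manifest build.
echo "EC prime256v1: $(speedup '1m 4.92s' '0m 4.14s')x"
echo "RSA 2048:      $(speedup '0m 43.02s' '0m 15.59s')x"
```

With the numbers above this works out to roughly 15.7x for EC prime256v1 and 2.8x for RSA 2048, so the EC case is where the two builds differ most.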

I'm stuck with the ST BSP (U-Boot, kernel) at the moment, and with the newer OP-TEE 3.20 on top of it I cannot even boot the board.

ST's latest OP-TEE release is still 3.16.0-stm32mp, so my question is: are there ways to tweak OP-TEE and remove bottlenecks to get better PKCS11 performance?

embetrix commented 1 year ago

It looks like @johndoe31415 is experiencing similar issues in https://github.com/OP-TEE/optee_os/issues/5915

embetrix commented 1 year ago

With the debug and unwind options disabled I get better performance:

CFG_TEE_CORE_DEBUG=n CFG_UNWIND=n
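
As a rough sketch of how these options could be passed to a plain `make` build of optee_os: `CFG_PKCS11_TA=y` is added here because the TA is not enabled by default, while the `PLATFORM` value and the cross-compiler prefix are assumptions that may not match the ST BSP. In a Yocto recipe the same `CFG_*` assignments would more likely be appended to the recipe's make arguments.

```shell
#!/bin/sh
# Build options discussed in this thread, collected in one place.
# PLATFORM=stm32mp1 is an assumption; the ST BSP may need further settings
# (for example a device tree selection) that are not shown here.
OPTEE_FLAGS="PLATFORM=stm32mp1 CFG_PKCS11_TA=y CFG_TEE_CORE_DEBUG=n CFG_UNWIND=n"

# The toolchain prefix is also an assumption; use the one from your SDK.
CROSS_PREFIX="arm-linux-gnueabihf-"

# Print the command instead of running it, so it can be reviewed first.
# Run the printed command from the top of an optee_os source tree.
echo "make CROSS_COMPILE=$CROSS_PREFIX $OPTEE_FLAGS"
```

The script only prints the command line, which keeps it safe to paste on a machine that has no OP-TEE source tree checked out.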

The next step is to allocate the M4 memory (which my application doesn't require) to OP-TEE:

https://wiki.st.com/stm32mpu/wiki/STM32MP15_RAM_mapping

muttalkadavul commented 11 months ago

@embetrix I am looking to enable the PKCS11 TA. Could you tell me the steps, or point me to an article that describes them?