OP-TEE / optee_os

Trusted side of the TEE
Other
1.56k stars 1.05k forks source link

TA splat on reboot with fTPM #6094

Closed jan-kiszka closed 1 year ago

jan-kiszka commented 1 year ago

We are working on a full UEFI secure boot / RPMB secure storage integration on an AM65x (https://github.com/siemens/meta-iot2050), and one piece of the puzzle is an fTPM (https://github.com/Microsoft/ms-tpm-20-ref). Things are basically working, but this here is still at least annoying:

I/TC: 
I/TC: OP-TEE version: 3.21.0-176-g145953d55 (gcc version 12.2.1 20221205 (Arm GNU Toolchain 12.2.Rel1 (Build arm-12.24))) #5 Wed Jun  7 05:45:04 UTC 2023 aarch64
I/TC: Primary CPU initializing
I/TC: SYSFW ABI: 3.1 (firmware rev 0x0015 '21.9.1--v2021.09a (Terrific Lla')
I/TC: HUK Initialized
I/TC: Activated SA2UL device
I/TC: Enabled firewalls for SA2UL TRNG device
I/TC: SA2UL TRNG initialized
I/TC: SA2UL Drivers initialized
I/TC: Primary CPU switching to normal world boot
[...]
root@iot2050-debian:~# reboot
[...]
[   73.338538] systemd-shutdown[1]: Syncing filesystems and block devices.
[   73.345647] systemd-shutdown[1]: Rebooting.
E/TC:? 0 get_rpc_alloc_res:645 RPC allocation failed. Non-secure world result: ret=0xffff000c ret_origin=0x2
E/TC:? 0 get_rpc_alloc_res:645 RPC allocation failed. Non-secure world result: ret=0xffff000c ret_origin=0x2
E/TC:? 0 
E/TC:? 0 TA panicked with code 0xffff000c
E/LD:  Status of TA bc50d971-d4c9-42c4-82cb-343fb7f37896
E/LD:   arch: aarch64
E/LD:  region  0: va 0x40004000 pa 0x9ea00000 size 0x002000 flags rw-s (ldelf)
E/LD:  region  1: va 0x40006000 pa 0x9ea02000 size 0x008000 flags r-xs (ldelf)
E/LD:  region  2: va 0x4000e000 pa 0x9ea0a000 size 0x001000 flags rw-s (ldelf)
E/LD:  region  3: va 0x4000f000 pa 0x9ea0b000 size 0x004000 flags rw-s (ldelf)
E/LD:  region  4: va 0x40013000 pa 0x9ea0f000 size 0x001000 flags r--s
E/LD:  region  5: va 0x40014000 pa 0x9ea96000 size 0x011000 flags rw-s (stack)
E/LD:  region  6: va 0x40025000 pa 0x8672c000 size 0x002000 flags rw-- (param)
E/LD:  region  7: va 0x40043000 pa 0x00001000 size 0x067000 flags r-xs [0]
E/LD:  region  8: va 0x400aa000 pa 0x00068000 size 0x01f000 flags rw-s [0]
E/LD:   [0] bc50d971-d4c9-42c4-82cb-343fb7f37896 @ 0x40043000
E/LD:  Call stack:
E/LD:   0x40080f50
E/LD:   0x40043bb4
E/LD:   0x40044238
E/LD:   0x40062d5c
E/LD:   0x4007c014
E/LD:   0x40044ae4
E/LD:   0x4008709c
E/LD:   0x4007c1c4
[   73.466791] tpm tpm0: ftpm_tee_tpm_op_send: SUBMIT_COMMAND invoke error: 0xffff3024
[   73.474497] ------------[ cut here ]------------
[   73.479119] WARNING: CPU: 1 PID: 1 at drivers/char/tpm/tpm_ftpm_tee.c:135 ftpm_tee_tpm_op_send+0x200/0x25c
[   73.488750] Modules linked in: ctr ccm mt7601u mac80211 cfg80211 rfkill libarc4 cp210x usbserial ti_k3_r5_remoteproc ti_am335x_adc kfifo_buf pci_endpoint_test ti_cal videobuf2_dma_contig irq_pruss_intc v4l2_fwnode videobuf2_memops videobuf2_v4l2 videobuf2_common at24 optee_rng rng_core fuse ip_tables x_tables ipv6 icssg_prueth pru_rproc icss_iep ptp pps_core ti_am335x_tscadc pruss
[   73.522506] CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 5.10.162-cip24 #1
[   73.529624] Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2023.07-rc3-00018-g0afdaac6505 07/01/2023
[   73.539952] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[   73.545946] pc : ftpm_tee_tpm_op_send+0x200/0x25c
[   73.550640] lr : ftpm_tee_tpm_op_send+0x200/0x25c
[   73.555331] sp : ffff80001129baa0
[   73.558635] x29: ffff80001129baa0 x28: ffff00000646f000 
[   73.563938] x27: ffff8000110f7000 x26: 0000000000000016 
[   73.569241] x25: 0000000000000145 x24: ffff000005395000 
[   73.574544] x23: ffff0000065a7280 x22: ffff00000646f000 
[   73.579847] x21: ffff000006422080 x20: 000000000000000c 
[   73.585149] x19: 0000000000000000 x18: 0000000000000000 
[   73.590450] x17: 0000000000000000 x16: 0000000000000000 
[   73.595753] x15: 0000000000000030 x14: ffffffffffffffff 
[   73.601055] x13: ffff80001110e838 x12: 00000000000006d2 
[   73.606357] x11: 0000000000000246 x10: ffff800011166838 
[   73.611659] x9 : 00000000fffff000 x8 : ffff80001110e838 
[   73.616962] x7 : ffff800011166838 x6 : 0000000000000000 
[   73.622263] x5 : 0000000000000000 x4 : 0000000000000000 
[   73.627565] x3 : 0000000000000000 x2 : 0000000000000000 
[   73.632867] x1 : 0000000000000000 x0 : ffff0000000e8000 
[   73.638170] Call trace:
[   73.640610]  ftpm_tee_tpm_op_send+0x200/0x25c
[   73.644960]  tpm_transmit+0xc8/0x33c
[   73.648528]  tpm_transmit_cmd+0x30/0xc0
[   73.652353]  tpm2_shutdown+0xa4/0x100
[   73.656007]  tpm_class_shutdown+0x60/0x90
[   73.660009]  device_shutdown+0x138/0x330
[   73.663926]  __do_sys_reboot+0x218/0x2a0
[   73.667839]  __arm64_sys_reboot+0x24/0x30
[   73.671842]  el0_svc_common.constprop.0+0x78/0x1c4
[   73.676622]  do_el0_svc+0x24/0x8c
[   73.679932]  el0_svc+0x14/0x20
[   73.682978]  el0_sync_handler+0xb0/0xb4
[   73.686806]  el0_sync+0x180/0x1c0
[   73.690111] ---[ end trace 73b86edbd18826c6 ]---
[   73.694755] tpm tpm0: tpm_try_transmit: send(): error -53212
[   73.701331] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   73.740602] reboot: Restarting system

(I've added a WARN_ON to ftpm_tee_tpm_op_send's error path)

Any ideas? Where to debug this, in optee-os or rather the fTPM TA? I don't think the kernel is to blame (I've already checked the ftpm commits since 5.10 for anything suspicious).

jforissier commented 1 year ago

Hi @jan-kiszka,

It looks like the kernel is trying to do some TPM operation but tee-supplicant is down already (and tee-supplicant is needed by OP-TEE to access RPMB). We've encountered this on RockPi4 already. It was addressed by making sure the ftpm driver is unloaded before the supplicant is stopped, see https://gitlab.com/Linaro/trustedsubstrate/meta-ledge-secure/-/blob/kirkstone/meta-ledge-secure/recipes-security/optee/optee-client_%25.bbappend#L14. Ugly, but works.

A much better option would be to remove the dependency on tee-supplicant for RPMB access (i.e., have the optee driver directly call kernel APIs to perform RPMB). See this discussion: https://lore.kernel.org/all/CAC_iWjLOhUvp5ggCCkHN5MRNfB_h6FZ2Z14yrtR3aqGn0Ovxig@mail.gmail.com/.

HTH.

jan-kiszka commented 1 year ago

OK, getting tee-supplicant daemon out of the way would also resolve the startup issues or the need to push it into initrd (see my post to LKML from yesterday).

We are currently compiling fTPM into the kernel, maybe worth to try avoiding that for now until that other issue is resolved.

jan-kiszka commented 1 year ago

Oh, follow-up question: Why are we getting those TA panics, rather than a normal error return to normal world? I'm also seeing that with avb triggered from u-boot (optee_rpmb read_pvalue somevalue 0) - also ugly.

jforissier commented 1 year ago

Oh, follow-up question: Why are we getting those TA panics, rather than a normal error return to normal world? I'm also seeing that with avb triggered from u-boot (optee_rpmb read_pvalue somevalue 0) - also ugly.

Because the TA receives TEE_ERROR_OUT_OF_MEMORY during a secure storage operation (RPC buffers that are normally allocated by tee-supplicant). I don't know if the panic is called by the TA itself or in the GP API though (libutee). Could you symbolize the TA call stack? optee_os/scripts/symbolize.py -d path/to/TA/elf (the path where bc50d971-d4c9-42c4-82cb-343fb7f37896.elf can be found) then paste the E/... lines.

jan-kiszka commented 1 year ago

Haven't done that for the fTPM yet (looking for the build folder...) but for the invalid call (pvalue size 0) to the in-tree avb:

E/TC:? 0 
E/TC:? 0 TA panicked with code 0xffff0001 (TEE_ERROR_ACCESS_DENIED)
E/LD:  Status of TA 023f8f1a-292a-432b-8fc4-de8471358067
E/LD:   arch: aarch64
E/LD:  region  0: va 0x40004000 pa 0x9ea00000 size 0x002000 flags rw-s (ldelf)
E/LD:  region  1: va 0x40006000 pa 0x9ea02000 size 0x008000 flags r-xs (ldelf)
E/LD:  region  2: va 0x4000e000 pa 0x9ea0a000 size 0x001000 flags rw-s (ldelf)
E/LD:  region  3: va 0x4000f000 pa 0x9ea0b000 size 0x004000 flags rw-s (ldelf)
E/LD:  region  4: va 0x40013000 pa 0x9ea0f000 size 0x001000 flags r--s
E/LD:  region  5: va 0x40014000 pa 0x9ea2c000 size 0x005000 flags rw-s (stack)
E/LD:  region  6: va 0x40019000 pa 0xfdf10cb0 size 0x001000 flags rw-- (param)
E/LD:  region  7: va 0x4001a000 pa 0xfdf10c50 size 0x001000 flags rw-- (param)
E/LD:  region  8: va 0x40070000 pa 0x00001000 size 0x014000 flags r-xs [0] .ta_head .text .eh_frame .rodata .gnu.hash .dynsym .dynstr .hash .rela.dyn
E/LD:  region  9: va 0x40084000 pa 0x00015000 size 0x008000 flags rw-s [0] .dynamic .got .rela.got .data .bss
E/LD:   [0] 023f8f1a-292a-432b-8fc4-de8471358067 @ 0x40070000 (out/arm-plat-k3/ta/avb/023f8f1a-292a-432b-8fc4-de8471358067.elf)
E/LD:  Call stack:
E/LD:   0x400743e8 TEE_ReadObjectData at lib/libutee/tee_api_objects.c:915
E/LD:   0x400705f0 read_persist_value at ta/avb/entry.c:343
E/LD:   0x40075260 entry_invoke_command at lib/libutee/user_ta_entry.c:382
E/LD:   0x400707c8 __ta_entry at out/arm-plat-k3/export-ta_arm64/src/user_ta_header.c:70

Yep:

TEE_Result TEE_ReadObjectData(TEE_ObjectHandle object, void *buffer,
                  size_t size, size_t *count)
{
[...]
out:
    if (res != TEE_SUCCESS &&
        res != TEE_ERROR_CORRUPT_OBJECT &&
        res != TEE_ERROR_STORAGE_NOT_AVAILABLE)
        TEE_Panic(res);

Well...

jan-kiszka commented 1 year ago

OK, applying that rmmod workaround to tee-supplicant.service of Debian works around the original issue here as well. Thanks for that, but I hope we have a better solution until Debian 13 is out :wink:.

github-actions[bot] commented 1 year ago

This issue has been marked as a stale issue because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment, otherwise this issue will automatically be closed in 5 days. Note, that you can always re-open a closed issue at any time.

jan-kiszka commented 1 year ago

Issue is understood now, workarounds exist, but the proper solution is moving tee-supplicant into the kernel.