OP-TEE / optee_os

Trusted side of the TEE
Other
1.55k stars 1.05k forks source link

xtest pkcs11_1007 sporadic failure #6952

Open etienne-lms opened 1 month ago

etienne-lms commented 1 month ago

xtest pkcs11_1007 (requires to build with CFG_PKCS11_TA=y) sporadically fails with unexpected error code CKR_HOST_MEMORY on various platforms, including qemu_virt, qeu_armv8a, stm32mp1 (for I one i've tests). References found in Github optee_os forum: https://github.com/OP-TEE/optee_os/pull/5395#issuecomment-1178928816 (July 2022, maybe unrelated) https://github.com/OP-TEE/optee_os/pull/5359#issuecomment-1549303081 (May 2023) https://github.com/OP-TEE/optee_os/pull/6125#issuecomment-1610985000 (June 2023) https://github.com/OP-TEE/optee_os/pull/6922#issuecomment-2196445913 (June 2024) https://github.com/OP-TEE/optee_os/pull/6881#issuecomment-2200182317 (July 2024)

I open this Issue to track it. I've started to look closer at this and will post my current findings (that are still weak).

etienne-lms commented 1 month ago

I was able to reproduce the issue. Traces indeed seen show a "host memory" allocation failure, while we would expect that this test exhausts the "device memory" (aka OP-TEE core or TA memory) .

.../out-br/build/optee_test_ext-1.0/host/xtest/pkcs11_1000.c:1792: rv has an unexpected value: 0x2 = CKR_HOST_MEMORY, expected 0xb3 = CKR_SESSION_HANDLE_INVALID
.../out-br/build/optee_test_ext-1.0/host/xtest/pkcs11_1000.c:1794: rv has an unexpected value: 0x2 = CKR_HOST_MEMORY, expected 0xb3 = CKR_SESSION_HANDLE_INVALID
.../out-br/build/optee_test_ext-1.0/host/xtest/pkcs11_1000.c:1796: rv has an unexpected value: 0x2 = CKR_HOST_MEMORY, expected 0xb3 = CKR_SESSION_HANDLE_INVALID
...

They relate to _pkcs111000.c source file tests:

line 1769:  for (n = 0; n < ARRAY_SIZE(sessions); n++) {
            ...
line 1783   }
        ...

line 1791:  /* Closing session with out bound and invalid IDs (or negative ID) */
line 1792:  rv = C_CloseSession(sessions[n - 1] + 1024);
line 1793:  ADBG_EXPECT_CK_RESULT(c, CKR_SESSION_HANDLE_INVALID, rv);
line 1794:  rv = C_CloseSession(CK_INVALID_HANDLE);
line 1795:  ADBG_EXPECT_CK_RESULT(c, CKR_SESSION_HANDLE_INVALID, rv);
line 1796:  rv = C_CloseSession(~0);
line 1797:  ADBG_EXPECT_CK_RESULT(c, CKR_SESSION_HANDLE_INVALID, rv);

I saw also some unexpected error traces from OP-TEE core that are emitted in the loop (that attempts to open more than a hundred sessions) before line 1792 above is called, so I think the issue is deeper than a libckteec (PKCS#11 crypoki client lib) issue:

E/TC:1 0 std_entry_with_parg:234 Bad arg address 0xc2451000
E/TC:1 0 std_entry_with_parg:234 Bad arg address 0xc266c000

When the issue occurs, that loop stops on a "host memory" error, whereas when pkcs11_1007 test succeeds (at it most often does), the loop stops on a "device memory" error, as expected.

I'll continue to investigate. I think they are some Linux OS resources that are not properly released in xtest.

jenswi-linaro commented 1 month ago

E/TC:1 0 std_entry_with_parg:234 Bad arg address 0xc2451000

This error suggests that the normal world uses a non-secure address that OP-TEE isn't aware of.

etienne-lms commented 2 weeks ago

E/TC:1 0 std_entry_with_parg:234 Bad arg address 0xc2451000

This error suggests that the normal world uses a non-secure address that OP-TEE isn't aware of.

I looks like client is invoking OP-TEE with an shm buffer that failed to be registered, but I can't find where this can happen.

Regarding pkcs11_1007 sporadic failures, i think it comes from that fact we try to invoke OP-TEE for closing invalid session handlers (lines 1792 to 1797 shown in the comment above) while OP-TEE core heap memory is already exhausted. I'll propose a change to release some resources before we use make these attempts.