OP-TEE / optee_os

Trusted side of the TEE
Other
1.6k stars 1.07k forks source link

Booting optee-os on rk3399 #4542

Closed ydirson closed 2 years ago

ydirson commented 3 years ago

I am trying to get op-tee 3.12 to run on a nanopi-m4 rk3399 board. So far:

Is there any particular reason not to activate the serial console on rk3399 ? Is something missing for this to work ?

ydirson commented 3 years ago

Finally solved the serial-console issue: not only it needs CFG_EARLY_CONSOLE* as listed above, but u-boot itself must be configured with CONFIG_LOG=y (not sure what impact we're supposed to have here).

So at least OP-TEE is telling things before it stops:

[20210408-160756.524438] INFO:    BL31: Initializing BL32
[20210408-160756.524448] D/TC:0   get_aslr_seed:1360 Bad fdt: -9
[20210408-160756.524451] D/TC:0   add_phys_mem:557 TEE_SHMEM_START type NSEC_SHM 0x32000000 size 0x00400000
[20210408-160756.524453] D/TC:0   add_phys_mem:557 TA_RAM_START type TA_RAM 0x30200000 size 0x01e00000
[20210408-160756.524455] D/TC:0   add_phys_mem:557 VCORE_UNPG_RW_PA type TEE_RAM_RW 0x3005f000 size 0x001a1000
[20210408-160756.529154] D/TC:0   add_phys_mem:557 VCORE_UNPG_RX_PA type TEE_RAM_RX 0x30000000 size 0x0005f000
[20210408-160756.529189] D/TC:0   add_phys_mem:557 ROUNDDOWN((0xF8000000 + 0x06E00000), CORE_MMU_PGDIR_SIZE) type IO_SEC 0xfee00000 size 0x00200000
[20210408-160756.529196] D/TC:0   add_phys_mem:557 ROUNDDOWN((0xF8000000 + 0x071A0000), CORE_MMU_PGDIR_SIZE) type IO_NSEC 0xff000000 size 0x00200000
[20210408-160756.529202] D/TC:0   add_phys_mem:557 ROUNDDOWN((0xF8000000 + 0x07330000), CORE_MMU_PGDIR_SIZE) type IO_SEC 0xff200000 size 0x00200000
[20210408-160756.529208] D/TC:0   add_va_space:597 type RES_VASPACE size 0x00a00000
[20210408-160756.534077] D/TC:0   add_va_space:597 type SHM_VASPACE size 0x02000000
[20210408-160756.534112] D/TC:0   dump_mmap_table:704 type IO_SEC       va 0x2a800000..0x2a9fffff pa 0xff200000..0xff3fffff size 0x00200000 (pgdir)
[20210408-160756.534118] D/TC:0   dump_mmap_table:704 type IO_NSEC      va 0x2aa00000..0x2abfffff pa 0xff000000..0xff1fffff size 0x00200000 (pgdir)
[20210408-160756.534125] D/TC:0   dump_mmap_table:704 type IO_SEC       va 0x2ac00000..0x2adfffff pa 0xfee00000..0xfeffffff size 0x00200000 (pgdir)
[20210408-160756.534131] D/TC:0   dump_mmap_table:704 type NSEC_SHM     va 0x2ae00000..0x2b1fffff pa 0x32000000..0x323fffff size 0x00400000 (pgdir)
[20210408-160756.538260] D/TC:0   dump_mmap_table:704 type TA_RAM       va 0x2b200000..0x2cffffff pa 0x30200000..0x31ffffff size 0x01e00000 (pgdir)
[20210408-160756.538282] D/TC:0   dump_mmap_table:704 type RES_VASPACE  va 0x2d200000..0x2dbfffff pa 0x00000000..0x009fffff size 0x00a00000 (pgdir)
[20210408-160756.538285] D/TC:0   dump_mmap_table:704 type SHM_VASPACE  va 0x2de00000..0x2fdfffff pa 0x00000000..0x01ffffff size 0x02000000 (pgdir)
[20210408-160756.538288] D/TC:0   dump_mmap_table:704 type TEE_RAM_RX   va 0x30000000..0x3005efff pa 0x30000000..0x3005efff size 0x0005f000 (smallpg)
[20210408-160756.557795] D/TC:0   dump_mmap_table:704 type TEE_RAM_RW   va 0x3005f000..0x301fffff pa 0x3005f000..0x301fffff size 0x001a1000 (smallpg)
[20210408-160756.557817] D/TC:0   core_mmu_entry_to_finer_grained:750 xlat tables used 1 / 5
[20210408-160756.557820] D/TC:0   core_mmu_entry_to_finer_grained:750 xlat tables used 2 / 5
jforissier commented 3 years ago

not only it needs CFG_EARLY_CONSOLE* as listed above,

Regarding this particular point, feel free to make a pull request so that this platform gets a working console by default.

ydirson commented 3 years ago

After looking at where the last lines come from and what's coming next, I disabled CFG_CORE_ASLR, and op-tee now goes a bit further, but hits a new problem:

[20210408-174215.345641] D/TC:0   core_mmu_entry_to_finer_grained:750 xlat tables used 1 / 5
[20210408-174215.345666] D/TC:0   core_mmu_entry_to_finer_grained:750 xlat tables used 2 / 5
[20210408-174215.353331] I/TC:
[20210408-174215.353337] I/TC: No non-secure external DT
[20210408-174215.353339] D/TC:0 0 get_console_node_from_dt:70 No console directive from DTB
[20210408-174215.353341] I/TC: OP-TEE version: 3.12.0-dev (gcc version 10.2.0 (GCC)) #1 Wed 20 Jan 2021 05:48:48 PM UTC aarch64
[20210408-174215.353344] I/TC: Primary CPU initializing
[20210408-174215.353346] D/TC:0 0 call_initcalls:21 level 1 register_time_source()
[20210408-174215.353349] D/TC:0 0 call_initcalls:21 level 1 teecore_init_pub_ram()
[20210408-174215.357225] D/TC:0 0 call_initcalls:21 level 3 check_ta_store()
[20210408-174215.357248] D/TC:0 0 check_ta_store:418 TA store: "Secure Storage TA"
[20210408-174215.357253] D/TC:0 0 check_ta_store:418 TA store: "REE"
[20210408-174215.357257] D/TC:0 0 call_initcalls:21 level 3 init_user_ta()
[20210408-174215.357261] D/TC:0 0 call_initcalls:21 level 3 verify_pseudo_tas_conformance()
[20210408-174215.357265] D/TC:0 0 call_initcalls:21 level 3 mobj_mapped_shm_init()
[20210408-174215.357269] D/TC:0 0 mobj_mapped_shm_init:433 Shared memory address range: 2de00000, 2fe00000
[20210408-174215.357274] D/TC:0 0 call_initcalls:21 level 3 platform_init()
[20210408-174215.357277] D/TC:0 0 platform_secure_ddr_region:35 protecting region 1: 0x30000000-0x32000000
[20210408-174215.361187] D/TC:0 0 call_initcalls:21 level 3 show_conf()
[20210408-174215.361218] I/TC: Contents of conf.mk (decode with 'base64 -d | xz -d'):
[20210408-174215.361223] /Td6WFoAAATm1rRGAgAhARYAAAB0L+Wj4BCiBe9dACCUhGjjD2/tNLPVbxgCekUJxzz3aaN8r/uaWE8UCFHGWfxzA3HwCh0eE25M
[20210408-174215.361229] pFDBtt3JvHalLU4vEZsrcf8VpjphvhzDHjhafVWHwYlWSEqUKyamj/q85f8NeiAVwycWe7vRQHfGsv/sYnI+XKy3x4Lp4GR/MTkT
[20210408-174215.361235] ISw+QZ0BxNxJHmisT5/7roqP1FUJOmY1wTcM/hE4U9XCy8IfLTlbAZ7Mcg2v2vIgie0oCQHsknjB3UClJgW82WJ8E4JiDAssDjp2
[20210408-174215.361241] jtamCjAD4WWnKFtQHrQ77v1oV5ai2++dW0Ai7CRel0tfryuoV6/WWxHd1Rj1Goqc2Qn9FEKZNPB5PewcnMZCojUKFPzgFtUfs+bk
[20210408-174215.364990] DYY2BuokFYc0ZwB7uoJsuLxnkRgThZ4f6TKRwIL63s2fQWlmlsSW/axBxNRGKr5VDkjb1Szg6MmTTcJQCSAbSWHhr5pruLCuLMRm
[20210408-174215.365010] yF2frDNwIdUqN7raRwJHw1M11z2Gmx4BIQvASJxVo0a3FEs4JMXRWksAVqaGUjIVl4i3vS/uAQ7W8BCXuCp67NSvRPluxJ7cxbd7
[20210408-174215.365013] TTAHYhGoQowxT2AUlSL5hImVFyAUw7BaeY4JdlSKDu/nXcHd9+xnaWKHJWSArzr+J8Jf9+ywhDVowsl5/2Mhs1MAX16j19S+hF9p
[20210408-174215.365015] RrJccwf4bXNavE6nMiojwWNnDt4WgxRYKIHmvdLuViRR7fapoRjP6ajR5C5dD0KHvwet4tq4fWJohU1Jzd/fjygSHKyFKXgXX/6C
[20210408-174215.369006] Jai7QvU7/P+O4FbtQtcPkB7SHZUJuSieEmhlcselLY2ExbbgViAd5gB944V5cibj5R6zBT3vx/EVPrZDneLhmwJoNKEsYmPKOCxI
[20210408-174215.369045] ntVGxVsfXdffwVHJMqKb66jcv7Xzmv7hzAnCNfvVh2JMqT5nCyMLogct9oMuj2rVlRrEsIsN0h4YOKGHokYJHMgcOBWhqA1vfdSq
[20210408-174215.369053] nwq5rdurt5munJwNhylYQffjIQdbTEPFtdInhOC+CmVKfckPIvaIKpxEmN5j0i8ZQ03kVXKMAmstfD72yosHAHYLtuMbd4Sf7vhc
[20210408-174215.369061] Jf6JtuR0dm3sW6fqDfjbsDLUzJTaosGi9dr4wvbx9cZEkDJ06ldZX5vNhJZF0DnsFKx5NxNKXwlV5kTYx6Xh7arBddJ8BIG92inx
[20210408-174215.369069] 7kwgHQlJowi7TDuJGqxDcgpb0+SFfBGlzVBQTEC/9y3+tyiCGatyF8mfnfVR9Q3PSEbuB8GxsCP4IApACQZR6hC5q2ovK2GmNEFV
[20210408-174215.372760] k63HZsQMMgrIeGxMATxIo3OySx+u/ecqaqijvW+J2l2X1I2gkNKG7N15Zd1w4xzteoXhRnCpWCwKWk0Q7nQsACipUR0drQY+epn1
[20210408-174215.372797] 5nyeAfHSMY3vHCpj9fcFo5WKdLfGioiuQf2DauazIJ87nnn20qahD4YD4R7nmSKtGeF73n1Nsd2FLETc4ltlDycR00ONhzhwjfl0
[20210408-174215.372807] x3useMOmgfs/++zrZVLzTmFUsDf/Wd8rbt2shSUkwyo+lbRF+Ih27+RJgrDQJofgeHQOdP863CAQsjxauyr65wiwMOaEZ78KoRjM
[20210408-174215.372817] j6XqYEjm8eq38vHjs/yljseNza/WQkzPSj8AXnwyPsUNdrBpXJaa7W3C7PwuzZhoAH5wDOMer4XaoRspaOfdod09roZJfkhzQsql
[20210408-174215.372826] owTGYN3C4wfiiq3fLBxB+IyE3G8SricjCfrCKNLelF+AMAuf4vgGWsaobAtWyEUJQ1SleuD8DylCq5Wuy/Q4CdcpD7k8NM3JT0WN
[20210408-174215.377238] EPQHjM5QE2aTfJPwiPe1wy9YaKc0wQ1GkEWJSgzTBnI4XqTTsEfUOvBkKmFRbEdT344yhZYRYgJkOU6NotILugbUa9snJmCBPy0E
[20210408-174215.377265] TcwfGfrJ8RsEkj38wrCHtFwzEh5IlDvR8IkM+OOoUgaBT0k0IF1rynZlmRnm2r6sh950ma6lqXgZT8g+WUpWtoFC3xWSC6r/Qxt/
[20210408-174215.377269] XV7qI7qhhRW9OgNv3yfZH6TQa51/KxBtsRGu9KDFnyyQbBwzW+Ung5NcsYTOqIr/JAAAAJPnQZc77ElcAAGLDKMhAAD7R4MUscRn
[20210408-174215.377272] +wIAAAAABFla
[20210408-174215.377275] D/TC:0 0 call_initcalls:21 level 3 tee_cryp_init()
[20210408-174215.377278] D/TC:0 0 call_initcalls:21 level 4 tee_fs_init_key_manager()
[20210408-174215.377281] D/TC:0 0 call_initcalls:21 level 6 mobj_init()
[20210408-174215.381449] D/TC:0 0 call_initcalls:21 level 6 default_mobj_init()
[20210408-174215.381470] D/TC:0 0 call_finalcalls:40 level 1 release_external_dt()
[20210408-174215.381474] I/TC: Primary CPU switching to normal world boot
[20210408-174215.381477] E/TC:0 0
[20210408-174215.381480] E/TC:0 0 Core data-abort at address 0x30200000 (translation fault)
[20210408-174215.381483] E/TC:0 0  esr 0x96000146  ttbr0 0x30091000   ttbr1 0x00000000   cidr 0x0
[20210408-174215.381485] E/TC:0 0  cpu #0          cpsr 0x800003c4
[20210408-174215.381488] E/TC:0 0  x0  0000000030200000 x1  00000000680a0d80
[20210408-174215.381491] E/TC:0 0  x2  0000000000000040 x3  000000000000003f
[20210408-174215.381495] E/TC:0 0  x4  0000000000000001 x5  0000000000000000
[20210408-174215.385353] E/TC:0 0  x6  ffffffffffffffff x7  000000003009e1e0
[20210408-174215.385374] E/TC:0 0  x8  0000000000000020 x9  000000003009e1e8
[20210408-174215.385378] E/TC:0 0  x10 00000000300532dc x11 0000000000000000
[20210408-174215.385380] E/TC:0 0  x12 000000002a77dba1 x13 0000000030055247
[20210408-174215.385384] E/TC:0 0  x14 0000000000000020 x15 0000000000000000
[20210408-174215.385387] E/TC:0 0  x16 0000000030016808 x17 0000000000000000
[20210408-174215.385389] E/TC:0 0  x18 0000000000000000 x19 0000000000000000
[20210408-174215.385392] E/TC:0 0  x20 0000000000000000 x21 0000000000000000
[20210408-174215.385395] E/TC:0 0  x22 0000000000000000 x23 0000000000000000
[20210408-174215.389297] E/TC:0 0  x24 0000000000000000 x25 0000000000000000
[20210408-174215.389320] E/TC:0 0  x26 0000000000000000 x27 0000000000000000
[20210408-174215.389322] E/TC:0 0  x28 0000000000000000 x29 0000000000000000
[20210408-174215.389326] E/TC:0 0  x30 0000000030000120 elr 0000000030005e5c
[20210408-174215.389329] E/TC:0 0  sp_el0 000000003009e430
[20210408-174215.389332] E/TC:0 0 TEE load address @ 0x30000000
[20210408-174215.389335] E/TC:0 0 Call stack:
[20210408-174215.389338] E/TC:0 0  0x0000000030005e5c
[20210408-174215.389340] E/TC:0 0 Panic 'unhandled pageable abort' at core/arch/arm/kernel/abort.c:540 <abort_handler>
[20210408-174215.389343] E/TC:0 0 TEE load address @ 0x30000000
[20210408-174215.405687] E/TC:0 0 Call stack:
[20210408-174215.405717] E/TC:0 0  0x000000003000b2b4
[20210408-174215.405720] E/TC:0 0  0x00000000300180b0
[20210408-174215.405722] E/TC:0 0  0x0000000030009198

Looks like there is something wrong in the MMU configuration, and that it just shows differently in those two configurations ? At first sight, the high ranges in mmap_table do not seem to match something reasonable in the rk3399 address mapping (0xff200000 is where SPI4 finishes and SPI5 starts, and 0xff3fffff is in the middle of a 192KB reserved range).

OTOH the particular access is in the middle of the RAM that looks like just got protected for the secure world usage. Could that be an access from normal world already ?

jforissier commented 3 years ago

Does the call stack or panic call stack give a clue? (use scripts/symbolize.py -d path/to/tee.elf to decode it).

ydirson commented 3 years ago

scripts/symbolize.py -d path/to/tee.elf

E/TC:0 0 Call stack:
E/TC:0 0  0x0000000030005e5c dcache_cleaninv_range at core/arch/arm/kernel/cache_helpers_a64.S:47
E/TC:0 0 Panic 'unhandled pageable abort' at core/arch/arm/kernel/abort.c:540 <abort_handler>
E/TC:0 0 TEE load address @ 0x30000000
E/TC:0 0 Call stack:
E/TC:0 0  0x000000003000b2b4 print_kernel_stack at /usr/src/debug/optee-os/3.12.0+gitAUTOINC+3d47a131bc-r0/git/core/arch/arm/kernel/unwind_arm64.c:79
E/TC:0 0  0x00000000300180b0 __do_panic at /usr/src/debug/optee-os/3.12.0+gitAUTOINC+3d47a131bc-r0/git/core/kernel/panic.c:24
E/TC:0 0  0x0000000030009198 get_fault_type at /usr/src/debug/optee-os/3.12.0+gitAUTOINC+3d47a131bc-r0/git/core/arch/arm/kernel/abort.c:463

that line is panic("[abort] undefined abort (trap CPU)");, we don't see who brought us there

jforissier commented 3 years ago

Yes the panic stack is obviously not helpful since we know it was called due to a data abort -- sorry for implying it might be.

What about this line:

E/TC:0 0  0x0000000030005e5c dcache_cleaninv_range at core/arch/arm/kernel/cache_helpers_a64.S:47

Could it be core/arch/arm/kernel/entry_a64.S line 232? Perhaps there is something wrong with cached_mem_end? What if you comment out the call to dcache_cleaninv_range? I'm speculating here...

ydirson commented 3 years ago

Could it be core/arch/arm/kernel/entry_a64.S line 232? Perhaps there is something wrong with cached_mem_end? What if you comment out the call to dcache_cleaninv_range? I'm speculating here...

Nice guess, it seems :) Commenting out that line allows the boot to proceed:

[20210408-220933.587231] I/TC: Primary CPU switching to normal world boot
[20210408-220933.605928] INFO:    BL31: Preparing for EL3 exit to normal world
[20210408-220933.605959] INFO:    Entry point address = 0x200000
[20210408-220933.605965] INFO:    SPSR = 0x3c9
[20210408-220933.605971] VERBOSE: Argument #0 = 0x0
[20210408-220933.605975] VERBOSE: Argument #1 = 0x0
[20210408-220933.605980] VERBOSE: Argument #2 = 0x0
[20210408-220933.605985] VERBOSE: Argument #3 = 0x0
[20210408-220933.605990] VERBOSE: Argument #4 = 0x0
[20210408-220933.605995] VERBOSE: Argument #5 = 0x0
[20210408-220933.606000] VERBOSE: Argument #6 = 0x0
[20210408-220933.606004] VERBOSE: Argument #7 = 0x0
[20210408-220933.606009] ns16550_serial serial@ff1a0000: pinctrl_select_state_full: uclass_get_device_by_phandle_id: err=-19
[20210408-220934.448253]
[20210408-220934.448273]
[20210408-220934.448276] U-Boot 2021.01 (Jan 11 2021 - 18:11:43 +0000)
...
[20210408-220939.265962] Starting kernel ...
[20210408-220939.265969]
[20210408-220939.265973] I/TC: Secondary CPU 1 initializing
[20210408-220939.690743] I/TC: Secondary CPU 1 switching to normal world boot
[20210408-220939.690768] I/TC: Secondary CPU 2 initializing
[20210408-220939.690773] I/TC: Secondary CPU 2 switching to normal world boot
[20210408-220939.690777] I/TC: Secondary CPU 3 initializing
[20210408-220939.690781] I/TC: Secondary CPU 3 switching to normal world boot
[20210408-220939.690785] I/TC: Secondary CPU 4 initializing
[20210408-220939.690789] D/TC:4   select_vector:1126 SMCCC_ARCH_WORKAROUND_1 (0x80008000) available
[20210408-220939.690793] D/TC:4   select_vector:1128 SMC Workaround for CVE-2017-5715 used
[20210408-220939.690797] I/TC: Secondary CPU 4 switching to normal world boot
[20210408-220939.708952] I/TC: Secondary CPU 5 initializing
[20210408-220939.708976] D/TC:5   select_vector:1126 SMCCC_ARCH_WORKAROUND_1 (0x80008000) available
[20210408-220939.708982] D/TC:5   select_vector:1128 SMC Workaround for CVE-2017-5715 used
[20210408-220939.708986] I/TC: Secondary CPU 5 switching to normal world boot
[20210408-220939.708990] [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[20210408-220939.774592] [    0.000000] Linux version 5.10.21-rockchip (oe-user@oe-host) (aarch64-poky-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.36.1.20210209) #1 SMP PREEMPT Tue Mar 9 04:04:38 UTC 2021
...

The normal world seems to work correctly.

ydirson commented 3 years ago

Note that does not seem to help if re-enable ASLR.

lxxstone commented 3 years ago

4496 might be helpful.

jforissier commented 3 years ago

4496 might be helpful.

4496 is closed; how did you finally fix the issue? Will you open a PR?

lxxstone commented 3 years ago

One is to set TEE_LOAD_ADDR and TEE_RAM_START to the same value in platform_config.h, And the other is to replace TEE_RAM_START with __text_start in the entry_axx. s relocate function.

ydirson commented 3 years ago

It seems that in the rk3399 case TEE_LOAD_ADDR is not defined either, so core_mmu.h is already supposed to set it to TEE_RAM_START: without too much surprise setting it explicitly in platform_config.h does not change things.

If I try to eplace TEE_RAM_START with __text_start in the entry_a64.S in relocate, the assembler complains about mov_imm x1, __text_startsaying Error: non-constant expression in ".if" statement. @lxxstone on which optee version and with which toolchain were you able to make this change ?

lxxstone commented 3 years ago
       adr_l   x1, __text_start
       //mov_imm       x1, TEE_RAM_START
ydirson commented 3 years ago
       adr_l   x1, __text_start
       //mov_imm       x1, TEE_RAM_START

Ah ah, should have grabbed a coffee before touching asm after all these years ;)

Unfortunately it does not fix my issue.

lxxstone commented 3 years ago

You can have a drink now. Where are your logs now?

ydirson commented 3 years ago

not only it needs CFG_EARLY_CONSOLE* as listed above,

Regarding this particular point, feel free to make a pull request so that this platform gets a working console by default.

See https://github.com/OP-TEE/optee_os/pull/4551

ydirson commented 3 years ago

Where are your logs now?

Quite similar to what they were:

[20210412-110554.287829] INFO:    BL31: Initializing BL32
[20210412-110554.287832] INFO:    opteed: entering opteed_init
[20210412-110554.287834] INFO:    opteed: optee_entry_point = 0x6e130
[20210412-110554.287837] INFO:    opteed: entering OP-TEE
[20210412-110554.287839] D/TC:0   get_aslr_seed:1360 Bad fdt: -9
[20210412-110554.287841] D/TC:0   add_phys_mem:557 TEE_SHMEM_START type NSEC_SHM 0x32000000 size 0x00400000
[20210412-110554.287844] D/TC:0   add_phys_mem:557 TA_RAM_START type TA_RAM 0x30200000 size 0x01e00000
[20210412-110554.287846] D/TC:0   add_phys_mem:557 VCORE_UNPG_RW_PA type TEE_RAM_RW 0x3005f000 size 0x001a1000
[20210412-110554.292399] D/TC:0   add_phys_mem:557 VCORE_UNPG_RX_PA type TEE_RAM_RX 0x30000000 size 0x0005f000
[20210412-110554.292418] D/TC:0   add_phys_mem:557 ROUNDDOWN((0xF8000000 + 0x06E00000), CORE_MMU_PGDIR_SIZE) type IO_SEC 0xfee00000 size 0x00200000
[20210412-110554.292422] D/TC:0   add_phys_mem:557 ROUNDDOWN((0xF8000000 + 0x071A0000), CORE_MMU_PGDIR_SIZE) type IO_NSEC 0xff000000 size 0x00200000
[20210412-110554.292424] D/TC:0   add_phys_mem:557 ROUNDDOWN((0xF8000000 + 0x07330000), CORE_MMU_PGDIR_SIZE) type IO_SEC 0xff200000 size 0x00200000
[20210412-110554.292427] D/TC:0   add_va_space:597 type RES_VASPACE size 0x00a00000
[20210412-110554.297127] D/TC:0   add_va_space:597 type SHM_VASPACE size 0x02000000
[20210412-110554.297152] D/TC:0   dump_mmap_table:704 type IO_SEC       va 0x2a800000..0x2a9fffff pa 0xff200000..0xff3fffff size 0x00200000 (pgdir)
[20210412-110554.297156] D/TC:0   dump_mmap_table:704 type IO_NSEC      va 0x2aa00000..0x2abfffff pa 0xff000000..0xff1fffff size 0x00200000 (pgdir)
[20210412-110554.297158] D/TC:0   dump_mmap_table:704 type IO_SEC       va 0x2ac00000..0x2adfffff pa 0xfee00000..0xfeffffff size 0x00200000 (pgdir)
[20210412-110554.297161] D/TC:0   dump_mmap_table:704 type NSEC_SHM     va 0x2ae00000..0x2b1fffff pa 0x32000000..0x323fffff size 0x00400000 (pgdir)
[20210412-110554.301612] D/TC:0   dump_mmap_table:704 type TA_RAM       va 0x2b200000..0x2cffffff pa 0x30200000..0x31ffffff size 0x01e00000 (pgdir)
[20210412-110554.301632] D/TC:0   dump_mmap_table:704 type RES_VASPACE  va 0x2d200000..0x2dbfffff pa 0x00000000..0x009fffff size 0x00a00000 (pgdir)
[20210412-110554.301635] D/TC:0   dump_mmap_table:704 type SHM_VASPACE  va 0x2de00000..0x2fdfffff pa 0x00000000..0x01ffffff size 0x02000000 (pgdir)
[20210412-110554.301637] D/TC:0   dump_mmap_table:704 type TEE_RAM_RX   va 0x30000000..0x3005efff pa 0x30000000..0x3005efff size 0x0005f000 (smallpg)
[20210412-110554.321315] D/TC:0   dump_mmap_table:704 type TEE_RAM_RW   va 0x3005f000..0x301fffff pa 0x3005f000..0x301fffff size 0x001a1000 (smallpg)
[20210412-110554.321347] D/TC:0   core_mmu_entry_to_finer_grained:750 xlat tables used 1 / 5
[20210412-110554.321353] D/TC:0   core_mmu_entry_to_finer_grained:750 xlat tables used 2 / 5
lxxstone commented 3 years ago

[20210412-110554.287839] D/TC:0 get_aslr_seed:1360 Bad fdt: -9 This shows that the seed value is 0. if (IS_ENABLED(CFG_CORE_ASLR) && seed) then the judgment in init_mem_map() is false, ASLR does not take effect.

ydirson commented 3 years ago

OK, so activating KASLR seems to require that a previous boot stage fills /secure-chosen/kaslr-seed.

For real hardware it seems to be u-boot's job, but in u-boot code this seems to depend on CONFIG_ARMV8_SEC_FIRMWARE_SUPPORT, whose help seems to imply making use of that feature is not just a matter of turning it on. However there does not seem to be lots of online material describing this.

Is it normal that without it the boot does not proceed (could it be the panic() call just following ASLR handling?) ? Isn't it a problem, with optee_os defaulting to CFG_CORE_ASLR=y, and u-boot to CONFIG_ARMV8_SEC_FIRMWARE_SUPPORT off ?

jforissier commented 3 years ago

Is it normal that without it the boot does not proceed (could it be the panic() call just following ALSR handling?) ? Isn't it a problem, with optee_os defaulting to CFG_CORE_ASLR=y, and u-boot to CONFIG_ARMV8_SEC_FIRMWARE_SUPPORT off ?

No. OP-TEE should be able to boot fine with CFG_CORE_ASLR=y and no /secure-chosen/kaslr-seed in the DT (in that case the relocation offset would be zero, i.e., no relocation). Test case: the QEMU platform with QEMU 5.0 or older (prior to this patch: https://github.com/qemu/qemu/commit/60592cfed2b).

ydirson commented 3 years ago

Could it be core/arch/arm/kernel/entry_a64.S line 232? Perhaps there is something wrong with cached_mem_end? What if you comment out the call to dcache_cleaninv_range? I'm speculating here...

Nice guess, it seems :) Commenting out that line allows the boot to proceed

How can we make progress here ? I guess this change was not a fix, but just a test to help to pinpoint the real issue, right ?

jforissier commented 3 years ago

@ydirson correct. You should check the values of x0 and x1. x0 is supposed to be the VA of __text_start after relocation, so it should be the value found in the binary (grep __text_start$ out/arm/core/tee.dmp) plus the load offset which should be zero in your case since you provide no ASLR seed. x1 is a size, I would expect a few hundred K (the TEE core size plus embedded data). To print x0 and x1, if you don't have a debugger you could cause a crash and let the abort handler dump the registers. Something like this:

diff --git a/core/arch/arm/kernel/entry_a64.S b/core/arch/arm/kernel/entry_a64.S
index a6588d86..06a98806 100644
--- a/core/arch/arm/kernel/entry_a64.S
+++ b/core/arch/arm/kernel/entry_a64.S
@@ -237,6 +237,8 @@ clear_nex_bss:
        adr_l   x0, __text_start
        ldr     x1, cached_mem_end
        sub     x1, x1, x0
+       mov     x2, 0
+       ldr     x2, [x2] /* Crash */
        bl      dcache_cleaninv_range
ydirson commented 3 years ago

Damned, I can't seem to reproduce that bug any more. Sure I made a couple of build changes since I applied your patch, but even making sure the conf.mk matches the one in my post, and restoring the debug patches I had added at that time to discover the need to disable ASLR... nothing seems to bring it back.

At least enabling ASLR still blocks the boot :)

github-actions[bot] commented 3 years ago

This issue has been marked as a stale issue because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment, otherwise this issue will automatically be closed in 5 days. Note, that you can always re-open a closed issue at any time.

Dvergatal commented 3 years ago

Hi @ydirson, have you managed to solve the problem? Because i had the same problem as you with mine friefly-rk3399 and i stopped on using optee in 3.7.0 version because upper was not running for me either.

ydirson commented 3 years ago

Hi @Dvergatal,

I do not work actively on this platform these days, so I don't have anything newer to report than in last April: it worked as long as ASLR was disabled (building as a Yocto system using yocto.org's meta-rockchip).

Dvergatal commented 3 years ago

Hi @ydirson, thx for quick response. So, in fact, if ASLR is being turned off the TEE is insecure.... Even so i have to check this out because i wasn't even able to run higher version than 3.7.0 as you did. This was my issue #4197.

Found that calling of function opteed_enter_sp in opteed_synchronous_sp function in file services/spd/opteed/opteed_common.c causes the hang. And now it gets tough, because from what i can see this function is in assembler ./services/spd/opteed/opteed_helpers.S and i dunno how to debug it... INFO macro is no longer accessible.

This is what i have found in my case.

petegriffin commented 2 years ago

Hi folks,

Just fyi I just hit this same issue on v3.16 of OP-TEE and 2022.04 of U-Boot. I also find it is a transient issue that comes and goes with unrelated changes. When it was re-producing, I would get a TEE panic on TA_RAM existing to bl33 normal world (which the commenting out of dcache_inv in start.S would work around). Currently it is hanging before issuing any prints in OP-TEE. I have also seen it take ~15+ seconds before OP-TEE can issue the first console print.

In my case this issue first showed up when I disabled SPL_ATF_NO_PLATFORM_PARAM U-boot configuration issue which means U-Boot SPL will pass the DT physical address to OP-TEE, and OP-TEE will create extra memory mapping to access it and get some of its config from DT.

It appears that something with the memory configuration and mappings is wrong in the rockchip optee port and has been over numerous releases.

petegriffin commented 2 years ago

tee-core-panic.log

jforissier commented 2 years ago

Reopening since this issue has obviously not been fixed. The good news is, I found the root cause of the crash: https://github.com/u-boot/u-boot/blob/v2022.04/arch/arm/mach-rockchip/make_fit_atf.py#L214.

U-Boot uses tee.elf as the OP-TEE image, but since https://github.com/OP-TEE/optee_os/commit/5dd1570ac5b0f6563b1a9c074533a19107b8222d this does not work anymore. tee.elf should always have been considered as an intermediate file, not a loadable image, but unfortunately it wasn't clear I suppose.

The above crash happens when the TEE core tries to clean and invalidate the data cache on the boot CPU. For this it relies on embedded meta-data stored after the ELF stuff by scripts/gen_tee_bin.py. The size of the region to invalidate is a 32-bit value stored at __data_end as shown here: https://github.com/OP-TEE/optee_os/blob/3.17.0/core/arch/arm/kernel/entry_a64.S#L125-L126. Since U-Boot doesn't use tee.bin it doesn't have the embedded data and the TEE core reads a bogus value. If it happens to be small, everything works fine; otherwise if it extends past the memory mapped for OP-TEE we get a data-abort.

jforissier commented 2 years ago

U-Boot patch here: https://git.codelinaro.org/linaro/dependable-boot/meta-ts/-/blob/8b2e068d/meta-trustedsubstrate/recipes-bsp/u-boot/u-boot/0001-mach-rockchip-make_fit_atf.py-support-OP-TEE-tee.bin.patch Not yet sent upstream. https://lists.denx.de/pipermail/u-boot/2022-May/483917.html

Dvergatal commented 2 years ago

Ohhh nice to here it:D thx @jforissier

github-actions[bot] commented 2 years ago

This issue has been marked as a stale issue because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment, otherwise this issue will automatically be closed in 5 days. Note, that you can always re-open a closed issue at any time.