zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.8k stars 6.58k forks source link

Architecture Limitations for llext on nRF9160-DK #80170

Open SunnyVibrant opened 1 week ago

SunnyVibrant commented 1 week ago

Describe the bug

We are using the nRF9160-DK (ARM platform) and are running the llext examples. However, we encounter a NULL return when calling llext_find_sym(). Could you please confirm if there are any architecture limitations or known issues related to using llext on the nRF9160-DK?

We are using the module below to verify the llext functionality: https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/subsys/llext/modules

Here are the configurations we tried:

We commented the printk statement, and a crash is observed. Logs are provided in the link below: Crash logs: https://drive.google.com/file/d/1TWCxfk2OyxBhDFBD4J0x43b8Su7x49Sn/view?usp=sharing

Code reference: https://github.com/zephyrproject-rtos/zephyr/blob/main/samples/subsys/llext/modules/src/hello_world_ext.c#L24

Thank you for your support!

To Reproduce Steps to reproduce the behavior:

  1. Clone the repository and navigate to samples/subsys/llext/modules
  2. Modify the configuration as described above
  3. Run the project on the nRF9160-DK
  4. Observe the crash and the NULL return from llext_find_sym()

Expected behavior llext_find_sym() should not return NULL when using llext on the nRF9160-DK, and the example should run without crashing.

Impact This issue is a showstopper as it prevents us from verifying the llext functionality on the nRF9160-DK.

Logs and console output Crash logs: https://drive.google.com/file/d/1TWCxfk2OyxBhDFBD4J0x43b8Su7x49Sn/view?usp=sharing Logs https://drive.google.com/file/d/1TWCxfk2OyxBhDFBD4J0x43b8Su7x49Sn/view?usp=drive_link

Environment (please complete the following information):

github-actions[bot] commented 1 week ago

Hi @SunnyVibrant! We appreciate you submitting your first issue for our open-source project. 🌟

Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙

henrikbrixandersen commented 1 week ago

Can you reproduce this issue on current main branch?

SunnyVibrant commented 1 week ago

Can you reproduce this issue on current main branch? @henrikbrixandersen @teburd Thank you for your prompt response, your support is well appreciated!! Can you please confirm llext is supported in nRF9160 platform??

Yes, we can reproduce llext issue in the main branch.

Zephyr Project LLEXT Module Testing, I also have attached the logs for reference. Logs_github.txt

We took the code from the main branch of the Zephyr repository: Zephyr Main Branch We built the LLEXT module from this sample: LLEXT Module

Case 1: Basic Build, no changes After flashing the build, we observed the following logs from the device: Booting nRF Connect SDK v2.7.99-cs2-d73b286ff6d9 Using Zephyr OS v3.6.99-3d01dcc251bf [00:00:00.393,615] app: Calling hello world as a builtin Hello, world, from the main binary!

Case 2: Adding a Build Flag We added the following flag to the prj.conf file: CONFIG_HELLO_WORLD_MODE=m Here are the console logs from the device.

Booting nRF Connect SDK v2.7.99-cs2-d73b286ff6d9 Using Zephyr OS v3.6.99-3d01dcc251bf [00:00:00.383,514] app: Calling hello world as a builtin Hello, world, from the main binary! Booting nRF Connect SDK v2.7.99-cs2-d73b286ff6d9 Using Zephyr OS v3.6.99-3d01dcc251bf [00:00:00.393,951] app: Calling hello world as a module [00:00:00.400,146] llext: llext_load: Loading relocatable or shared elf [00:00:00.407,714] llext: do_llext_load: Finding ELF tables... [00:00:00.414,459] llext: llext_find_tables: section 0 at 2bc: name 0, type 0, flags 0, addr 0, size 0 [00:00:00.424,652] llext: llext_find_tables: section 1 at 2e4: name 31, type 1, flags 6, addr 0, size 24 [00:00:00.435,058] llext: llext_find_tables: section 2 at 30c: name 27, type 9, flags 40, addr 0, size 16 [00:00:00.445,526] llext: llext_find_tables: section 3 at 334: name 37, type 1, flags 2, addr 0, size 44 [00:00:00.455,932] llext: llext_find_tables: section 4 at 35c: name 49, type 1, flags 2, addr 0, size 8 [00:00:00.466,217] llext: llext_find_tables: section 5 at 384: name 45, type 9, flags 40, addr 0, size 16 [00:00:00.476,715] llext: llext_find_tables: section 6 at 3ac: name 94, type 1, flags 3, addr 0, size 0 [00:00:00.486,999] llext: llext_find_tables: section 7 at 3d4: name 100, type 8, flags 3, addr 0, size 0 [00:00:00.497,406] llext: llext_find_tables: section 8 at 3fc: name 105, type 1, flags 30, addr 0, size 35 [00:00:00.507,965] llext: llext_find_tables: section 9 at 424: name 114, type 1879048195, flags 0, addr 0, size 54 [00:00:00.519,226] llext: llext_find_tables: section 10 at 44c: name 1, type 2, flags 0, addr 0, size 256 [00:00:00.529,663] llext: llext_find_tables: symtab at 10 [00:00:00.536,010] llext: llext_find_tables: section 11 at 474: name 9, type 3, flags 0, addr 0, size 60 [00:00:00.546,386] llext: llext_find_tables: strtab at 11 [00:00:00.552,734] llext: llext_find_tables: section 12 at 49c: name 17, type 3, flags 0, addr 0, size 130 [00:00:00.563,262] llext: llext_find_tables: shstrtab at 12 [00:00:00.569,763] llext: do_llext_load: Allocate and copy strings... [00:00:00.577,178] llext: do_llext_load: Mapping ELF sections... [00:00:00.584,075] llext: llext_map_sections: section 0 name [00:00:00.590,728] llext: llext_map_sections: Not copied section [00:00:00.597,747] llext: llext_map_sections: section 1 name .text [00:00:00.604,858] llext: llext_map_sections: section 2 name .rel.text [00:00:00.612,274] llext: llext_map_sections: Not copied section .rel.text [00:00:00.620,086] llext: llext_map_sections: section 3 name .rodata [00:00:00.627,380] llext: llext_map_sections: section 4 name ._llext_const_symbol.static.hello_worldsym [00:00:00.637,847] llext: llext_map_sections: Not copied section ._llext_const_symbol.static.hello_worldsym [00:00:00.648,681] llext: llext_map_sections: section 5 name .rel._llext_const_symbol.static.hello_worldsym [00:00:00.659,515] llext: llext_map_sections: Not copied section .rel._llext_const_symbol.static.hello_worldsym [00:00:00.670,654] llext: llext_map_sections: section 6 name .data [00:00:00.677,764] llext: llext_map_sections: section 7 name .bss [00:00:00.684,783] llext: llext_map_sections: section 8 name .comment [00:00:00.692,138] llext: llext_map_sections: Not copied section .comment [00:00:00.699,859] llext: llext_map_sections: section 9 name .ARM.attributes [00:00:00.707,824] llext: llext_map_sections: Not copied section .ARM.attributes [00:00:00.716,156] llext: llext_map_sections: section 10 name .symtab [00:00:00.723,510] llext: llext_map_sections: Not copied section .symtab [00:00:00.731,109] llext: llext_map_sections: section 11 name .strtab [00:00:00.738,464] llext: llext_map_sections: Not copied section .strtab [00:00:00.746,093] llext: llext_map_sections: section 12 name .shstrtab [00:00:00.753,631] llext: llext_map_sections: Not copied section .shstrtab [00:00:00.761,413] llext: do_llext_load: Allocate and copy sections... [00:00:00.768,951] llext: do_llext_load: Counting exported symbols... [00:00:00.776,275] llext: llext_count_export_syms: symbol count 16 [00:00:00.783,386] llext: llext_count_export_syms: unhandled symbol 1, name , type tag 3, bind 0, sect 1 [00:00:00.793,792] llext: llext_count_export_syms: unhandled symbol 2, name , type tag 3, bind 0, sect 3 [00:00:00.804,168] llext: llext_count_export_syms: unhandled symbol 3, name , type tag 3, bind 0, sect 4 [00:00:00.814,575] llext: llext_count_export_syms: unhandled symbol 4, name , type tag 3, bind 0, sect 6 [00:00:00.824,951] llext: llext_count_export_syms: unhandled symbol 5, name , type tag 3, bind 0, sect 7 [00:00:00.835,357] llext: llext_count_export_syms: unhandled symbol 6, name , type tag 3, bind 0, sect 8 [00:00:00.845,733] llext: llext_count_export_syms: unhandled symbol 7, name , type tag 3, bind 0, sect 9 [00:00:00.856,140] llext: llext_count_export_syms: unhandled symbol 8, name hello_world_ext.c, type tag 4, bind 0, sect 65521 [00:00:00.868,316] llext: llext_count_export_syms: unhandled symbol 9, name $d, type tag 0, bind 0, sect 3 [00:00:00.878,906] llext: llext_count_export_syms: unhandled symbol 10, name $t, type tag 0, bind 0, sect 1 [00:00:00.889,526] llext: llext_count_export_syms: unhandled symbol 11, name $d, type tag 0, bind 0, sect 1 [00:00:00.900,177] llext: llext_count_export_syms: unhandled symbol 12, name $d, type tag 0, bind 0, sect 4 [00:00:00.910,827] llext: llext_count_export_syms: unhandled symbol 13, name hello_world_sym, type tag 1, bind 0, sect 4 [00:00:00.922,607] llext: llext_count_export_syms: unhandled symbol 14, name printk, type tag 0, bind 1, sect 0 [00:00:00.933,593] llext: llext_count_export_syms: function symbol 15, name hello_world, type tag 2, bind 1, sect 1 [00:00:00.944,915] llext: do_llext_load: Allocating memory for symbol table... [00:00:00.953,033] llext: do_llext_load: Copying symbols... [00:00:00.959,564] llext: llext_copy_symbols: function symbol 0 name hello_world addr 0x2000d5c1 [00:00:00.969,207] llext: do_llext_load: Linking ELF... [00:00:00.975,402] llext: llext_link: relocation section .rel.text (2) linked to section 10 has 2 relocations [00:00:00.986,236] llext: llext_link: relocation 2:0 info 202 (type 2, sym 2) offset 16 sym_name sym_type 3 sym_bind 0 sym_ndx 3 [00:00:00.998,748] llext: found section symbol addr 0x2000d7c0 [00:00:01.005,615] llext: writing relocation symbol type 2 sym 2 at addr 0x2000d5d0 addr 0x2000d7c0 [00:00:01.015,625] elf: arch_elf_relocate: 2 2000d5d0 2000d7c0 [00:00:01.022,521] llext: llext_link: relocation 2:1 info e02 (type 2, sym 14) offset 20 sym_name printk sym_type 0 sym_bind 1 sym_ndx 0 [00:00:01.035,644] llext: found symbol printk at 0x113c3 [00:00:01.041,900] llext: writing relocation symbol printk type 2 sym 14 at addr 0x2000d5d4 addr 0x113c3 [00:00:01.052,246] elf: arch_elf_relocate: 2 2000d5d4 113c3 printk [00:00:01.059,417] llext: llext_link: relocation section .rel._llext_const_symbol.static.hello_worldsym (5) linked to section 10 has 2 relocations [00:00:01.073,608] llext: llext_link: relocation 5:0 info 202 (type 2, sym 2) offset 0 sym_name sym_type 3 sym_bind 0 sym_ndx 3 [00:00:01.086,059] llext: found section symbol addr 0x2000d7c0 [00:00:01.092,895] llext: writing relocation symbol type 2 sym 2 at addr 0x2000d5c0 addr 0x2000d7c0 [00:00:01.102,935] elf: arch_elf_relocate: 2 2000d5c0 2000d7c0 [00:00:01.109,832] llext: llext_link: relocation 5:1 info f02 (type 2, sym 15) offset 4 sym_name hello_world sym_type 2 sym_bind 1 sym_ndx 1 [00:00:01.123,291] llext: found section symbol hello_world addr 0x2000d5c1 [00:00:01.131,103] llext: writing relocation symbol hello_world type 2 sym 15 at addr 0x2000d5c4 addr 0x2000d5c1 [00:00:01.142,181] elf: arch_elf_relocate: 2 2000d5c4 2000d5c1 hello_world [00:00:01.149,993] llext: do_llext_load: loaded module, .text at 0x2000d5c0, .rodata at 0x2000d7c0 [00:00:01.159,851] llext: Loaded extension ext [00:00:01.165,191] app: Failed to find symbol

Case 3: Modifying the Code We commented out the following line from hello_world_ext.c in the module: printk("Hello, world, from an llext!\n");

Here are the logs after flashing the modified build: Booting nRF Connect SDK v2.7.99-cs2-d73b286ff6d9 Using Zephyr OS v3.6.99-3d01dcc251bf [00:00:00.285,400] app: Calling hello world as a module [00:00:00.291,595] llext: llext_load: Loading relocatable or shared elf [00:00:00.299,133] llext: do_llext_load: Finding ELF tables... [00:00:00.305,877] llext: llext_find_tables: section 0 at 258: name 0, type 0, flags 0, addr 0, size 0 [00:00:00.316,101] llext: llext_find_tables: section 1 at 280: name 27, type 1, flags 6, addr 0, size 12 [00:00:00.326,477] llext: llext_find_tables: section 2 at 2a8: name 33, type 1, flags 2, addr 0, size 12 [00:00:00.336,853] llext: llext_find_tables: section 3 at 2d0: name 45, type 1, flags 2, addr 0, size 8 [00:00:00.347,167] llext: llext_find_tables: section 4 at 2f8: name 41, type 9, flags 40, addr 0, size 16 [00:00:00.357,635] llext: llext_find_tables: section 5 at 320: name 90, type 1, flags 3, addr 0, size 0 [00:00:00.367,919] llext: llext_find_tables: section 6 at 348: name 96, type 8, flags 3, addr 0, size 0 [00:00:00.378,234] llext: llext_find_tables: section 7 at 370: name 101, type 1, flags 30, addr 0, size 35 [00:00:00.388,793] llext: llext_find_tables: section 8 at 398: name 110, type 1879048195, flags 0, addr 0, size 54 [00:00:00.400,024] llext: llext_find_tables: section 9 at 3c0: name 1, type 2, flags 0, addr 0, size 224 [00:00:00.410,369] llext: llext_find_tables: symtab at 9 [00:00:00.416,656] llext: llext_find_tables: section 10 at 3e8: name 9, type 3, flags 0, addr 0, size 53 [00:00:00.427,001] llext: llext_find_tables: strtab at 10 [00:00:00.433,349] llext: llext_find_tables: section 11 at 410: name 17, type 3, flags 0, addr 0, size 126 [00:00:00.443,878] llext: llext_find_tables: shstrtab at 11 [00:00:00.450,347] llext: do_llext_load: Allocate and copy strings... [00:00:00.457,763] llext: do_llext_load: Mapping ELF sections... [00:00:00.464,660] llext: llext_map_sections: section 0 name [00:00:00.471,313] llext: llext_map_sections: Not copied section [00:00:00.478,332] llext: llext_map_sections: section 1 name .text [00:00:00.485,412] llext: llext_map_sections: section 2 name .rodata [00:00:00.492,706] llext: llext_map_sections: section 3 name ._llext_const_symbol.static.hello_worldsym [00:00:00.503,143] llext: llext_map_sections: Not copied section ._llext_const_symbol.static.hello_worldsym [00:00:00.513,977] llext: llext_map_sections: section 4 name .rel._llext_const_symbol.static.hello_worldsym [00:00:00.524,780] llext: llext_map_sections: Not copied section .rel._llext_const_symbol.static.hello_worldsym [00:00:00.535,919] llext: llext_map_sections: section 5 name .data [00:00:00.543,029] llext: llext_map_sections: section 6 name .bss [00:00:00.550,018] llext: llext_map_sections: section 7 name .comment [00:00:00.557,373] llext: llext_map_sections: Not copied section .comment [00:00:00.565,063] llext: llext_map_sections: section 8 name .ARM.attributes [00:00:00.573,028] llext: llext_map_sections: Not copied section .ARM.attributes [00:00:00.581,329] llext: llext_map_sections: section 9 name .symtab [00:00:00.588,592] llext: llext_map_sections: Not copied section .symtab [00:00:00.596,191] llext: llext_map_sections: section 10 name .strtab [00:00:00.603,546] llext: llext_map_sections: Not copied section .strtab [00:00:00.611,175] llext: llext_map_sections: section 11 name .shstrtab [00:00:00.618,682] llext: llext_map_sections: Not copied section .shstrtab [00:00:00.626,464] llext: do_llext_load: Allocate and copy sections... [00:00:00.633,972] llext: do_llext_load: Counting exported symbols... [00:00:00.641,296] llext: llext_count_export_syms: symbol count 14 [00:00:00.648,406] llext: llext_count_export_syms: unhandled symbol 1, name , type tag 3, bind 0, sect 1 [00:00:00.658,782] llext: llext_count_export_syms: unhandled symbol 2, name , type tag 3, bind 0, sect 2 [00:00:00.669,158] llext: llext_count_export_syms: unhandled symbol 3, name , type tag 3, bind 0, sect 3 [00:00:00.679,534] llext: llext_count_export_syms: unhandled symbol 4, name , type tag 3, bind 0, sect 5 [00:00:00.689,910] llext: llext_count_export_syms: unhandled symbol 5, name , type tag 3, bind 0, sect 6 [00:00:00.700,286] llext: llext_count_export_syms: unhandled symbol 6, name , type tag 3, bind 0, sect 7 [00:00:00.710,693] llext: llext_count_export_syms: unhandled symbol 7, name , type tag 3, bind 0, sect 8 [00:00:00.721,069] llext: llext_count_export_syms: unhandled symbol 8, name hello_world_ext.c, type tag 4, bind 0, sect 65521 [00:00:00.733,245] llext: llext_count_export_syms: unhandled symbol 9, name $t, type tag 0, bind 0, sect 1 [00:00:00.743,804] llext: llext_count_export_syms: unhandled symbol 10, name $d, type tag 0, bind 0, sect 2 [00:00:00.754,455] llext: llext_count_export_syms: unhandled symbol 11, name $d, type tag 0, bind 0, sect 3 [00:00:00.765,075] llext: llext_count_export_syms: unhandled symbol 12, name hello_world_sym, type tag 1, bind 0, sect 3 [00:00:00.776,855] llext: llext_count_export_syms: function symbol 13, name hello_world, type tag 2, bind 1, sect 1 [00:00:00.788,146] llext: do_llext_load: Allocating memory for symbol table... [00:00:00.796,295] llext: do_llext_load: Copying symbols... [00:00:00.802,825] llext: llext_copy_symbols: function symbol 0 name hello_world addr 0x2000d541 [00:00:00.812,469] llext: do_llext_load: Linking ELF... [00:00:00.818,664] llext: llext_link: relocation section .rel._llext_const_symbol.static.hello_worldsym (4) linked to section 9 has 2 relocations [00:00:00.832,794] llext: llext_link: relocation 4:0 info 202 (type 2, sym 2) offset 0 sym_name sym_type 3 sym_bind 0 sym_ndx 2 [00:00:00.845,214] llext: found section symbol addr 0x2000d6a0 [00:00:00.852,050] llext: writing relocation symbol type 2 sym 2 at addr 0x0 addr 0x2000d6a0 [00:00:00.861,480] elf: arch_elf_relocate: 2 0 2000d6a0 [00:00:00.870,208] os: SECURE FAULT [00:00:00.875,854] os: Address: 0x0 [00:00:00.880,706] os: Attribution unit violation [00:00:00.886,779] os: r0/a1: 0x000095dd r1/a2: 0x2000d43d r2/a3: 0x00000000 [00:00:00.895,446] os: r3/a4: 0x2000cdf4 r12/ip: 0x00000080 r14/lr: 0x0001177b [00:00:00.904,083] os: xpsr: 0x81000000 [00:00:00.909,271] os: Faulting instruction address (r15/pc): 0x00009848 [00:00:00.917,144] os: >>> ZEPHYR FATAL ERROR 41: Unknown error on CPU 0 [00:00:00.925,018] os: Current thread: 0x2000cc08 (unknown) [00:00:00.931,762] os: Halting system

teburd commented 1 week ago

Does zephyrs test suite for llext pass on this board? I’d review the test suite and compare the various tests and Kconfig option set there. Let me know if the test suite fails in any way!

SunnyVibrant commented 1 week ago

Does zephyrs test suite for llext pass on this board? I’d review the test suite and compare the various tests and Kconfig option set there. Let me know if the test suite fails in any way!

@teburd Thank you for your response! We didn't run the llext test suites. We will run those tests and provide an update. Meanwhile, could you please confirm if llext is supported on the nRF9160?

teburd commented 1 week ago

llext works on cortex-m armv7m, and this is a well supported architecture. That said there are some things that may/may not work out, there's various elf linkages, the need to use EXPORT_SYMBOL() to expose the symbol to the elf loader, and so on. The tests cover all this and verify everything works. The sample requires building in a particular way from what I recall though @pillo79 might have a better answer.

Running the tests should work!

SunnyVibrant commented 1 week ago

llext works on cortex-m armv7m, and this is a well supported architecture. That said there are some things that may/may not work out, there's various elf linkages, the need to use EXPORT_SYMBOL() to expose the symbol to the elf loader, and so on. The tests cover all this and verify everything works. The sample requires building in a particular way that may from what I recall though @pillo79 might have a better answer.

Running the tests should work!

@pillo79 good morning!!

Thank you very much for your continued support. Any assistance you provide is highly appreciated.

Could you kindly help us by providing a list of the required changes and steps to make llext work on the nRF9160 board?

We are specifically looking for guidance on configuration changes (like prj.conf), adjustments related to Relocatable ELF handling, and any relevant build instructions for the llext test suites, including the order in which the modules should be built.

Your help in this matter would be invaluable.

SunnyVibrant commented 1 week ago

@teburd Awesome, good to hear this from you. Thank you very much for your support.

SunnyVibrant commented 1 week ago

@teburd @pillo79 Hello, We ran llext test suites and see that test case: test_load_unload_hello_world fails with system halt.

Here are the test suite logs llext_test_suite_logs.txt

pillo79 commented 1 week ago

Hello @SunnyVibrant and thanks for following up. This is consistent with your previous logs - it appears the MPU is enabled and prevents the CPU from running from "a data buffer" (the way it sees the loaded LLEXT). On most ARM targets, you have two options to work around this:

This should have been picked up automatically by the test suite build though. I don't have that board, but I tried to build it both via west build and west twister and the results differ.

west build tests/subsys/llext/simple -p -T llext.simple.readonly -b nrf9160dk/nrf9160

does NOT use arch-specific flags in testcase.yaml (but interestingly, does properly set STORAGE_WRITABLE!), and results in a non-working build. Instead, the following does use the proper set of flags:

west twister -nv -p nrf9160dk/nrf9160 -T tests/subsys/llext/simple/ -s llext.simple.readonly

(the output in this case is in the arcane twister-out/nrf9160dk_nrf9160/tests/subsys/llext/simple/llext.simple.readonly folder). Can you try flashing that to see if it works properly?

@teburd any idea if this difference is expected behavior? I'm guessing not!

SunnyVibrant commented 4 days ago

Hello @SunnyVibrant and thanks for following up. This is consistent with your previous logs - it appears the MPU is enabled and prevents the CPU from running from "a data buffer" (the way it sees the loaded LLEXT). On most ARM targets, you have two options to work around this:

  • disable CONFIG_ARM_MPU, so this check (and many others) are simply not performed, or
  • enable CONFIG_USERSPACE (if the platform allows) so LLEXT can use the userspace API to update the memory layout and access restrictions.

This should have been picked up automatically by the test suite build though. I don't have that board, but I tried to build it both via west build and west twister and the results differ.

west build tests/subsys/llext/simple -p -T llext.simple.readonly -b nrf9160dk/nrf9160

does NOT use arch-specific flags in testcase.yaml (but interestingly, does properly set STORAGE_WRITABLE!), and results in a non-working build. Instead, the following does use the proper set of flags:

west twister -nv -p nrf9160dk/nrf9160 -T tests/subsys/llext/simple/ -s llext.simple.readonly

(the output in this case is in the arcane twister-out/nrf9160dk_nrf9160/tests/subsys/llext/simple/llext.simple.readonly folder). Can you try flashing that to see if it works properly?

@teburd any idea if this difference is expected behavior? I'm guessing not!

@pillo79 Thank you!! Thank you!! We will try those steps and keep you updated on this.