Open a-lunev opened 3 years ago
I was actually investigating some stuff related to nxflat over the weekend to try out some thoughts I had on improving the shared module story.
You should be able to use modern gcc as outlined here: https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=139629508#content/view/139630111
Specifically, include this flag:
-mno-pic-data-is-text-relative
I have confirmed that the flag does what we expect, but I did not go through the rest of the process.
There is a workaround for this problem noted here: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=139630111 . The workaround is to use a newer GCC option, -mno-pic-data-is-text-relative. That option restores the original behavior of the older GCC tools. The README file should be updated to reflect this workaround.
This issue (but not the workaround) is discussed in https://cwiki.apache.org/confluence/display/NUTTX/NxFlat as well.
On 5/17/2021 3:43 PM, a-lunev wrote:
Hello. Unfortunately, I cannot build the eagle100:nxflat and eagle100:thttpd configurations. As I understand from the README and history files in the NuttX repo, there are still unresolved issues with newer gcc versions, and gcc 4.3.3 was the last version that still worked for NXFLAT mode. I tried to build the NXFLAT toolchain based on gcc 4.3.3 and binutils 2.19.1; however, I'm still getting NuttX build errors.
Steps to reproduce:
$ mkdir TEST_ROOT
$ git clone https://github.com/apache/incubator-nuttx.git TEST_ROOT/nuttx
$ git clone https://github.com/apache/incubator-nuttx-apps TEST_ROOT/apps
$ cd TEST_ROOT/nuttx
$ ./tools/configure.sh -l eagle100:nxflat
Build NXFLAT Toolchain:
$ git clone https://bitbucket.org/nuttx/buildroot.git TEST_ROOT/buildroot/buildroot
$ cd TEST_ROOT/buildroot/buildroot
$ cp configs/cortexm3-defconfig-nxflat .config
$ make oldconfig
$ make menuconfig
activate the following options:
Toolchain Options -> Build GCC cross-compiler
Toolchain Options -> Build C++ compiler
$ make
Build NuttX:
$ cd TEST_ROOT/nuttx
$ make CROSSDEV=TEST_ROOT/buildroot/buildroot/build_arm_nofpu/staging_dir/bin/arm-nuttx-elf- \
MKNXFLAT=TEST_ROOT/buildroot/buildroot/build_arm_nofpu/staging_dir/bin/mknxflat \
LDNXFLAT=TEST_ROOT/buildroot/buildroot/build_arm_nofpu/staging_dir/bin/ldnxflat
There are multiple build errors. The first portion is as follows:
make[5]: Entering directory '.../TEST_ROOT/apps/examples/nxflat/tests/errno'
CC: errno.c
LD: errno.o
MK: errno.r1
AS: errno-thunk.S
LD: errno-thunk.o
LD: errno.r2
INPUT SECTIONS:
SECT LOW HIGH SIZE
TEXT 00000000 0000018a 0000018a
DATA 00000000 00000028 00000028
BSS 00000028 00000028 00000000
ERROR -- Symbol in GOT32 relocation is in TEXT
ERROR -- At addr 00000064 to sym '.LC0' [0000010c]
ERROR -- Symbol in GOT32 relocation is in TEXT
ERROR -- At addr 00000068 to sym '.LC1' [00000124]
ERROR -- Symbol in GOT32 relocation is in TEXT
ERROR -- At addr 0000006c to sym 'g_nonexistent' [000000fc]
ERROR -- Symbol in GOT32 relocation is in TEXT
ERROR -- At addr 00000070 to sym '.LC2' [0000013c]
ERROR -- Symbol in GOT32 relocation is in TEXT
ERROR -- At addr 00000074 to sym '.LC3' [0000013e]
ERROR -- Symbol in GOT32 relocation is in TEXT
ERROR -- At addr 00000078 to sym '.LC4' [00000165]
Entry symbol "main": 00000024 in section ".text"
Could you please tell me if I'm doing something wrong, or which https://bitbucket.org/nuttx/buildroot.git SHA-1 (including which gcc and binutils versions) and which NuttX SHA-1 are compatible with each other, so that NuttX works with NXFLAT enabled?
I was actually investigating some stuff related to nxflat over the weekend to try out some thoughts I had on improving the shared module story.
I have already implemented full, MMU-less shared library support in a binary format that I call XFLAT. You can see that code at http://xflat.sourceforge.net/ (I haven't touched that in years). The Sourceforge code is still under CVS! There is a Git version here: https://bitbucket.org/patacongo/xflat/src/master/
I created NxFLAT as a stripped-down version of XFLAT with no shared library support but with a smaller footprint suitable for the kind of MCUs that NuttX originally targeted.
The objectives of NuttX have changed over the years. Originally, it was intended to be a tiny RTOS with a size comparable to other tiny RTOSs like FreeRTOS and ChibiOS, but still supporting mostly POSIX OS interfaces. So a lot of corners were cut in the original designs to keep the size to a minimum. That objective has morphed over the years: now we aim to be a small (but not tiny) Linux work-alike. Very different concept.
Not to derail this issue, but what I actually want to be able to do is support loading ELF files on some of these smaller chips without an MMU, while keeping memory usage down by not having to include a bunch of copies of libc.
Some of the compiler flags that help make this possible do not seem to exist outside of ARM, but there was some recent interest in adding support to RISC-V gcc.
I'll read more of the xflat docs / design.
But back to your question, @a-lunev: please do try the flag we suggested. I'm motivated to help remove any roadblocks you run into, and maybe I'll spend some time updating the docs.
Was: Re: [apache/incubator-nuttx] eagle100:nxflat and eagle100:thttpd build errors (#3737)
New thread created.
The way that has been done in the past was to add the libc functions to the base FLASH symbol table. That symbol table draws the libc functions into the base FLASH image. Then there is one copy of the libc functions in base FLASH and no libc functions in the ELF module. The ELF module is linked to the libc functions just as it is linked to the OS interface functions.
This will accomplish the size decrease you are looking for.
This is why the files libs/libc/libc.csv and math.csv exist: To create C library symbol tables using apps/tools/mksymtab.sh
There was some documentation for doing this somewhere, but I can't remember where now.
Initially (before creating the current issue #3737) I tried to test NXFLAT mode based on the lm3s6965-ek:qemu-flat config because I do not physically have an eagle100 board. I was able to build NuttX without build errors for lm3s6965-ek:qemu-flat using gcc 7.4.0 and binutils 2.28.1 (e7659eb89e1e7c8729d4cb526117c862d9511922 of https://bitbucket.org/nuttx/buildroot.git). I've attached my custom defconfig file with NXFLAT enabled. However, when I ran the resulting binary on QEMU, it produced the following output:
Registering romdisk
Mounting ROMFS filesystem at target=/mnt/romfs with source=/dev/ram0
****************************************************************************
* Executing errno
****************************************************************************
ERROR: exec(errno) failed: 2
****************************************************************************
* Executing hello
****************************************************************************
ERROR: exec(hello) failed: 2
****************************************************************************
* Executing struct
****************************************************************************
ERROR: exec(struct) failed: 2
End-of-Test.. Exit-ing
Therefore, I supposed that the execution errors might be caused by the newer gcc version, since NXFLAT support was reported as broken at least since gcc 4.6.3. So I tried gcc 4.3.3; however, in my case it produced build errors even on the eagle100:nxflat and eagle100:thttpd configurations (I created the current issue #3737 at that point). I tried eagle100:nxflat and eagle100:thttpd because "Furthermore, NXFLAT has only been tested on the Eagle-100 LMS6918 Cortex-M3 board" is written here: https://cwiki.apache.org/confluence/display/NUTTX/NxFlat
Now I've tested your suggestion concerning the -mno-pic-data-is-text-relative flag using gcc 7.4.0; however, I've not noticed any difference. lm3s6965-ek:qemu-flat builds without errors regardless of whether the flag is present in the Make.defs file of the lm3s6965-ek directory, and the execution errors also appear regardless of the flag.
Steps to reproduce:
$ mkdir TEST_ROOT
$ git clone https://github.com/apache/incubator-nuttx.git TEST_ROOT/nuttx
$ git clone https://github.com/apache/incubator-nuttx-apps TEST_ROOT/apps
replace TEST_ROOT/nuttx/boards/arm/tiva/lm3s6965-ek/configs/qemu-flat/defconfig file by my attached one
$ cd TEST_ROOT/nuttx
$ ./tools/configure.sh -l lm3s6965-ek:qemu-flat
Build NXFLAT Toolchain:
$ git clone https://bitbucket.org/nuttx/buildroot.git TEST_ROOT/buildroot
$ cd TEST_ROOT/buildroot
$ cp configs/cortexm3-eabi-defconfig-7.4.0 .config
$ make oldconfig
$ make
Build NuttX:
$ cd TEST_ROOT/nuttx
$ make CROSSDEV=TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/arm-nuttx-eabi- \
MKNXFLAT=TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/mknxflat \
LDNXFLAT=TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/ldnxflat
Run on QEMU:
qemu-system-arm -semihosting \
-M lm3s6965evb \
-netdev user,id=user0 \
-nic user,id=user0 \
-serial mon:stdio \
-kernel TEST_ROOT/nuttx/nuttx.bin
ERROR: exec(errno) failed: 2
You can see the meaning of the error in include/errno.h:
#define ENOENT 2
#define ENOENT_STR "No such file or directory"
Which suggests that there is something wrong with your file system or file search PATH. The ELF loader should only return the error 2 if the executable file cannot be found.
That error is printed by nxflat_main.c:
errmsg("ERROR: exec(%s) failed: %d\n", dirlist[i], errno);
Do you have a PATH variable set up in the environment? No, it is not defined in your defconfig file. So the following should be using the absolute path /mnt/romfs/errno. That should work provided that /dev/ram0 is valid.
#ifdef CONFIG_LIB_ENVPATH
  filename = dirlist[i];
#else
  snprintf(fullpath, 128, "%s/%s", MOUNTPT, dirlist[i]);
  filename = fullpath;
#endif
  /* ... */
  args[0] = NULL;
  ret = exec(filename, args, g_nxflat_exports, g_nxflat_nexports);
Hi @patacongo,
The file system (romfs) and file search PATH are indeed good. The error code ("ERROR: exec(errno) failed: 2") is confusing and does not expose the real cause. The cause is not "No such file or directory".
I've enabled debug logs and the real cause has been exposed:
testheader:
****************************************************************************
* Executing errno
****************************************************************************
errno
load_absmodule: Loading /mnt/romfs/errno
nxflat_loadbinary: Loading file: /mnt/romfs/errno
nxflat_init: filename: /mnt/romfs/errno loadinfo: 0x20003d50
nxflat_read: Read 36 bytes from offset 0
nxflat_dumploadinfo: LOAD_INFO:
nxflat_dumploadinfo: ISPACE:
nxflat_dumploadinfo: ispace: 00000000
nxflat_dumploadinfo: entryoffs: 000000a4
nxflat_dumploadinfo: isize: 000001ba
nxflat_dumploadinfo: DSPACE:
nxflat_dumploadinfo: dspace: 00000000
nxflat_dumploadinfo: datasize: 00000040
nxflat_dumploadinfo: bsssize: 00000000
nxflat_dumploadinfo: (pad): 00000000
nxflat_dumploadinfo: stacksize: 00000800
nxflat_dumploadinfo: dsize: 00000040
nxflat_dumploadinfo: RELOCS:
nxflat_dumploadinfo: relocstart: 000001fa
nxflat_dumploadinfo: reloccount: 11
nxflat_dumploadinfo: HANDLES:
nxflat_dumploadinfo: filfd: 3
nxflat_load: Mapped ISpace (442 bytes) at 00019d54
nxflat_load: Allocated DSpace (108 bytes) at 0x200040c0
nxflat_read: Read 108 bytes from offset 442
nxflat_load: TEXT: 00019d54 Entry point offset: 000000a4 Data offset: 000001ba
nxflat_dumploadinfo: LOAD_INFO:
nxflat_dumploadinfo: ISPACE:
nxflat_dumploadinfo: ispace: 00019d54
nxflat_dumploadinfo: entryoffs: 000000a4
nxflat_dumploadinfo: isize: 000001ba
nxflat_dumploadinfo: DSPACE:
nxflat_dumploadinfo: dspace: 200040b0
nxflat_dumploadinfo: crefs: 1
nxflat_dumploadinfo: region: 200040c0
nxflat_dumploadinfo: datasize: 00000040
nxflat_dumploadinfo: bsssize: 00000000
nxflat_dumploadinfo: (pad): 0000002c
nxflat_dumploadinfo: stacksize: 00000800
nxflat_dumploadinfo: dsize: 0000006c
nxflat_dumploadinfo: RELOCS:
nxflat_dumploadinfo: relocstart: 000001fa
nxflat_dumploadinfo: reloccount: 11
nxflat_dumploadinfo: HANDLES:
nxflat_dumploadinfo: filfd: 3
nxflat_bindimports: Imports offset: 000001d2 nimports: 5
nxflat_bindimports: Import[0] (0x200040d8) offset: 00000000 func: 00000000
nxflat_bindimports: Exported symbol "__errno" not found
nxflat_loadbinary: Failed to bind symbols program binary: -2
exec_spawn: ERROR: Failed to load program 'errno': -2
nxflat_main: ERROR: exec(errno) failed: 2
As it turned out, symtab.c is generated with an empty g_nxflat_exports array.
Makefile invokes $(APPDIR)/tools/mksymtab.sh script, that in turn invokes the host level "nm":
nm: errno: file format not recognized
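The failure chain in the log above (empty export table, import lookup fails, loader reports -2) can be sketched in a few lines of C. This is only an illustration under stated assumptions: `symtab_entry`, `find_symbol` and `bind_import` are hypothetical names, not the actual NuttX binfmt code. The only facts taken from the thread are that the export table was empty and that the loader returned -2 when `__errno` could not be resolved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one exported symbol (name -> address). */
struct symtab_entry
{
  const char *name;
  const void *value;
};

/* Linear search of the export table; returns NULL when the name is absent. */
static const void *find_symbol(const struct symtab_entry *tab, size_t ntab,
                               const char *name)
{
  assert(name != NULL);

  for (size_t i = 0; i < ntab; i++)
    {
      if (strcmp(tab[i].name, name) == 0)
        {
          return tab[i].value;
        }
    }

  return NULL;
}

/* Resolve one import.  An unresolved name is reported as -ENOENT, which is
 * the same number the loader would use for a missing file.
 */
static int bind_import(const struct symtab_entry *exports, size_t nexports,
                       const char *name, const void **resolved)
{
  const void *value = find_symbol(exports, nexports, name);

  if (value == NULL)
    {
      return -ENOENT;
    }

  *resolved = value;
  return 0;
}
```

With an empty table, `bind_import(NULL, 0, "__errno", &p)` returns `-ENOENT`, which would explain why a missing export shows up with the same error number as a missing file.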
As I understand it, the nxflat example was broken sometime in 2020. I found that nuttx-8.2 still worked correctly with respect to this particular issue (the symbol table is generated successfully, and the nxflat example works normally in nuttx-8.2). However, nuttx-9.1.0 and every newer state of the nuttx repo are broken.
I narrowed the range of commits: nuttx ac18fc0216f81f1893b3c5349433136917e352db (15-Apr-2020) apps 404b330c25567923de8434e34dd1dbe8ccf59b8b (27-Feb-2020). Symbol table is still normally created and nxflat example works.
nuttx 9b87732b4708c44de525eefec1fd8a9bfc6c1181 (01-Jun-2020) apps deaa6c5b7bf8445b4a300691525f60aa506be0d7 (20-May-2020) nxflat example does not work
nuttx 2af72cc589aec0a01f73333496bf41a95389c2f4 (04-Jun-2020) apps 2c924f657fd17bb6a8e3b809a2b61c2539ecba52 (04-Jun-2020) nxflat example does not work
I tried about 10 more commits (pairs) in the middle of the range, however there were different build errors.
However, I see there was a major change in commit d03ff1bde61cb6c2f0e96a5e014077909c700d75 (https://github.com/apache/incubator-nuttx-apps) concerning how the symbol table for the nxflat example is created. It seems the issue appeared around that commit.
Finally, I found the exact commit that broke the symbol table creation for nxflat example: f16a765ccaa9395250423c4498a9e31aac5a558d
I've tried the following correction on top of the current master branch and it fixed the issue:
diff --git a/examples/nxflat/tests/Makefile b/examples/nxflat/tests/Makefile
index 7640a5d0..28c977fa 100644
--- a/examples/nxflat/tests/Makefile
+++ b/examples/nxflat/tests/Makefile
@@ -90,7 +90,7 @@ $(DIRLIST_SRC): install
# Create the exported symbol table list from the derived *-thunk.S files
$(SYMTAB_SRC): install
- $(Q) $(APPDIR)/tools/mksymtab.sh $(ROMFS_DIR) g_nxflat >$@.tmp
+ $(Q) $(APPDIR)/tools/mksymtab.sh $(TESTS_DIR) g_nxflat >$@.tmp
$(Q) $(call TESTANDREPLACEFILE, $@.tmp, $@)
@a-lunev could you provide a patch to fix the typo in examples/nxflat? BTW, it would be great to remove the lines below once you fix the build break: https://github.com/apache/incubator-nuttx/blob/15b99d1f4b25a7b0f010a8f118d130428363b6c2/tools/ci/testlist/arm-13.dat#L6-L7 https://github.com/apache/incubator-nuttx/blob/15b99d1f4b25a7b0f010a8f118d130428363b6c2/tools/ci/testlist/all.dat#L2-L3 Then we can catch such build issues automatically in the future.
Hi @xiaoxiang781216,
So far I've provided the patch to fix the symbol table creation for nxflat example, and added configuration for lm3s6965-ek board to test nxflat on QEMU.
Concerning automatic testing, it seems the NXFLAT toolchain is absent from the nuttx/tools/ci scripts. So it is first necessary to add building the toolchain to the scripts and deploy it.
Hi @btashton and @patacongo,
After the symbol table creation was fixed for the nxflat example, I tested the -mno-pic-data-is-text-relative flag again, and now it really helped, thank you!
However, there is a hard fault for the "struct" test ("errno" and "hello" tests work well):
$ qemu-system-arm -semihosting -M lm3s6965evb -netdev user,id=user0 -nic user,id=user0 -serial mon:stdio -kernel nuttx.bin
Registering romdisk
Mounting ROMFS filesystem at target=/mnt/romfs with source=/dev/ram0
****************************************************************************
* Executing errno
****************************************************************************
Wait a bit for test completion
Hello, World on stdout
Hello, World on stderr
We failed to open "aflav-sautga-ay!" errno is 2
****************************************************************************
* Executing hello
****************************************************************************
Wait a bit for test completion
Getting ready to say "Hello, world"
Hello, world!
It has been said.
argc = 1
argv = 0x0x20005130
argv[0] = (0x0x20005138) "<noname>"
argv[1] = 0x0
Goodbye, world!
****************************************************************************
* Executing struct
****************************************************************************
Wait a bit for test completion
Calling getstruct()
getstruct returned 0x20004db0
n = 42 (vs 42) PASS
pn = 0x20004da4 (vs 0x20004da4) PASS
*pn = 87 (vs 87) PASS
ps = 0xde5c (vs 0xde5c) PASS
ps->n = 117 (vs 117) PASS
pf = 0xdcec (vs 0xdcec) PASS
Calling mystruct->pf()
arm_hardfault: PANIC!!! Hard fault: 40000000
up_assert: Assertion failed at file:armv7-m/arm_hardfault.c line: 135
up_registerdump: R0: 00000017 0000df4b 00007fff 0000dcec 20004db0 0000dcec 00000000 00000000
up_registerdump: R8: 00000000 00000000 20004d60 00000000 0000a0cf 20006218 0000ddfd 0000dcec
up_registerdump: xPSR: 60000000 PRIMASK: 00000000 CONTROL: 00000000
up_registerdump: EXC_RETURN: fffffff9
up_dumpstate: sp: 20006150
up_dumpstate: stack base: 20005a58
up_dumpstate: stack size: 000007e8
up_stackdump: 20006140: 20005a58 20004e20 20002338 0000498f 00000000 00000000 00000000 0000a0cf
up_stackdump: 20006160: 20006218 0000ddfd 0000dcec 00004085 00000000 0000dcea 0000e60b 200061cc
up_stackdump: 20006180: 00000000 00000000 00000000 000036dd 0000d35c 00000e59 00000e19 20002338
up_stackdump: 200061a0: 00000003 00001137 00000004 00000c99 00000000 200061cc 0000dcec 00000000
up_stackdump: 200061c0: 00000000 00000295 20004fd8 20006218 00000000 20004db0 0000dcec 00000000
up_stackdump: 200061e0: 00000000 00000000 00000000 20004d60 00000000 fffffff9 00000017 0000df4b
up_stackdump: 20006200: 00007fff 0000dcec 0000a0cf 0000ddfd 0000dcec 60000000 0000dd21 20004e20
up_stackdump: 20006220: 0000dd21 000038dd 00000000 00001517 00000000 00000000 00000000 00000000
Do you have any idea what the cause may be?
No, it is impossible to interpret the hardfault dump without also having the ELF file with the addresses. Without the ELF it is just meaningless numbers. See https://cwiki.apache.org/confluence/display/NUTTX/Analyzing+Cortex-M+Hardfaults
Because printf() uses buffered I/O we don't really know where it failed. The last data buffered by printf() was lost. There was probably output after "Calling mystruct->pf()" that we do not see. It appears that the failure occurred when mystruct->pf() was called. However, it is also likely that mystruct->pf() returned and the crash occurred when the test exited. This happens often if the stack is too small. We can't tell from this data.
[Actually, the printf() buffer should have been flushed by the '\n' at the end of the output, but the data in the serial Tx buffer should still have been lost. Same result.]
You will need to analyze the hardfault dump per the steps at the above link, and/or single step through the mystruct->pf() call, and/or add a lot more printf() WITH fflush() calls.
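The "printf() WITH fflush()" suggestion can be sketched as a small helper. This is a minimal illustration, not code from the NuttX tree: `trace` and `call_with_markers` are hypothetical names. Note that `fflush()` only drains the C library's stdio buffer; as pointed out above, data still sitting in the serial Tx buffer can be lost on a crash anyway.

```c
#include <stdarg.h>
#include <stdio.h>

/* Print a progress marker and flush stdout immediately, so the text has
 * left the stdio buffer before the next statement runs.  Returns the value
 * returned by vprintf() (the number of characters written, or a negative
 * value on error).
 */
static int trace(const char *fmt, ...)
{
  va_list ap;
  int nchars;

  va_start(ap, fmt);
  nchars = vprintf(fmt, ap);
  va_end(ap);

  fflush(stdout);
  return nchars;
}

/* Hypothetical wrapper that brackets a suspicious call with markers.  If
 * only the first marker appears, the fault happened inside fn().
 */
static void call_with_markers(void (*fn)(void))
{
  trace("before call\n");
  fn();
  trace("after call\n");
}
```

In the struct test, the call to `mystruct->pf()` would be the natural candidate to bracket this way.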
Hi @patacongo,
It fails precisely here (struct_main.c, line 97), on the attempt to call the function:
mystruct->pf();
As you can see from the previous log
pf = 0xdcec (vs 0xdcec) PASS
Calling mystruct->pf()
arm_hardfault: PANIC!!! Hard fault: 40000000
the called address is even. However, in Thumb mode it must be odd. I tried to force it to be odd (by | 1) and the hard fault was fixed.
If I understand correctly, the compiler/linker should have initialized "pf" field of "struct struct_s dummy" to an odd address automatically, however it did not.
Do you have any idea what's wrong?
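The bit-0 convention described above can be written down as a few helper functions. This is a minimal sketch with hypothetical helper names; it only illustrates that a branch target for Thumb code carries bit 0 set while the instruction itself sits at the even address. It mirrors the `| 1` experiment and is not meant as the proper fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a branch target is marked as Thumb code (bit 0 set). */
static bool has_thumb_bit(uint32_t target)
{
  return (target & 1u) != 0;
}

/* Mark a code address as a Thumb branch target, as the "| 1" experiment did. */
static uint32_t with_thumb_bit(uint32_t target)
{
  return target | 1u;
}

/* Recover the even address where the instruction actually resides. */
static uint32_t instruction_address(uint32_t target)
{
  return target & ~UINT32_C(1);
}
```

Applied to the value from the log, `has_thumb_bit(0xdcec)` is false and `with_thumb_bit(0xdcec)` gives `0xdced`.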
This is the detailed log of how the "struct" example was built:
...
TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/arm-nuttx-eabi-gcc -c -fpic -msingle-pic-base -mpic-register=r10 -mno-pic-data-is-text-relative -fno-builtin -Wall -Wstrict-prototypes -Wshadow -Wundef -Os -fno-strict-aliasing -fno-strength-reduce -fomit-frame-pointer -mcpu=cortex-m3 -mthumb -mfloat-abi=soft -isystem "TEST_ROOT/nuttx/include" -D__NuttX__ -D__KERNEL__ -pipe -I "TEST_ROOT/apps/include" struct_main.c -o struct_main.o
CC: struct_dummy.c
TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/arm-nuttx-eabi-gcc -c -fpic -msingle-pic-base -mpic-register=r10 -mno-pic-data-is-text-relative -fno-builtin -Wall -Wstrict-prototypes -Wshadow -Wundef -Os -fno-strict-aliasing -fno-strength-reduce -fomit-frame-pointer -mcpu=cortex-m3 -mthumb -mfloat-abi=soft -isystem "TEST_ROOT/nuttx/include" -D__NuttX__ -D__KERNEL__ -pipe -I "TEST_ROOT/apps/include" struct_dummy.c -o struct_dummy.o
LD: struct_main.o
TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/arm-nuttx-eabi-ld -r -d -warn-common -o struct.r1 struct_main.o struct_dummy.o
MK: struct.r1
TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/mknxflat -o struct-thunk.S struct.r1
AS: struct-thunk.S
TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/arm-nuttx-eabi-gcc -c -fpic -msingle-pic-base -mpic-register=r10 -mno-pic-data-is-text-relative -fno-builtin -Wall -Wstrict-prototypes -Wshadow -Wundef -Os -fno-strict-aliasing -fno-strength-reduce -fomit-frame-pointer -mcpu=cortex-m3 -mthumb -mfloat-abi=soft -isystem "TEST_ROOT/nuttx/include" -D__NuttX__ -D__KERNEL__ -pipe -I "TEST_ROOT/apps/include" struct-thunk.S -o struct-thunk.o
LD: struct-thunk.o
TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/arm-nuttx-eabi-ld -r -d -warn-common -T TEST_ROOT/nuttx/binfmt/libnxflat/gnu-nxflat-pcrel.ld -no-check-sections -o struct.r2 struct_main.o struct_dummy.o struct-thunk.o
LD: struct.r2
TEST_ROOT/buildroot/build_arm_nofpu/staging_dir/bin/ldnxflat -e main -s 2048 -o struct struct.r2
INPUT SECTIONS:
SECT LOW HIGH SIZE
TEXT 00000000 0000026a 0000026a
DATA 00000000 0000001c 0000001c
BSS 0000001c 0000001c 00000000
Entry symbol "main": 00000058 in section ".text"
...
Yes, at some point bit 0 should have been set by the compiler before the call. This used to work and I can't explain why it should be failing in this case. It is really pretty generic C code. The only purpose of this test is to assure that a structure is initialized properly.
It does seem like a compiler issue. Could it believe that dummyfunc() is an ARM (vs Thumb2) function?
Sorry. I'm no help on this one.
On 5/23/2021 12:56 PM, Gregory Nutt wrote:
One thing I do in cases where I want to see what the compiler is doing is to add -save-temps to the GCC command line. I do this:
That will leave a .i and a .s file in addition to the .o file. The .s has the generated assembly language. I am not sure if it will tell you anything new or not. We already know that the value saved in the structure did not have bit 0 set.
It does seem like a compiler issue. Could it believe that dummyfunc() is an ARM (vs Thumb2) function?
I am not sure how the ARM-Thumb interworking is handled. But the compiler cannot really know if a function address is an ARM or a Thumb2 address. That cannot really be known until the files are linked via ld, right?
In this case, there is a partial link using ld to produce a struct.r2 but the final link is not performed by LD, but by ldnxflat.
Yes, I also think it should be done in some final phase after all address manipulations. With the GNU toolchain I think it's done by the linker. With the NXFLAT toolchain I suppose the final phase might be in ldnxflat. However, I'm not sure. I've not analyzed the source code or architecture of either toolchain.
Nothing like that is done in ldnxflat. It doesn't know anything about ARM. If any address fix-ups are done, they would have to have been done when struct.r2 was linked. It might be interesting to build the nxflat_main.c and nxflat_dummy.c files using "normal" CFLAGS and see if there is a difference.
That same struct test case is used with ELF modules too. See apps/examples/elf/tests/struct. It also uses 'ld' to produce a partial link but works fine. That says that there is probably nothing wrong with the tools or the procedure. I think we are missing something else.
Looking at the relocations in r1, would we not expect to see a R_ARM_THM_GOT_BREL12 instead of R_ARM_GOT_BREL?
❯ readelf -r ../apps/examples/nxflat/tests/struct/struct.r1
Relocation section '.rel.text' at offset 0x700 contains 3 entries:
Offset Info Type Sym.Value Sym. Name
00000006 0000241e R_ARM_THM_JUMP24 00000000 printf
0000000c 0000111a R_ARM_GOT_BREL 00000000 .LC0
00000018 0000291a R_ARM_GOT_BREL 00000000 dummy
Relocation section '.rel.text.startup' at offset 0x718 contains 26 entries:
Offset Info Type Sym.Value Sym. Name
00000008 0000240a R_ARM_THM_CALL 00000000 printf
0000000c 0000270a R_ARM_THM_CALL 00000011 getstruct
0000001a 0000240a R_ARM_THM_CALL 00000000 printf
00000032 0000240a R_ARM_THM_CALL 00000000 printf
00000050 0000240a R_ARM_THM_CALL 00000000 printf
0000006e 0000240a R_ARM_THM_CALL 00000000 printf
0000008c 0000240a R_ARM_THM_CALL 00000000 printf
000000a6 0000240a R_ARM_THM_CALL 00000000 printf
000000c4 0000240a R_ARM_THM_CALL 00000000 printf
000000d4 0000240a R_ARM_THM_CALL 00000000 printf
000000e2 0000240a R_ARM_THM_CALL 00000000 printf
00000100 0000141a R_ARM_GOT_BREL 00000022 .LC3
00000104 0000151a R_ARM_GOT_BREL 00000037 .LC4
00000108 0000131a R_ARM_GOT_BREL 0000001d .LC2
0000010c 0000161a R_ARM_GOT_BREL 0000004e .LC5
00000110 0000251a R_ARM_GOT_BREL 00000000 dummy_scalar
00000114 0000171a R_ARM_GOT_BREL 00000063 .LC6
00000118 0000181a R_ARM_GOT_BREL 00000079 .LC7
0000011c 0000231a R_ARM_GOT_BREL 00000000 dummy_struct
00000120 0000191a R_ARM_GOT_BREL 0000008f .LC8
00000124 00001a1a R_ARM_GOT_BREL 000000a5 .LC9
00000128 0000121a R_ARM_GOT_BREL 00000018 .LC1
0000012c 0000261a R_ARM_GOT_BREL 00000001 dummyfunc
00000130 00001b1a R_ARM_GOT_BREL 000000be .LC10
00000134 00001c1a R_ARM_GOT_BREL 000000d4 .LC11
00000138 00001d1a R_ARM_GOT_BREL 000000ec .LC12
Relocation section '.rel.data.rel.ro' at offset 0x7e8 contains 3 entries:
Offset Info Type Sym.Value Sym. Name
00000004 00002502 R_ARM_ABS32 00000000 dummy_scalar
00000008 00002302 R_ARM_ABS32 00000000 dummy_struct
0000000c 00002602 R_ARM_ABS32 00000001 dummyfunc
Ah, I take that back, but it does look like we moved to the even offset address when r2 was linked:
❯ readelf -r ../apps/examples/nxflat/tests/struct/struct.r2
Relocation section '.rel.text' at offset 0x8f4 contains 30 entries:
Offset Info Type Sym.Value Sym. Name
00000006 0000281e R_ARM_THM_JUMP24 00000025 printf
0000000c 00000d1a R_ARM_GOT_BREL 00000174 .LC0
00000018 0000371a R_ARM_GOT_BREL 0000000c dummy
00000030 00002218 R_ARM_GOTOFF32 00000004 __dyninfo0000
0000003c 0000280a R_ARM_THM_CALL 00000025 printf
00000040 0000330a R_ARM_THM_CALL 00000011 getstruct
0000004e 0000280a R_ARM_THM_CALL 00000025 printf
00000066 0000280a R_ARM_THM_CALL 00000025 printf
00000084 0000280a R_ARM_THM_CALL 00000025 printf
000000a2 0000280a R_ARM_THM_CALL 00000025 printf
000000c0 0000280a R_ARM_THM_CALL 00000025 printf
000000da 0000280a R_ARM_THM_CALL 00000025 printf
000000f8 0000280a R_ARM_THM_CALL 00000025 printf
00000108 0000280a R_ARM_THM_CALL 00000025 printf
00000116 0000280a R_ARM_THM_CALL 00000025 printf
00000134 0000101a R_ARM_GOT_BREL 00000196 .LC3
00000138 0000111a R_ARM_GOT_BREL 000001ab .LC4
0000013c 00000f1a R_ARM_GOT_BREL 00000191 .LC2
00000140 0000121a R_ARM_GOT_BREL 000001c2 .LC5
00000144 0000291a R_ARM_GOT_BREL 00000000 dummy_scalar
00000148 0000131a R_ARM_GOT_BREL 000001d7 .LC6
0000014c 0000141a R_ARM_GOT_BREL 000001ed .LC7
00000150 0000271a R_ARM_GOT_BREL 00000170 dummy_struct
00000154 0000151a R_ARM_GOT_BREL 00000203 .LC8
00000158 0000161a R_ARM_GOT_BREL 00000219 .LC9
0000015c 00000e1a R_ARM_GOT_BREL 0000018c .LC1
00000160 0000301a R_ARM_GOT_BREL 00000001 dummyfunc
00000164 0000171a R_ARM_GOT_BREL 00000232 .LC10
00000168 0000181a R_ARM_GOT_BREL 00000248 .LC11
0000016c 0000191a R_ARM_GOT_BREL 00000260 .LC12
Relocation section '.rel.data' at offset 0x9e4 contains 4 entries:
Offset Info Type Sym.Value Sym. Name
00000004 00000102 R_ARM_ABS32 00000000 .text
00000010 00002902 R_ARM_ABS32 00000000 dummy_scalar
00000014 00002702 R_ARM_ABS32 00000170 dummy_struct
00000018 00003002 R_ARM_ABS32 00000001 dummyfunc
This should be all it takes to set bit 0 in the address of dummyfunc:
case R_ARM_ABS32:
{
*(uint32_t *)addr += sym->st_value;
}
break;
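To make the role of the symbol value concrete, here is a minimal sketch of an R_ARM_ABS32-style fixup. It is not the NuttX loader code: the helper names `sk_apply_abs32` and `sk_has_thumb_bit` are hypothetical and exist only for illustration. The point is that the relocated word comes out odd (Thumb bit set) only if the symbol value passed in is already odd.

```c
#include <stdint.h>

/* Hypothetical stand-in for an R_ARM_ABS32 fixup: add the symbol value
 * to the addend already stored at the relocation target.
 */

static uint32_t sk_apply_abs32(uint32_t addend, uint32_t st_value)
{
  return addend + st_value;
}

/* On a Thumb-only core such as Cortex-M3, a function pointer is only
 * callable when bit 0 of the address is set.
 */

static int sk_has_thumb_bit(uint32_t addr)
{
  return (addr & 1u) != 0;
}
```

If the linker hands over an even value for a Thumb function, nothing in a fixup like this can restore the missing bit, which is consistent with suspecting the symbol value rather than the relocation code.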
There must be an issue with the sym value, as we should already be handling that.
#ifdef ARCH_BIG_ENDIAN
saved = temp = (int32_t) nxflat_swap32(*target);
#else
saved = temp = *target;
#endif
/* Mask and sign extend */
temp &= how_to->src_mask;
temp <<= (32 - how_to->bitsize);
temp >>= (32 - how_to->bitsize);
/* Offset */
temp += (sym_value + rel_section->vma) >> how_to->rightshift;
/* Mask upper bits from rollover */
temp &= how_to->dst_mask;
/* Replace data that was masked */
temp |= saved & (~how_to->dst_mask);
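The mask, sign-extend, and merge steps in that excerpt can be sketched as a standalone function. This is an illustration under stated assumptions, not the ldnxflat source: `struct howto_sketch` is a hypothetical subset of a relocation "howto" record that borrows the field names from the excerpt, `value` stands for the already-summed `sym_value + rel_section->vma`, and `bitsize` is assumed to be between 1 and 32.

```c
#include <stdint.h>

/* Hypothetical subset of a relocation "howto" record. */

struct howto_sketch
{
  uint32_t src_mask;   /* Bits of the stored word that hold the addend */
  uint32_t dst_mask;   /* Bits of the stored word that get replaced */
  unsigned bitsize;    /* Width of the relocated field (1..32) */
  unsigned rightshift; /* Shift applied to the value before adding */
};

static uint32_t relocate_word(uint32_t saved, uint32_t value,
                              const struct howto_sketch *h)
{
  /* Mask out the addend stored in the instruction or data word */

  uint32_t temp = saved & h->src_mask;

  /* Sign extend a field narrower than 32 bits using unsigned math,
   * which avoids depending on how signed shifts behave.
   */

  if (h->bitsize < 32)
    {
      uint32_t sign = UINT32_C(1) << (h->bitsize - 1);
      temp &= (sign << 1) - 1;
      temp = (temp ^ sign) - sign;
    }

  /* Add the shifted target value */

  temp += value >> h->rightshift;

  /* Keep only the field bits, then restore the bits outside the field */

  temp &= h->dst_mask;
  temp |= saved & ~h->dst_mask;
  return temp;
}
```

For R_ARM_ABS32 both masks are 0xffffffff and the shift is 0, so the whole computation reduces to adding the value to the stored word. Whatever bit 0 the value carries is what ends up in the output.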
And from the very verbose debug output from ldnxflat:
rel 3 : sym [ dummyfunc] s_addr @ 00000018 val 00000000-00000000 rel 00000000 how R_ARM_ABS32
Performing ABS32 link at addr 00000018 [00000000] to sym 'dummyfunc' [00000000]
Original location 0xee9088 is 00000000 rsh 0 sz 2 bit 32 rel 0 smask ffffffff dmask ffffffff off 0
Modified location: 00000000
Sym section .text is CODE
Symbol 'dummyfunc' lies in I-Space
relocs[3]: type: 0 offset: 0000005c
Ok I was able to get this to work, but there seems to be an issue with how we identify symbols as being thumb functions in ldnxflat.
This fails because st_info=0x18, so it is only annotated as STT_FUNC:
if ((((elf_symbol_type *)rel_sym)->internal_elf_sym.st_info & 0x0f) == STT_ARM_TFUNC)
When I hacked these checks to just look for STT_FUNC, I was able to make the test pass. I am digging in to try and understand why the symbols do not have STT_ARM_TFUNC.
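As a rough illustration of why a type-only check can miss Thumb functions, the sketch below decodes the low nibble of st_info the same way the check above does. The `SK_` names are local stand-ins; their numeric values follow the generic ELF symbol types (STT_FUNC = 2) and the ARM processor-specific range (STT_ARM_TFUNC = STT_LOPROC = 13, STT_ARM_16BIT = STT_HIPROC = 15). The sample st_info values in the usage note are invented for illustration and are not taken from this build.

```c
#include <stdint.h>

/* Local copies of the symbol-type constants, prefixed to avoid clashing
 * with any system ELF header.
 */

#define SK_STT_FUNC       2   /* Generic ELF function symbol */
#define SK_STT_ARM_TFUNC  13  /* ARM-specific: legacy Thumb function */
#define SK_STT_ARM_16BIT  15  /* ARM-specific: Thumb label */

/* The symbol type lives in the low four bits of st_info */

#define SK_ST_TYPE(info)  ((info) & 0x0f)

/* Type-only test, equivalent in spirit to the check quoted above */

static int sk_is_thumb_by_type(uint8_t st_info)
{
  unsigned type = SK_ST_TYPE(st_info);
  return type == SK_STT_ARM_TFUNC || type == SK_STT_ARM_16BIT;
}

static int sk_is_plain_func(uint8_t st_info)
{
  return SK_ST_TYPE(st_info) == SK_STT_FUNC;
}
```

With these definitions, a global symbol typed as a plain function (for example st_info 0x12) is never reported as Thumb, while one carrying the legacy type (for example 0x1d) is. A toolchain that records Thumb-ness somewhere other than the symbol type would therefore always fail the type-only test.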
ABCDF
Registering romdisk
Mounting ROMFS filesystem at target=/mnt/romfs with source=/dev/ram0
****************************************************************************
* Executing errno
****************************************************************************
Wait a bit for test completion
Hello, World on stdout
Hello, World on stderr
We failed to open "aflav-sautga-ay!" errno is 2
****************************************************************************
* Executing hello
****************************************************************************
Wait a bit for test completion
Getting ready to say "Hello, world"
Hello, world!
It has been said.
argc = 1
argv = 0x0x20005130
argv[0] = (0x0x20005138) "<noname>"
argv[1] = 0x0
Goodbye, world!
****************************************************************************
* Executing struct
****************************************************************************
Wait a bit for test completion
Calling getstruct()
getstruct returned 0x20004db0
n = 42 (vs 42) PASS
pn = 0x20004da4 (vs 0x20004da4) PASS
*pn = 87 (vs 87) PASS
ps = 0xdc50 (vs 0xdc50) PASS
ps->n = 117 (vs 117) PASS
pf = 0xdae1 (vs 0xdae1) PASS
Calling mystruct->pf()
In dummyfunc() -- PASS
Exit-ing
End-of-Test.. Exit-ing
Ok I was able to get this to work, but there seems to be an issue with how we identify symbols as being thumb functions in ldnxflat. This fails because st_info=0x18, so it is only annotated as STT_FUNC.
Interesting. But that raises more questions. None of the supported ARMv7-M parts do so now, but it is possible that they could support both ARM and Thumb instruction sets (hence the need for a distinction between the two). ARMv7-A certainly does support both ISAs. If Thumb functions are labeled STT_FUNC, then I don't see how that could work.
None of this is unique to NxFLAT. The only place NxFLAT has an effect is in binding to base FLASH code. So why does apps/examples/elf/tests/struct not show the same problem? Something is different. I think you have identified the root cause of the problem, but this doesn't feel like the fix.
Ok I think I have tracked the issue down. Apparently we have not been using the correct way of determining whether the branch type is thumb or not, and should not be relying on STT_ARM_TFUNC. New macros were added in 2016 for accessing this information (and before that, ARM_SYM_BRANCH_TYPE had already been the correct macro for a long time):
#define NUM_ENUM_ARM_ST_BRANCH_TYPE_BITS 2
#define ENUM_ARM_ST_BRANCH_TYPE_BITMASK \
((1 << NUM_ENUM_ARM_ST_BRANCH_TYPE_BITS) - 1)
#define ARM_GET_SYM_BRANCH_TYPE(STI) \
((enum arm_st_branch_type) ((STI) & ENUM_ARM_ST_BRANCH_TYPE_BITMASK))
is_thumb =
((ARM_GET_SYM_BRANCH_TYPE (isym->st_target_internal)
== ST_BRANCH_TO_THUMB) || type == STT_ARM_16BIT);
vs
((isym->st_info & 0x0f) == STT_ARM_TFUNC || (isym->st_info & 0x0f) == STT_ARM_16BIT)
I will clean this up and provide a patch.
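The difference between the two checks can be sketched in isolation. Everything prefixed `sk_`/`SK_` below is a hypothetical stand-in: the enum ordering mirrors what the binutils ARM ELF header is understood to define (ARM = 0, Thumb = 1) and should be treated as an assumption, and `struct sk_sym` carries only the two fields the comparison needs.

```c
/* Stand-in for the binutils branch-type enum (ordering assumed) */

enum sk_branch_type
{
  SK_BRANCH_TO_ARM,
  SK_BRANCH_TO_THUMB,
  SK_BRANCH_LONG,
  SK_BRANCH_UNKNOWN
};

#define SK_BRANCH_TYPE_BITS 2
#define SK_BRANCH_TYPE_MASK ((1u << SK_BRANCH_TYPE_BITS) - 1)
#define SK_GET_BRANCH_TYPE(sti) \
  ((enum sk_branch_type)((sti) & SK_BRANCH_TYPE_MASK))

#define SK_STT_ARM_TFUNC 13
#define SK_STT_ARM_16BIT 15

/* Minimal symbol record with only the fields used here */

struct sk_sym
{
  unsigned char st_info;            /* Binding and type */
  unsigned char st_target_internal; /* Branch type in the low bits */
};

/* Newer style: consult the branch type, fall back to the Thumb label */

static int sk_is_thumb_new(const struct sk_sym *s)
{
  unsigned type = s->st_info & 0x0f;
  return SK_GET_BRANCH_TYPE(s->st_target_internal) == SK_BRANCH_TO_THUMB ||
         type == SK_STT_ARM_16BIT;
}

/* Older style: rely on the symbol type alone */

static int sk_is_thumb_old(const struct sk_sym *s)
{
  unsigned type = s->st_info & 0x0f;
  return type == SK_STT_ARM_TFUNC || type == SK_STT_ARM_16BIT;
}
```

For a symbol typed as a plain function whose branch type says Thumb, the two predicates disagree: the newer one reports Thumb and the older one does not. That disagreement is the mismatch described above.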
@a-lunev can you try this patch? It should apply to the buildroot project without much issue either. https://github.com/btashton/nxflat/pull/2
Hi @btashton,
I applied the patch against e7659eb89e1e7c8729d4cb526117c862d9511922 of https://bitbucket.org/nuttx/buildroot.git and tried to build it using config/cortexm3-eabi-defconfig-7.4.0.
I added an #include "config.h" line to eliminate the following error:
make[1]: Entering directory 'TEST_ROOT/buildroot/toolchain/nxflat'
gcc -c -Wall -I. -I TEST_ROOT/buildroot/toolchain_build_arm_nofpu/binutils-2.28.1-build/bfd -I TEST_ROOT/buildroot/toolchain_build_arm_nofpu/binutils-2.28.1/include -o ldnxflat.o ldnxflat.c
In file included from ldnxflat.c:79:
TEST_ROOT/buildroot/toolchain_build_arm_nofpu/binutils-2.28.1-build/bfd/bfd.h:35:2: error: #error config.h must be included before this header
#error config.h must be included before this header
^~~~~
Then I tested lm3s6965-ek:qemu-nxflat (from https://github.com/apache/incubator-nuttx/pull/3763). The issue with hard fault is resolved. Thank you!
Concerning NXFLAT Toolchain repo, will the new place be https://github.com/btashton/nxflat or stay https://bitbucket.org/nuttx/buildroot.git ?
Yes, that was patched differently in the buildroot repo. It expects that you either provide the PACKAGE and PACKAGE_VERSION defines or include config.h. Those defines are normally set by whatever is consuming the headers, so I provide them in the CFLAGS in the makefile.