fractalclone / zephyr-riscv

Zephyr port to riscv architecture
Apache License 2.0

hifive1: Exception cause Load address misaligned #5

Closed jensschroer closed 7 years ago

jensschroer commented 7 years ago

I am trying to get the samples/net/dhcpv4_client/ sample working on the hifive1 board, using the branch provided by @palmer-dabbelt.

I use the following prj_hifive1.conf file (copied from the Arduino 101 configuration, with a few things disabled for debugging):

CONFIG_NETWORKING=y
CONFIG_NET_IPV6=n
CONFIG_NET_IPV4=y
CONFIG_NET_ARP=y
CONFIG_NET_UDP=y
CONFIG_NET_DHCPV4=y
CONFIG_NET_BUF=y
CONFIG_NET_NBUF_RX_COUNT=4
CONFIG_NET_NBUF_TX_COUNT=4
CONFIG_NET_NBUF_RX_DATA_COUNT=5
CONFIG_NET_NBUF_TX_DATA_COUNT=5
CONFIG_TEST_RANDOM_GENERATOR=y
CONFIG_NET_IF_UNICAST_IPV4_ADDR_COUNT=1

CONFIG_NET_LOG=y
CONFIG_SYS_LOG=y
CONFIG_SYS_LOG_SHOW_COLOR=y
CONFIG_NET_DEBUG_CORE=y
CONFIG_NET_DEBUG_IF=y
CONFIG_NET_DEBUG_DHCPV4=y
CONFIG_NET_DEBUG_ARP=y
CONFIG_NET_DEBUG_L2_ETHERNET=y
CONFIG_NET_DEBUG_IPV4=y
CONFIG_NET_DEBUG_ICMPV4=y
CONFIG_NET_DEBUG_UDP=y
CONFIG_NET_RX_STACK_SIZE=2048

CONFIG_NET_L2_ETHERNET=y

CONFIG_SPI=y
CONFIG_SPI_CS_GPIO=y
CONFIG_SPI_1_CS_GPIO_PORT="GPIO_0"
CONFIG_SPI_1_CS_GPIO_PIN=0

CONFIG_SYS_LOG_ETHERNET_LEVEL=1
CONFIG_ETH_ENC28J60=n
CONFIG_ETH_ENC28J60_0=n
CONFIG_ETH_ENC28J60_0_SPI_PORT_NAME="SPI_1"
CONFIG_ETH_ENC28J60_0_SPI_BUS_FREQ=2
CONFIG_ETH_ENC28J60_0_MAC3=0x0a
CONFIG_ETH_ENC28J60_0_MAC4=0x0b
CONFIG_ETH_ENC28J60_0_MAC5=0x0c

CONFIG_NET_MGMT=y
CONFIG_NET_MGMT_EVENT=y

Compiling the code with the current zephyr SDK and uploading it onto the board using the freedom-e-sdk works fine.

However, when opening ttyUSB1 with screen I get the following error:

***** BOOTING ZEPHYR OS v1.7.99 *****
[dhcpv4] [INF] main: In main
[dhcpv4] [INF] main_thread: Run dhcpv4 client
Exception cause Load address misaligned (4)
Current thread ID = 0x80003070
Faulting instruction address = 0x20404e18
  ra: 0x2040520c  gp: 0x67e09182  tp: 0x20400260  t0: 0x1800
  t1: 0x20400780  t2: 0x0  t3: 0x2547316e  t4: 0x20408468
  t5: 0x9d040577  t6: 0x177fe4b9  a0: 0x80000144  a1: 0x1
  a2: 0x0  a3: 0x80003414  a4: 0x0  a5: 0x3b9aca0d
  a6: 0x8120d47e  a7: 0x67e09182
Fatal fault in thread! Aborting.

Disassembling the code shows:

20404e0c <prepare_message>:
20404e0c:       00852783                lw      a5,8(a0)
20404e10:       fd010113                addi    sp,sp,-48
20404e14:       02112623                sw      ra,44(sp)
20404e18:       0087a783                lw      a5,8(a5)
20404e1c:       02812423                sw      s0,40(sp)
20404e20:       02912223                sw      s1,36(sp)
20404e24:       03212023                sw      s2,32(sp)
20404e28:       01312e23                sw      s3,28(sp)
20404e2c:       01412c23                sw      s4,24(sp)
20404e30:       01512a23                sw      s5,20(sp)
20404e34:       00058a93                mv      s5,a1
20404e38:       00000593                li      a1,0
20404e3c:       00060a13                mv      s4,a2
20404e40:       00050913                mv      s2,a0
20404e44:       000780e7                jalr    a5
20404e48:       fff00593                li      a1,-1
20404e4c:       bd9fd0ef                jal     20402a24 <net_nbuf_get_reserve_tx>

The function prepare_message is in subsys/net/ip/dhcpv4.c
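
One way to read the dump above: the faulting instruction at 0x20404e18 is `lw a5,8(a5)`, and the register dump shows a5 = 0x3b9aca0d, so the load address would be 0x3b9aca15, which is not a multiple of 4. The earlier `lw a5,8(a0)` with a0 = 0x80000144 targets an aligned address. A minimal C sketch of that arithmetic follows; the helper names are hypothetical, and it assumes the dumped a5 is the value the faulting instruction actually used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: effective address of a RISC-V load `lw rd, imm(rs1)`. */
static uint32_t lw_effective_address(uint32_t rs1, int32_t imm)
{
    return rs1 + (uint32_t)imm;
}

/* A 32-bit load is naturally aligned when its address is a multiple of 4. */
static bool is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0;
}
```

Calling `is_word_aligned(lw_effective_address(0x3b9aca0d, 8))` returns false, which is consistent with the pointer loaded from 8(a0) being garbage rather than with the compiler emitting an intentionally unaligned access.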

I found an old, closed bug in JIRA that sounds similar; linking it here for reference: https://jira.zephyrproject.org/browse/ZEP-955

Not sure how to proceed at this point. Appreciate any hints / advice.

palmer-dabbelt commented 7 years ago

Are you passing the "-mstrict-align" option to GCC? If not, then GCC is allowed to emit unaligned accesses (they're marked as slow, not illegal). If your compiler is too old to understand that option then you should be safe, but you might want to update to the latest release as we've fixed a handful of bugs.
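
For context, independent of the compiler flag: code that has to read a 32-bit value from a buffer of unknown alignment can avoid dereferencing a misaligned `uint32_t *` by copying the bytes instead. This is a minimal sketch and is not taken from the Zephyr sources; `load_u32_unaligned` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value (host byte order) from a buffer
 * whose alignment is unknown. memcpy places no alignment requirement on p,
 * so the compiler may choose byte-wise loads where a word load is unsafe. */
static uint32_t load_u32_unaligned(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

A caller would use it as `load_u32_unaligned(buf + 1)` rather than `*(uint32_t *)(buf + 1)`.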

jensschroer commented 7 years ago

I am using the Zephyr 0.9.1 SDK. It contains gcc version 6.1.0. I tried to add the -mstrict-align option, but the compiler does not seem to know it.

For completeness, here is the gcc -v output:

> /opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/riscv32-zephyr-elf-gcc -v
Using built-in specs.
COLLECT_GCC=/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/riscv32-zephyr-elf-gcc
COLLECT_LTO_WRAPPER=/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/lto-wrapper
Target: riscv32-zephyr-elf
Configured with: ../../../../../../work-shared/gcc-6.x.riscv32-r0/git/configure --build=x86_64-linux --host=x86_64-pokysdk-linux --target=riscv32-zephyr-elf --prefix=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr --exec_prefix=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr --bindir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf --sbindir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf --libexecdir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/libexec/riscv32-zephyr-elf --datadir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/share --sysconfdir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/etc --sharedstatedir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/com --localstatedir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/var --libdir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/lib/riscv32-zephyr-elf --includedir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/include --oldincludedir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/include --infodir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/share/info --mandir=/opt/zephyr-sdk/2.2/sysroots/x86_64-pokysdk-linux/usr/share/man --disable-silent-rules --disable-dependency-tracking --with-libtool-sysroot=/workdir/poky/build-zephyr-riscv32/tmp/sysroots/x86_64-nativesdk-pokysdk-linux --with-gnu-ld --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-multilib --enable-c99 --enable-long-long --enable-symvers=gnu --enable-libstdcxx-pch --program-prefix=riscv32-zephyr-elf- --without-local-prefix --enable-lto --enable-libssp --disable-bootstrap --disable-libmudflap --with-system-zlib --enable-linker-build-id --with-ppl=no --with-cloog=no --enable-checking=release --enable-cheaders=c_global --with-gxx-include-dir=/not/exist/usr/include/c++/6.1.0 --with-build-time-tools=/workdir/poky/build-zephyr-riscv32/tmp/sysroots/x86_64-linux/usr/riscv32-zephyr-elf/bin --without-long-double-128 
--enable-poison-system-directories --with-mpfr=/workdir/poky/build-zephyr-riscv32/tmp/sysroots/x86_64-nativesdk-pokysdk-linux --with-mpc=/workdir/poky/build-zephyr-riscv32/tmp/sysroots/x86_64-nativesdk-pokysdk-linux --disable-static --enable-nls --enable-initfini-array --without-headers --without-headers --enable-plugin --enable-plugin
Thread model: posix
gcc version 6.1.0 (GCC)

jensschroer commented 7 years ago

@palmer-dabbelt I have now also tried the latest gcc from the riscv toolchain (version 7.1.0), but I still get the misaligned load exception, even with the -mstrict-align option. Any suggestions?

jensschroer commented 7 years ago

/opt/riscv/bin/riscv32-unknown-elf-gcc -v
Using built-in specs.
COLLECT_GCC=/opt/riscv/bin/riscv32-unknown-elf-gcc
COLLECT_LTO_WRAPPER=/opt/riscv/libexec/gcc/riscv32-unknown-elf/7.1.1/lto-wrapper
Target: riscv32-unknown-elf
Configured with: /home/schroer/code/riscv-gnu-toolchain/riscv-gcc/configure --target=riscv32-unknown-elf --prefix=/opt/riscv --disable-shared --disable-threads --enable-languages=c,c++ --with-system-zlib --enable-tls --with-newlib --with-headers=/opt/riscv/riscv32-unknown-elf/include --disable-libmudflap --disable-libssp --disable-libquadmath --disable-libgomp --disable-nls --enable-checking=yes --disable-multilib --with-abi=ilp32d --with-arch=rv32g 'CFLAGS_FOR_TARGET=-Os -mcmodel=medlow'
Thread model: single
gcc version 7.1.1 20170509 (GCC)

fractalclone commented 7 years ago

Can you try compiling the app without optimization? Updating the prj_hifive1.conf with CONFIG_DEBUG=y should do the trick.

I'm also observing some issues with the crypto tests on the riscv32 platform. When compiled with the default optimization options set by the zephyr build system, the tests fail with wrong computation results. Without optimization, the tests pass. This issue is not observed on other platforms. I've also updated to the latest gcc and still get the same result.

jensschroer commented 7 years ago

I get the same issue with CONFIG_DEBUG=y.

fractalclone commented 7 years ago

@jensschroer Can you try with the latest commits from git@github.com:fractalclone/zephyr-riscv.git? There was a bug in the context restore on exit from an ISR, which caused execution errors from time to time. This was notably the origin of the issues I observed while running the crypto tests.

However, you will have to use the latest Zephyr 0.9.1 SDK, since I have rebased the zephyr-riscv repo on the zephyr mainline repo. The link can be found in the zephyr-riscv README.md.

jensschroer commented 7 years ago

Alright, I finally found the root cause of this issue. It is caused by two options in the .config: CONFIG_NET_L2_ETHERNET=y and CONFIG_ETH_ENC28J60=n

During startup, the IP stack tries to load an L2 interface, but since none is configured, it tries to access an invalid memory address, which results in the error above.
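
The failure pattern described here (calling through an L2 function table that was never set up) can be illustrated with a defensive lookup. This is only a sketch under stated assumptions: the `example_*` types and functions are hypothetical, simplified stand-ins and not Zephyr's real `net_if` or `net_l2` definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT Zephyr's real definitions. */
struct example_iface;

struct example_l2 {
    uint16_t (*reserve)(struct example_iface *iface);
};

struct example_iface {
    const struct example_l2 *l2;
};

/* Example L2 callback: reports the 14-byte Ethernet header size. */
static uint16_t example_eth_reserve(struct example_iface *iface)
{
    (void)iface;
    return 14;
}

/* Returns 0 and stores the reserve size in *out, or -1 if no L2 is attached. */
static int example_get_reserve(struct example_iface *iface, uint16_t *out)
{
    if (iface == NULL || iface->l2 == NULL || iface->l2->reserve == NULL) {
        return -1;
    }
    *out = iface->l2->reserve(iface);
    return 0;
}
```

A check like this only catches NULL pointers. The a5 value in the register dump is not NULL, so it would not necessarily have caught this particular fault; keeping the L2 option and the driver option consistent in the .config remains the actual fix.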

@fractalclone thank you for the help. I will close this issue.