edison-fw / meta-intel-edison

Here is the meta-intel-edison that builds, tries to stay up to date. Master is based on Yocto Poky Gatesgarth LTS 5.10.yy vanilla kernels. It builds a 32bit kernel (Gatesgarth branch 64bit) with ACPI enabled and corresponding rootfs. Telegram group: https://t.me/IntelEdison Web-site:
https://edison-fw.github.io/meta-intel-edison/
MIT License

mraa crash on Yocto dunfell #123

Closed. mwallnoefer closed this 3 years ago

mwallnoefer commented 3 years ago

I successfully compiled a small mraa example (https://github.com/eclipse/mraa/blob/master/examples/c%2B%2B/led.cpp) on Yocto dunfell, but I get a SIGSEGV crash on startup.

Please find the relevant part of the strace log here:

root@edison:/mnt/progs# strace ./mraa_led
...
sendto(3, "<141>Feb 20 16:57:10 libmraa[291"..., 98, MSG_NOSIGNAL, NULL, 0) = 98
openat(AT_FDCWD, "/sys/devices/virtual/dmi/id/board_name", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
read(4, "BODEGA BAY\n", 4096)           = 11
uname({sysname="Linux", nodename="edison", ...}) = 0
getpid()                                = 29116
sendto(3, "<141>Feb 20 16:57:10 libmraa[291"..., 104, MSG_NOSIGNAL, NULL, 0) = 104
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x520} ---
+++ killed by SIGSEGV +++
Segmentation fault

I guess that mraa knows nothing about a board named BODEGA BAY 😄.

htot commented 3 years ago

It's not supposed to crash :-) But BODEGA BAY is not something new: https://github.com/eclipse/mraa/blob/e15ce6fbc76148ba8835adc92196b0d0a3f245e7/src/x86/x86.c#L50

Can you check syslog? I think it runs at least to here https://github.com/eclipse/mraa/blob/e15ce6fbc76148ba8835adc92196b0d0a3f245e7/src/x86/intel_edison_fab_c.c#L1310.
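
To double-check what mraa sees, the board name from the strace log can be read directly. This is a minimal sketch, assuming the same sysfs path as in the log; `read_board_name` and `describe_board` are hypothetical helper names, and the only mapping listed is the BODEGA BAY one discussed in this thread.

```python
from pathlib import Path

# Path taken from the strace log in this thread.
DMI_BOARD_NAME = Path("/sys/devices/virtual/dmi/id/board_name")

# Only the board name seen in this thread; the label is illustrative.
KNOWN_BOARDS = {"BODEGA BAY": "Intel Edison"}


def read_board_name(path=DMI_BOARD_NAME):
    """Return the DMI board name with surrounding whitespace stripped."""
    return Path(path).read_text().strip()


def describe_board(name):
    """Map a board name to a label, or report it as unknown."""
    return KNOWN_BOARDS.get(name, "unknown board")
```

On the board itself, `describe_board(read_board_name())` should then report the Edison entry if the DMI name is what the strace log shows.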

mwallnoefer commented 3 years ago

Thanks for the quick response. Please find here the log of journalctl:

root@edison:/mnt/progs# journalctl -b -r | head -n15
-- Logs begin at Thu 2020-03-05 12:31:07 CET, end at Sun 2021-02-21 08:27:31 CET. --
Feb 21 08:27:31 edison systemd-journald[460]: Forwarding to syslog missed 3 messages.
Feb 21 08:26:33 edison kernel[544]: [   75.286149] audit: type=1701 audit(1613892393.121:8): auid=0 uid=0 gid=0 ses=2 subj=kernel pid=976 comm="mraa_led" exe="/mnt/progs/mraa_led" sig=11 res=1
Feb 21 08:26:33 edison kernel[544]: [   75.286005] Code: 89 e7 45 31 e4 e8 e3 c5 ff ff eb b4 90 49 c7 44 24 04 ff ff ff ff b8 ff ff ff ff 49 89 44 24 28 48 8b 05 16 b5 02 00 48 8b 00 <8b> 80 20 05 00 00 85 c0 74 29 41 c7 44 24 68 01 00 00 00 49 c7 84
Feb 21 08:26:33 edison kernel[544]: [   75.285966] mraa_led[976]: segfault at 520 ip 00007f8baa3deadd sp 00007fff96975890 error 4 in libmraa.so.2.0.0[7f8baa3db000+21000]
Feb 21 08:26:33 edison kernel: audit: type=1701 audit(1613892393.121:8): auid=0 uid=0 gid=0 ses=2 subj=kernel pid=976 comm="mraa_led" exe="/mnt/progs/mraa_led" sig=11 res=1
Feb 21 08:26:33 edison kernel: Code: 89 e7 45 31 e4 e8 e3 c5 ff ff eb b4 90 49 c7 44 24 04 ff ff ff ff b8 ff ff ff ff 49 89 44 24 28 48 8b 05 16 b5 02 00 48 8b 00 <8b> 80 20 05 00 00 85 c0 74 29 41 c7 44 24 68 01 00 00 00 49 c7 84
Feb 21 08:26:33 edison kernel: mraa_led[976]: segfault at 520 ip 00007f8baa3deadd sp 00007fff96975890 error 4 in libmraa.so.2.0.0[7f8baa3db000+21000]
Feb 21 08:26:33 edison systemd-journald[460]: Forwarding to syslog missed 15 messages.
Feb 21 08:26:33 edison audit[976]: ANOM_ABEND auid=0 uid=0 gid=0 ses=2 subj=kernel pid=976 comm="mraa_led" exe="/mnt/progs/mraa_led" sig=11 res=1
Feb 21 08:26:11 edison systemd[1]: Started Daily apt download activities.
Feb 21 08:26:11 edison systemd[1]: apt-daily.service: Succeeded.
Feb 21 08:26:09 edison systemd[1]: Starting Daily apt download activities...
Feb 21 08:26:09 edison systemd-timesyncd[536]: Synchronized to time server for the first time 216.239.35.8:123 (time3.google.com).
Feb 20 17:06:04 edison kernel[544]: [   51.333270] audit: type=1006 audit(1613837164.471:7): pid=794 uid=0 subj=kernel old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2 res=1
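
The kernel's segfault line already carries enough information to locate the faulting instruction relative to the library mapping. The sketch below is only an illustration: `parse_segfault` is a hypothetical helper, and the regular expression is written against the line format shown in this log, not against any kernel specification.

```python
import re

# Matches the "segfault at ... ip ... in lib[base+size]" line from the journal above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract the fault address, library name and ip offset inside the mapping."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "fault_addr": int(m["addr"], 16),
        "library": m["lib"],
        "offset": ip - base,  # relative to the start of the reported mapping
    }
```

For the line above this gives a fault address of 0x520 and an offset of 0x3add into the libmraa.so.2.0.0 mapping. A fault address that small is often a field access through a NULL pointer, though that is only a guess without symbols. Whether the offset can be passed as-is to a symbolizer depends on how the library is mapped, so treat it as a starting point.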

htot commented 3 years ago

Oh sorry, I meant that the journal probably shows "edison: Linux version 4 or higher detected, assuming Vanilla kernel", right?

htot commented 3 years ago

BTW, how did you build the application? Did you build the SDK?

Also, I didn't know mraa supports LEDs. But we need to define the pin as an LED in an ACPI table (https://github.com/edison-fw/meta-acpi/blob/dunfell/recipes-bsp/acpi-tables/samples/edison/leds.asli) first.

And gpio support in mraa is currently broken. libgpiod is working, and we could make mraa use that, but upstream doesn't want the dependency.

mwallnoefer commented 3 years ago

I have built it directly on the Edison module, so nothing special at all: g++ -lmraa -omraa_leds mraa_leds.cpp

Yes, I could indeed look at the alternative libgpiod, but it would be nice to restore standards compliance.

htot commented 3 years ago

We need to fix intel_edison_fab_c.c for that.

htot commented 3 years ago

I just tried to reproduce your steps. For me it doesn't crash; I get an error that the LED user1 doesn't exist, which is correct. As you can see under /sys/class/leds/, we have an LED called heartbeat (defined in the asli mentioned above), so changing user1 to heartbeat makes it work (in my case):

root@yuna:~# ./led
maximum brightness value for user1 led is: %d255
led trigger set to: heartbeat

But there is a fix that I haven't pushed here yet: https://github.com/htot/meta-intel-edison/blob/bca7ee5eaba16969267a0307924ff0637a45314c/meta-intel-edison-distro/recipes-support/blink-led/files/blink-led#L52

You need to remove the # and run blink_led at least once (or the service twice). These pins need initializing to make the LED actually blink. For some reason the settings persist after reboot.

In fact, this Python code does the same as the C++ example.
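
To see which LED names are actually available before editing the example, the LED class directory can be listed. This is a rough sketch, assuming the standard sysfs layout (`/sys/class/leds/<name>/max_brightness`); `list_leds` and `max_brightness` are hypothetical helper names and not part of mraa.

```python
from pathlib import Path

# Standard sysfs location of the LED class devices.
LEDS_DIR = Path("/sys/class/leds")


def list_leds(base=LEDS_DIR):
    """Return the sorted names of all LEDs registered under the LED class."""
    base = Path(base)
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.iterdir())


def max_brightness(name, base=LEDS_DIR):
    """Read the maximum brightness value the kernel reports for one LED."""
    return int((Path(base) / name / "max_brightness").read_text())
```

If `heartbeat` appears in `list_leds()` and `user1` does not, the error from the unmodified example is expected.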

mwallnoefer commented 3 years ago

I changed the LED constant to heartbeat and ran your updated blink-led script, but the segfault remains. I can paste the stack trace, but without debug symbols it is not very readable:

(gdb) bt full
#0  0x00007ffff7f8fadd in ?? () from /usr/lib/libmraa.so.2
No symbol table info available.
#1  0x00007ffff7fa632a in is_arduino_board () from /usr/lib/libmraa.so.2
No symbol table info available.
#2  0x00007ffff7fa670c in mraa_intel_edison_fab_c () from /usr/lib/libmraa.so.2
No symbol table info available.
#3  0x00007ffff7fa1865 in mraa_x86_platform () from /usr/lib/libmraa.so.2
No symbol table info available.
#4  0x00007ffff7f8debf in imraa_init () from /usr/lib/libmraa.so.2
No symbol table info available.
#5  0x00007ffff7fe364a in ?? () from /lib/ld-linux-x86-64.so.2
No symbol table info available.

I hope that I am using the right kernel. Here is the uname output: Linux edison 5.5.0-edison-acpi-standard #1 SMP Sat Mar 14 12:18:36 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
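
For comparing kernels between boards, the release string can be split into numbers. This is a minimal sketch and not mraa's own code: `parse_kernel_release` and `looks_like_vanilla` are hypothetical names, and the threshold of 4 is taken only from the "Linux version 4 or higher detected" message quoted earlier in this thread.

```python
import re


def parse_kernel_release(release):
    """Split a release string such as '5.5.0-edison-acpi-standard' into integers."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unexpected kernel release: {release!r}")
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def looks_like_vanilla(release):
    """Assumption from the log message in this thread: major version 4 or higher."""
    return parse_kernel_release(release)[0] >= 4
```

On the board, `parse_kernel_release(os.uname().release)` (after `import os`) would give the values for the running kernel.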

htot commented 3 years ago

I am on 5.11, but it's a bit of a pain. IIRC 5.6 was the last one that worked well with respect to gadget mode. For the LED there should not be much difference.

mwallnoefer commented 3 years ago

Okay, so this means that although I have switched to Yocto Dunfell I still run the old 5.5 kernel? This implies that bitbake didn't get the change, interesting...

So there is nothing left to do but a complete rebuild from scratch 😐, although it takes quite a few hours to complete. That will be the only solution, correct?

htot commented 3 years ago

With respect to mraa, I guess not, except for making gpio work again.

mwallnoefer commented 3 years ago

Well, how could I get bitbake to build libmraa with debug symbols? Then at least we would know the exact position where it crashes...

htot commented 3 years ago

I don't know. Do they package debug symbols separately?

But it seems to be crashing in is_arduino_board(). Strangely, in my case it doesn't.

mwallnoefer commented 3 years ago

Okay, I have found the debug symbols in tmp/deploy/deb/corei7-64/libmraa-dbg_2.0.0+git0+967585c9ea-r0_amd64.deb under the bitbake build.

Now the complete gdb output:

Program received signal SIGSEGV, Segmentation fault.
#0  0x00007ffff7f8fadd in mraa_gpio_init_internal (func_table=0x0, pin=214)
    at /usr/src/debug/mraa/2.0.0+gitAUTOINC+967585c9ea-r0/git/src/gpio/gpio.c:117
        status = MRAA_SUCCESS
        dev = 0x55555556c050
#1  0x00007ffff7f8fdbd in mraa_gpio_init_raw (pin=pin@entry=214)
    at /usr/src/debug/mraa/2.0.0+gitAUTOINC+967585c9ea-r0/git/src/gpio/gpio.c:446
No locals.
#2  0x00007ffff7fa632a in is_arduino_board ()
    at /usr/src/debug/mraa/2.0.0+gitAUTOINC+967585c9ea-r0/git/src/x86/intel_edison_fab_c.c:1264
        gpiochip_path = "\300\333VUUU\000\000\260\256VUUU\000\000\030\244\373\367\377\177\000\000\000\000\000\000\000\000\000\000\260\352\377\377\377\177\000\000\302\303\267\367\377\177\000\000\a\000\000\000\000\000\000\000\030\000\000\000\060\000\000"
        gpiochip_label = "\340\350\377\377\377\177\000\000 \350\377\377\377\177\000\000\354\350\377\377\377\177\000\000\320\360\303\367\377\177\000\000 \006", '\000' <repeats 14 times>, "\231\231\231\231\231\231\231\031\005\000\000\000\000\000\000"
        gpiochip_label_arduino = "pcal9555a"
        gpiochip_idx = {200, 216, 232, 248}
        format_str = "%63s\000\000\000\000\003\000\000\000\000\000\000\000|\000\000\000w\000\000\000n\000\000\000^", '\000' <repeats 11 times>, "@\236\000\000\000\000\000\000\340\351\303\367\377\177\000\000\360\005\000\000\000\000\000"
        i = <optimized out>
        ret = <optimized out>
        errno_saved = <optimized out>
#3  0x00007ffff7fa670c in mraa_intel_edison_fab_c ()
    at /usr/src/debug/mraa/2.0.0+gitAUTOINC+967585c9ea-r0/git/src/x86/intel_edison_fab_c.c:1346
        tristate_dir = 4096
        name = {sysname = "Linux", '\000' <repeats 59 times>, nodename = "edison", '\000' <repeats 58 times>, 
          release = "5.5.0-edison-acpi-standard", '\000' <repeats 38 times>, 
          version = "#1 SMP Sat Mar 14 12:18:36 UTC 2020", '\000' <repeats 29 times>, 
          machine = "x86_64", '\000' <repeats 58 times>, __domainname = "home-life.hub", '\000' <repeats 51 times>}
        major = 5
        minor = 5
        release = 0
        ret = <optimized out>
        b = 0x55555556dbc0
        ici = <optimized out>
        il = <optimized out>
#4  0x00007ffff7fa1865 in mraa_x86_platform ()
    at /usr/src/debug/mraa/2.0.0+gitAUTOINC+967585c9ea-r0/git/src/x86/x86.c:65
        platform_type = MRAA_INTEL_EDISON_FAB_C
        line = 0x55555556b4d0 "BODEGA BAY"
        len = 120
        fh = 0x55555556b2c0
#5  0x00007ffff7f8debf in imraa_init () at /usr/src/debug/mraa/2.0.0+gitAUTOINC+967585c9ea-r0/git/src/mraa.c:165
        env_var = <optimized out>
        platform_type = <optimized out>
        proc_euid = 0
        proc_user = 0x7ffff7c41fc0
        env_var = <optimized out>
        platform_type = <optimized out>
        proc_euid = <optimized out>
        proc_user = <optimized out>
        length = <optimized out>
#6  imraa_init () at /usr/src/debug/mraa/2.0.0+gitAUTOINC+967585c9ea-r0/git/src/mraa.c:128
        env_var = <optimized out>
        platform_type = <optimized out>
        proc_euid = <optimized out>
        proc_user = <optimized out>
        length = <optimized out>
#7  0x00007ffff7fe364a in ?? () from /lib/ld-linux-x86-64.so.2
No symbol table info available.
#8  0x00007ffff7fe3749 in ?? () from /lib/ld-linux-x86-64.so.2

mwallnoefer commented 3 years ago

It is failing here: https://github.com/eclipse/mraa/blob/967585c9ea0e1a8818d2172d2395d8502f6180a2/src/gpio/gpio.c#L117

htot commented 3 years ago

Are you sure? 967585c corresponds to mraa 2.0.0, which is in Zeus. Dunfell (which I am using now) has 2.1.0 (https://layers.openembedded.org/layerindex/branch/dunfell/recipes/?q=mraa).

If you are using Zeus, this commit in 2.1.0 probably fixes the crash: https://github.com/eclipse/mraa/commit/9fe2883e6afec0b8b5e8ce432a77476ac737c3f8

This does touch on the real problem with gpio in mraa on Edison: the kernel now supports chardev, but mraa doesn't know.
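
The version mismatch can also be read straight off the debug package name posted earlier. This is a small sketch, assuming the file name pattern seen in this thread (`name_version+gitN+rev-rN_arch.deb`); `parse_deb_name` and `has_upstream_version` are hypothetical helpers, and other recipes may name their packages differently.

```python
import re

# Pattern modelled only on the package file name shown in this thread.
PKG_RE = re.compile(
    r"^(?P<name>[^_]+)_(?P<upstream>[0-9.]+)\+git\d+\+(?P<rev>[0-9a-f]+)"
    r"-(?P<pr>r\d+)_(?P<arch>[^_.]+)\.deb$"
)


def parse_deb_name(filename):
    """Split a package file name into name, upstream version, git rev and arch."""
    m = PKG_RE.match(filename)
    if m is None:
        raise ValueError(f"unrecognised package name: {filename!r}")
    return m.groupdict()


def has_upstream_version(filename, expected):
    """Check whether the package carries the expected upstream version."""
    return parse_deb_name(filename)["upstream"] == expected
```

For `libmraa-dbg_2.0.0+git0+967585c9ea-r0_amd64.deb`, the upstream version comes out as 2.0.0 with revision 967585c9ea, so a check against 2.1.0 fails.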

mwallnoefer commented 3 years ago

Okay, thanks for the exhaustive explanation. I have no idea why I am still running an outdated libmraa version on my board; I will update it.

I guess that we could close the bug then?

htot commented 3 years ago

Let's see if running Dunfell fixes it. As you can see, I didn't test mraa on Zeus, and I only tried the example referenced above just now. Who knows what we will find. Interesting to see chardev support appearing in mraa; maybe it's feasible to get it to work on Edison.

htot commented 3 years ago

I see chardev support has been added to Joule, maybe we can use that as an example.

mwallnoefer commented 3 years ago

Ok, I have rebuilt the whole image and this is working now.