Lora-net / sx1302_hal

SX1302/SX1303 Hardware Abstraction Layer and Tools (packet forwarder...)

sx1302_hal's packet forwarder facing Segmentation Fault with Raspberry Pi (OpenWRT OS) + Semtech LoRa Corecell interface board via USB interface #115

Open mukeshbharath opened 10 months ago

mukeshbharath commented 10 months ago

I'm using a Raspberry Pi 3B+ running OpenWrt (v23.05.2) with a Semtech SX1302 LoRa Corecell interface board connected over USB, and I'm trying to set the pair up as a LoRa gateway (RPi + Semtech SX1302).

When I run the sx1302_hal LoRa packet forwarder, it always ends in a segmentation fault on reaching the uplink thread, thread_up():

root@OpenWrt:~/sx1302_hal/bin# ./lora_pkt_fwd -c global_conf.json.sx1250.US915.USB
*** Packet Forwarder ***
Version: 2.1.0
*** SX1302 HAL library version info ***
Version: 2.1.0;
***
INFO: Little endian host
INFO: found configuration file global_conf.json.sx1250.US915.USB, parsing it
INFO: global_conf.json.sx1250.US915.USB does contain a JSON object named SX130x_conf, parsing SX1302 parameters
INFO: com_type USB, com_path /dev/ttyACM0, lorawan_public 1, clksrc 0, full_duplex 0
INFO: antenna_gain 0 dBi
INFO: Configuring legacy timestamp
INFO: SX1261 spi_path is not configured in global_conf.json.sx1250.US915.USB
INFO: Configuring Tx Gain LUT for rf_chain 0 with 16 indexes for sx1250
INFO: radio 0 enabled (type SX1250), center frequency 904300000, RSSI offset -215.399994, tx enabled 1, single input mode 0
INFO: radio 1 enabled (type SX1250), center frequency 905000000, RSSI offset -215.399994, tx enabled 0, single input mode 0
INFO: Lora multi-SF channel 0>  radio 0, IF -400000 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora multi-SF channel 1>  radio 0, IF -200000 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora multi-SF channel 2>  radio 0, IF 0 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora multi-SF channel 3>  radio 0, IF 200000 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora multi-SF channel 4>  radio 1, IF -300000 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora multi-SF channel 5>  radio 1, IF -100000 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora multi-SF channel 6>  radio 1, IF 100000 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora multi-SF channel 7>  radio 1, IF 300000 Hz, 125 kHz bw, SF 5 to 12
INFO: Lora std channel> radio 0, IF 300000 Hz, 500000 Hz bw, SF 8, Explicit header
INFO: FSK channel 8 disabled
INFO: global_conf.json.sx1250.US915.USB does contain a JSON object named gateway_conf, parsing gateway parameters
INFO: gateway MAC address is configured to 0016C001F1007684
INFO: server hostname or IP address is configured to "nam1.cloud.thethings.network"
INFO: upstream port is configured to "1700"
INFO: downstream port is configured to "1700"
INFO: downstream keep-alive interval is configured to 10 seconds
INFO: statistics display interval is configured to 30 seconds
INFO: upstream PUSH_DATA time-out is configured to 100 ms
INFO: packets received with a valid CRC will be forwarded
INFO: packets received with a CRC error will NOT be forwarded
INFO: packets received with no CRC will NOT be forwarded
INFO: GPS serial port path is configured to "/dev/ttyS0"
INFO: Reference latitude is configured to 0.000000 deg
INFO: Reference longitude is configured to 0.000000 deg
INFO: Reference altitude is configured to 0 meters
INFO: Beaconing period is configured to 0 seconds
INFO: Beaconing signal will be emitted at 869525000 Hz
INFO: Beaconing datarate is set to SF9
INFO: Beaconing modulation bandwidth is set to 125000Hz
INFO: Beaconing TX power is set to 14dBm
INFO: Beaconing information descriptor is set to 0
INFO: global_conf.json.sx1250.US915.USB does contain a JSON object named debug_conf, parsing debug parameters
INFO: got 2 debug reference payload
INFO: reference payload ID 0 is 0xCAFE1234
INFO: reference payload ID 1 is 0xCAFE2345
INFO: setting debug log file name to loragw_hal.log
WARNING: [main] impossible to open /dev/ttyS0 for GPS sync (check permissions)
Opening USB communication interface
INFO: Configuring TTY
INFO: Flushing TTY
INFO: Setting TTY in blocking mode
INFO: Connect to MCU
INFO: Concentrator MCU version is V01.00.00
INFO: MCU status: sys_time:15110631 temperature:30.5oC
Note: chip version is 0x10 (v1.0)
INFO: using legacy timestamp
INFO: LoRa Service modem: configuring preamble size to 8 symbols
ARB: dual demodulation disabled for all SF
INFO: [main] concentrator started, packet can now be received
INFO: concentrator EUI: 0x0016c001f1007684
Segmentation fault

My Observations:

  1. The fault seems to be related to the buff_up buffer used in thread_up().
  2. The same setup works fine on Debian; I only see the issue on OpenWrt.
  3. As far as I can tell, the library files libpthread.a and librt.a on OpenWrt don't seem to be valid after installing them with opkg install. For example:
    root@OpenWrt:~/sx1302_hal/bin# find / -name libpthread.a
    /root/SDK/openwrt-sdk-armsr-armv7_gcc-12.3.0_musl_eabi.Linux-x86_64/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-12.3.0_musl_eabi/lib/libpthread.a
    /usr/lib/libpthread.a
    root@OpenWrt:~/sx1302_hal/bin# cat /root/SDK/openwrt-sdk-armsr-armv7_gcc-12.3.0_musl_eabi.Linux-x86_64/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-12.3.0_musl_eabi/lib/libpthread.a
    !<arch>                                                    <------------------
    root@OpenWrt:~/sx1302_hal/bin#
    root@OpenWrt:~/sx1302_hal/bin# cat /usr/lib/libpthread.a
    !<arch>                                                    <------------------
    root@OpenWrt:~/sx1302_hal/bin#
    root@OpenWrt:~/sx1302_hal/bin# find / -name librt.a
    /root/SDK/openwrt-sdk-armsr-armv7_gcc-12.3.0_musl_eabi.Linux-x86_64/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-12.3.0_musl_eabi/lib/librt.a
    /usr/lib/librt.a
    root@OpenWrt:~/sx1302_hal/bin# cat /root/SDK/openwrt-sdk-armsr-armv7_gcc-12.3.0_musl_eabi.Linux-x86_64/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-12.3.0_musl_eabi/lib/librt.a
    !<arch>                                                    <------------------
    root@OpenWrt:~/sx1302_hal/bin# cat /usr/lib/librt.a
    !<arch>                                                    <------------------
    root@OpenWrt:~/sx1302_hal/bin#
    root@OpenWrt:~/sx1302_hal/bin#

Please share if you have any insights about this. Any help is greatly appreciated!

Benjie56 commented 9 months ago

I'm getting a Segmentation fault that looks the same (same place in the log files). I'm trying to install on Void Linux (musl).

libpthread.a (and presumably librt.a) is supposed to be empty, since its functionality is built into musl's libc (https://forum.openwrt.org/t/how-to-get-libpthread/116587).

I'm going to try again with the non-musl version of Void...

Edit: I can confirm the issue is resolved when using glibc instead of musl.
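
One way to compare the two C libraries is to print the default stack size each one gives to new threads. The sketch below is not part of sx1302_hal; the helper name `default_thread_stack_size` is hypothetical. As far as I know, musl's default is small (commonly reported as 128 KiB, less in older releases), while glibc derives it from the process stack limit, which is often 8 MiB. Treat those figures as assumptions to verify on your own system.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: returns the default stack size (in bytes) that the
 * C library assigns to newly created threads, or 0 if the query fails. */
size_t default_thread_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    /* A freshly initialised attribute object carries the library defaults. */
    if (pthread_attr_init(&attr) != 0)
        return 0;

    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;

    pthread_attr_destroy(&attr);
    return size;
}
```

Printing the result with `printf("%zu\n", default_thread_stack_size());` on both the musl and glibc systems should show whether the thread stack is large enough for the buffers declared inside thread_up().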

mgrubertirol commented 2 months ago

My fix for musl was to move uint8_t buff_up out of the thread function and into global scope.

cb-xiong commented 2 months ago

> My fix for musl was to move uint8_t buff_up out of the thread function and into global scope.

I hit the same segmentation fault with musl, and it seems to be fixed after moving "uint8_t buff_up" out of the thread function and into global scope. Could you share more detail about the root cause?

mukeshbharath commented 2 months ago

Hi @cb-xiong, the reason is that the stack of the thread function (thread_up()) overflows when buff_up is declared inside it, because the buffer is large and lives on the thread's stack. The thread's stack can't hold it, which leads to the segmentation fault. Declaring buff_up globally (i.e., outside the thread function) moves it off the stack and avoids the seg fault.
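
A minimal sketch of that workaround, independent of the packet forwarder sources: the buffer is given static storage so it no longer counts against the thread's stack. The names `uplink_buf`, `uplink_worker`, `run_uplink_worker` and the size `UPLINK_BUF_SIZE` are made up for illustration and are not taken from lora_pkt_fwd.c.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UPLINK_BUF_SIZE 140000u /* hypothetical size, larger than a small thread stack */

/* Static storage: the buffer is placed in the program's data segment,
 * not on the stack of whichever thread uses it. */
static uint8_t uplink_buf[UPLINK_BUF_SIZE];

static void *uplink_worker(void *arg)
{
    (void)arg;
    /* Touch the whole buffer, as a real worker would when building a datagram. */
    memset(uplink_buf, 0xA5, sizeof uplink_buf);
    return (void *)(uintptr_t)uplink_buf[UPLINK_BUF_SIZE - 1];
}

/* Starts the worker with default thread attributes and returns the last
 * byte it wrote, or -1 if the thread could not be created or joined. */
int run_uplink_worker(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, uplink_worker, NULL) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (int)(uintptr_t)result;
}
```

The trade-off is that a static buffer is shared by every thread running the same function, so this only stays safe while a single instance of the worker exists.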

cb-xiong commented 2 months ago

> Hi @cb-xiong, the reason is that the stack of the thread function (thread_up()) overflows when buff_up is declared inside it, because the buffer is large and lives on the thread's stack. The thread's stack can't hold it, which leads to the segmentation fault. Declaring buff_up globally (i.e., outside the thread function) moves it off the stack and avoids the seg fault.

Thank you @mukeshbharath for the comment, it helps a lot.

cpatulea commented 1 month ago

Increase the thread stack size like this (rough code, needs cleanup):

commit 264b355d058f4e66bac53faf570e5c2c04a9e7a8
Author: Catalin Patulea <cronos586@gmail.com>
Date:   Sun Oct 6 15:50:38 2024 -0400

    Increase stack size for thread_up, otherwise Segfault.

diff --git a/packet_forwarder/src/lora_pkt_fwd.c b/packet_forwarder/src/lora_pkt_fwd.c
index 53661de..48dbe01 100644
--- a/packet_forwarder/src/lora_pkt_fwd.c
+++ b/packet_forwarder/src/lora_pkt_fwd.c
@@ -1670,8 +1670,24 @@ int main(int argc, char ** argv)
         printf("INFO: concentrator EUI: 0x%016" PRIx64 "\n", eui);
     }

+   pthread_attr_t attr;
+   pthread_t      thid;
+
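+   i = pthread_attr_init(&attr);
+   if (i != 0) { /* pthread functions return an error number, not -1 */
+      MSG("ERROR: [main] pthread_attr_init failed (%d)\n", i);
+      exit(1);
+   }
+
+   size_t s1 = 409600; /* 400 KiB stack for thread_up */
+   i = pthread_attr_setstacksize(&attr, s1);
+   if (i != 0) {
+      MSG("ERROR: [main] pthread_attr_setstacksize failed (%d)\n", i);
+      exit(2);
+   }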
+
     /* spawn threads to manage upstream and downstream */
-    i = pthread_create(&thrid_up, NULL, (void * (*)(void *))thread_up, NULL);
+    i = pthread_create(&thrid_up, &attr, (void * (*)(void *))thread_up, NULL);
     if (i != 0) {
         MSG("ERROR: [main] impossible to create upstream thread\n");
         exit(EXIT_FAILURE);
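
For reference, the same idea can be written as a small standalone helper. This is only a sketch and not part of the repository: `create_thread_with_stack`, `big_local_worker` and `demo_big_local` are hypothetical names, and the 140000-byte local buffer is an illustrative stand-in for a large buffer such as buff_up. Note that the pthread_* functions report failure by returning an error number directly rather than returning -1 and setting errno, so the checks compare against 0.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: create a thread with an explicit stack size.
 * Returns 0 on success or the pthread error number on failure. */
int create_thread_with_stack(pthread_t *tid, size_t stack_bytes,
                             void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(&attr, stack_bytes);
    if (err == 0)
        err = pthread_create(tid, &attr, fn, arg);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return err;
}

/* Demo worker that keeps a large buffer on its own stack. */
static void *big_local_worker(void *arg)
{
    volatile uint8_t buf[140000]; /* illustrative size only */
    size_t n;

    (void)arg;
    for (n = 0; n < sizeof buf; n++)
        buf[n] = 0x5A;
    return (void *)(uintptr_t)buf[sizeof buf - 1];
}

/* Runs the demo worker on a 400 KiB stack (the value used in the commit
 * above) and returns the last byte it wrote, or -1 on failure. */
int demo_big_local(void)
{
    pthread_t tid;
    void *result = NULL;

    if (create_thread_with_stack(&tid, 400 * 1024, big_local_worker, NULL) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (int)(uintptr_t)result;
}
```

Compared with moving the buffer into global scope, this keeps the buffer private to the thread, at the cost of having to pick a stack size that covers every local the thread function declares.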