wb2osz / direwolf

Dire Wolf is a software "soundcard" AX.25 packet modem/TNC and APRS encoder/decoder. It can be used stand-alone to observe APRS traffic, as a tracker, digipeater, APRStt gateway, or Internet Gateway (IGate). For more information, look at the bottom 1/4 of this page and in https://github.com/wb2osz/direwolf/blob/dev/doc/README.md
GNU General Public License v2.0

Segmentation Fault when following Bluetooth-KISS-TNC.pdf on Pi 0w #135

Open n0mo opened 6 years ago

n0mo commented 6 years ago

Running a Pi 0w, I get a Segmentation fault shortly after starting direwolf:

pi@n0mo_tnc_01:~ $ direwolf
Dire Wolf version 1.5 (Feb 24 2018) Beta Test 2
Includes optional support for: gpsd

Reading config file direwolf.conf
Audio device for both receive and transmit: plughw:1,0 (channel 0)
Channel 0: 1200 baud, AFSK 1200 & 2200 Hz, E+, 44100 sample rate / 3.
Will be checking periodically for /dev/rfcomm0
Ready to accept KISS TCP client application 0 on port 8001 ...
Ready to accept AGW client application 0 on port 8000 ...
Segmentation fault

I have tried commenting out SERIALKISSPOLL /dev/rfcomm0, but the issue remains.

n0mo commented 6 years ago

direwolf.conf.txt Added txt extension for upload only.

dranch commented 6 years ago

Please enable core dumps on your Rpi ZeroW and send a backtrace to us:

https://blog.xojo.com/2015/08/17/take-a-core-dump-what-to-do-when-your-app-crashes-on-linux/

--David KI6ZHD

n0mo commented 6 years ago

Attached. core.gz

dranch commented 6 years ago

Please just provide the text from the gdb backtrace, not the core itself.

n0mo commented 6 years ago

(gdb) bt
#0  0x00024458 in hdlc_rec_bit ()
#1  0x00014820 in demod_afsk_process_sample ()
#2  0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

na7q commented 6 years ago

Getting the same issue here on my pi zero with the dev 1.5 version.

dranch commented 6 years ago

I forgot to ask before, but can you recompile direwolf with debugging symbols (add -g at the make line)? When it core dumps again, run the backtrace once more and we should see more information than just "??".

cniesen commented 6 years ago

I have the same/similar issue with 1.5b2. 1.4 (release) is working fine. Here are the outputs:

(dev branch at commit 182713f423bbb10c6db529dbf2212d0da2fd11a2 :)

pi@aprs-igate:~ $ rtl_fm -f 144.39M - | direwolf -c sdr.conf -r 24000 -D 1 -t 0 -
Dire Wolf version 1.5 (Mar 31 2018) Beta Test 2

Reading config file sdr.conf
Audio input device for receive: stdin  (channel 0)
Audio out device for transmit: null  (channel 0)
Found 1 device(s):
Channel 0: 1200 baud, AFSK 1200 & 2200 Hz, E+, 24000 sample rate.
Note: PTT not configured for channel 0. (Ignore this if using VOX.)
Ready to accept AGW client application 0 on port 8000 ...
Ready to accept KISS TCP client application 0 on port 8001 ...
  0:  Realtek, RTL2838UHIDIR, SN: 00000001

Using device 0: Generic RTL2832U OEM
Found Rafael Micro R820T tuner
Tuner gain set to automatic.
Tuned to 144642000 Hz.
Oversampling input by: 42x.
Oversampling output by: 1x.
Buffer size: 8.13ms
Exact sample rate is: 1008000.009613 Hz
Sampling at 1008000 S/s.
Output at 24000 Hz.
Signal caught, exiting!

User cancel, exiting...
Segmentation fault
pi@aprs-igate:~ $

(master branch at commit 23ea24641deefb67f9f5c3181f012ad7bfe2b287 :)

pi@aprs-igate:~/direwolf $ rtl_fm -f 144.39M - | direwolf -c sdr.conf -r 24000 -D 1 -t 0 -
Dire Wolf version 1.4

Reading config file sdr.conf
Audio input device for receive: stdin  (channel 0)
Audio out device for transmit: null  (channel 0)
Channel 0: 1200 baud, AFSK 1200 & 2200 Hz, E+, 24000 sample rate.
Note: PTT not configured for channel 0. (Ignore this if using VOX.)
Found 1 device(s):
Use -p command line option to enable KISS pseudo terminal.
Ready to accept AGW client application 0 on port 8000 ...
Ready to accept KISS client application on port 8001 ...
  0:  Realtek, RTL2838UHIDIR, SN: 00000001

Using device 0: Generic RTL2832U OEM
Found Rafael Micro R820T tuner
Tuner gain set to automatic.
Tuned to 144642000 Hz.
Oversampling input by: 42x.
Oversampling output by: 1x.
Buffer size: 8.13ms
Exact sample rate is: 1008000.009613 Hz
Sampling at 1008000 S/s.
Output at 24000 Hz.
Connect to IGate server noam.aprs2.net (2607:7c80:54:1::21) failed.

Connect to IGate server noam.aprs2.net (199.167.130.10) failed.

Connect to IGate server noam.aprs2.net (205.206.140.4) failed.

Connect to IGate server noam.aprs2.net (165.138.206.199) failed.

Connect to IGate server noam.aprs2.net (45.79.213.91) failed.

Connect to IGate server noam.aprs2.net (2001:56a:f326:af00:97f3:6b71:732:9e3c) failed.

Connect to IGate server noam.aprs2.net (100.16.163.97) failed.

Connect to IGate server noam.aprs2.net (44.24.241.98) failed.

Connect to IGate server noam.aprs2.net (208.94.241.11) failed.

Connect to IGate server noam.aprs2.net (74.208.165.54) failed.

Now connected to IGate server noam.aprs2.net (205.206.140.4)
Check server status here http://205.206.140.4:14501

[ig] # aprsc 2.1.2-gc90ee9c
[ig] # logresp xxx unverified, server T2EDM

Sorry, I would need some "hand holding" to compile it with the debugging objects and gdb (-g isn't an option on make).

dranch commented 6 years ago

I assume you built your own version of Direwolf before. If that's not true, let me know and I can add more details. First, I recommend uninstalling your previous version of Direwolf to remove any stray binaries (don't forget to back up your direwolf.conf). Next, when you untar and prepare to build Direwolf 1.5B2, edit the Makefile.linux file. In that file, find the line:

CFLAGS += -O3 -pthread -Igeotranz -D_XOPEN_SOURCE=600 -D_DEFAULT_SOURCE=1 -Wall

and change it to read

CFLAGS += -O0 -g -pthread -Igeotranz -D_XOPEN_SOURCE=600 -D_DEFAULT_SOURCE=1 -Wall

(This sets the -O optimization level to zero and enables debugging symbols with -g.)
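The same edit can be scripted instead of done by hand. This is only a minimal sketch: it assumes the CFLAGS line contains -O3 exactly as shown above, and the function name enable_debug_flags is made up for illustration. It keeps a backup of the file as Makefile.linux.bak.

```shell
# Hypothetical helper: swap -O3 for "-O0 -g" on CFLAGS lines of a makefile.
# A backup copy is kept next to the original as <file>.bak.
enable_debug_flags() {
    makefile=$1
    cp "$makefile" "$makefile.bak" || return 1
    # Only touch lines that start with CFLAGS; everything else is left alone.
    sed '/^CFLAGS/ s/-O3/-O0 -g/' "$makefile.bak" > "$makefile"
}

# Usage (from the top of the source tree):
#   enable_debug_flags Makefile.linux
#   grep '^CFLAGS' Makefile.linux
```

Checking the result with grep before building is worthwhile, since the sed expression silently does nothing if the line differs from the one quoted here.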

Now build Direwolf with:

make -f Makefile.linux

The resulting Direwolf binary should now contain debugging symbols. Next, turn on core dumps on your Raspberry Pi by running the following at the command line:

ulimit -c unlimited
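It is worth confirming that the limit took effect before trying to reproduce the crash, because a core size limit of 0 means no core file is written at all. A small sketch of such a check follows; the helper name core_dumps_enabled is hypothetical.

```shell
# Hypothetical helper: succeeds when this shell is allowed to write core files.
core_dumps_enabled() {
    # "ulimit -c" prints the soft limit; 0 means no core file is written.
    [ "$(ulimit -c)" != "0" ]
}

if core_dumps_enabled; then
    echo "core dumps enabled (limit: $(ulimit -c))"
else
    echo "core dumps disabled; run: ulimit -c unlimited"
fi
```

The limit applies only to the current shell and the programs started from it, so Direwolf has to be launched from the same terminal session.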

Finally, run your original command line again. When Direwolf crashes, find the core file (it should be in the current directory):

ls -la core.*

Let's assume it's named "core.1234". We also need to know where the direwolf binary is. Run the next command to find it:

whereis direwolf
direwolf: /usr/local/bin/direwolf

Now run the gdb debugger to see what's having an issue:

gdb -c core.1234 /usr/local/bin/direwolf

Once the program loads, run the command "bt" (for backtrace) and send the results back to this GitHub issue.
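If an interactive gdb session is awkward on the Pi, the backtrace can also be produced in one shot using gdb's -batch and -ex options. The wrapper below is just a sketch, and the function name print_backtrace is made up for this example.

```shell
# Hypothetical helper: print a backtrace from a core file without an
# interactive gdb session. Arguments: path to the binary, path to the core.
print_backtrace() {
    binary=$1
    core=$2
    [ -r "$binary" ] || { echo "binary not readable: $binary" >&2; return 1; }
    [ -r "$core" ] || { echo "core file not readable: $core" >&2; return 1; }
    # -batch exits after running the commands; -ex bt runs "bt" once loaded.
    gdb -batch -ex bt "$binary" "$core"
}

# Usage, with the example names from above:
#   print_backtrace /usr/local/bin/direwolf core.1234 > backtrace.txt
```

Redirecting the output to a file makes it easy to paste the text into the issue.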

--David KI6ZHD

nojronatron commented 6 years ago

I'm running DW 1.4 on Stretch on an RPi 2 B+. I'm returning to Direwolf after a vacation, but last time it was on Jessie and DW 1.3 (before I had any sort of handle on what I was doing).

I've run into "Segmentation Fault" and unexpected exit when monitoring a frequency with APRS traffic, and I have CDIGIPEAT 0 0 set in the config.

Testing with CDIGIPEAT 0 0 on a frequency with APRS traffic, with or without DIGIPEAT configured, consistently causes Direwolf to exit with a Segmentation Fault, almost always after the first APRS packet is heard.

Testing with CDIGIPEAT 0 0 on a frequency without APRS traffic, with or without DIGIPEAT configured, does NOT cause Segmentation Fault nor an abrupt exit (after an hour or so of observation).

Testing with CDIGIPEAT commented-out, with or without DIGIPEAT enabled on any frequency, does NOT cause Segmentation Fault nor an abrupt exit (after some time of observation).

I've noticed the same behavior with the 1.4 beta & 1.5-beta and beta2 after clean-up and rebuild.

Just out of curiosity, in case it matters: where should CDIGIPEAT be put in the config file? I've added it a few lines below a valid DIGIPEAT entry (with arguments as listed in the instruction manual).

dranch commented 6 years ago

There was a reported issue with CDIGIPEAT that should have been fixed in Direwolf v1.5Beta2:

On CDIGIPEAT: according to the doc, MYCALL is implied, so CDIGIPEAT 0 0 should just work; however, it causes a SEGFAULT (previously discussed). I needed to use , and CDIGIPEAT needs something in the 3rd parameter to filter on:

CDIGIPEAT 0 0 ^VK1RGI-1$

Please give that a try and let us know if it fixes your issue.
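To see what a pattern like that would accept, it can be tried against a few callsigns with grep -E. This is only an illustration: grep stands in for Dire Wolf's own matching (that it behaves like an extended regular expression is an assumption here), and the helper name matches_pattern and the extra test callsigns are made up.

```shell
# Hypothetical check: does a callsign match the CDIGIPEAT pattern?
# grep -E stands in for Dire Wolf's own matching, which is an assumption.
matches_pattern() {
    printf '%s\n' "$1" | grep -Eq -e "$2"
}

pattern='^VK1RGI-1$'
for call in VK1RGI-1 VK1RGI-10 VK1RGI; do
    if matches_pattern "$call" "$pattern"; then
        echo "$call: match"
    else
        echo "$call: no match"
    fi
done
```

Because the pattern is anchored with ^ and $, only the exact string VK1RGI-1 matches; without the anchors, VK1RGI-10 would match as well.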

--David KI6ZHD

nojronatron commented 6 years ago

Seems to be working, thanks David!