DSheirer / sdrtrunk

A cross-platform Java application for decoding, monitoring, recording and streaming trunked mobile and related radio protocols using Software Defined Radios (SDR).
GNU General Public License v3.0

P25 P1 Channels Stuck in Teardown #1992

Closed: tadscottsmith closed this issue 6 days ago

tadscottsmith commented 1 month ago

sdrtrunk Version: Currently running commit 130e59f4c16189f4dbae011509108f80965f9928.

Describe the bug: Occasionally, likely due to temporary high resource utilization, a channel will get stuck in Teardown state. As far as I can tell, a channel stuck in this state will never release until the application is restarted. I have seen channels stuck in this state for hours, and even as new calls are received and processed on the channel, the state does not change. It appears that calls received on a channel in Teardown state are not recorded or broadcast.

Expected behavior: After a given time period, it would be nice if the application tried to tear down the channel again. If the application sees a new call on a channel that is in "Teardown" state, it should reset the state and begin recording the call.
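The requested behavior can be illustrated with a small state holder. This is only a sketch under stated assumptions: `ChannelStateTracker`, its state names, and its methods are hypothetical and are not taken from the sdrtrunk source. It shows a timeout check that signals when teardown should be retried, and a reset when a new call arrives on a channel that is still in TEARDOWN.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical channel state holder; NOT sdrtrunk's actual state machine. */
final class ChannelStateTracker {
    enum State { IDLE, CALL, TEARDOWN }

    private final Duration teardownTimeout;
    private State state = State.IDLE;
    private Instant teardownStarted;

    ChannelStateTracker(Duration teardownTimeout) {
        this.teardownTimeout = teardownTimeout;
    }

    synchronized State state() {
        return state;
    }

    /** Called when the channel is asked to release. */
    synchronized void beginTeardown(Instant now) {
        state = State.TEARDOWN;
        teardownStarted = now;
    }

    /** Called when teardown finishes normally. */
    synchronized void teardownComplete() {
        state = State.IDLE;
        teardownStarted = null;
    }

    /** A new call resets the state, even if the channel was stuck in TEARDOWN. */
    synchronized void callStarted() {
        state = State.CALL;
        teardownStarted = null;
    }

    /** Periodic check: returns true if teardown has timed out and should be retried. */
    synchronized boolean checkTeardownTimeout(Instant now) {
        if (state == State.TEARDOWN
                && Duration.between(teardownStarted, now).compareTo(teardownTimeout) >= 0) {
            teardownStarted = now; // restart the timer for the retry
            return true;
        }
        return false;
    }
}
```

A scheduled task could call `checkTeardownTimeout(Instant.now())` every few seconds and re-issue the teardown request whenever it returns true; the timeout value itself would need tuning.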

Screenshots: Example of a channel in Teardown state, which is currently receiving and decoding a valid call. The call was not recorded or broadcast. 20240924_164412_screen_capture

Additional context: 130e59f4c16189f4dbae011509108f80965f9928

tadscottsmith commented 1 month ago

It appears this may be an expansion of a pretty old issue, #814. I will leave this open with the additional context that in this case calls are still being partially processed on the Teardown channel.

DSheirer commented 1 week ago

@tadscottsmith there is a new menu item under the 'File' menu called "Processing Diagnostic Report". This was added back in February 2024, so any nightly release or the 0.6.1-beta2 release should have the menu option.

The next time that you see a channel stuck in TEARDOWN state, can you click this menu item and then look in the Logs directory and send me the generated report?

marshyonline commented 1 week ago

I've been struggling with this issue now for some time. Attached are the two dumps. image

20241029_002701_sdrtrunk_processing_diagnostic_report.log 20241029_002706_sdrtrunk_thread_dump.log
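For readers unfamiliar with thread dumps like the one attached above, the standard JDK management API can produce comparable output. The following is a minimal sketch using `java.lang.management`; it is not sdrtrunk's report generator, and the class name `ThreadDumpSketch` is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Illustrative only; not how sdrtrunk builds its diagnostic report. */
final class ThreadDumpSketch {

    /** Returns a text dump of all live threads plus a count of deadlocked threads. */
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean monitors = bean.isObjectMonitorUsageSupported();
        boolean synchronizers = bean.isSynchronizerUsageSupported();

        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : bean.dumpAllThreads(monitors, synchronizers)) {
            sb.append(info); // ThreadInfo.toString() prints only the top stack frames
        }

        // findDeadlockedThreads() requires synchronizer usage support
        long[] deadlocked = synchronizers
                ? bean.findDeadlockedThreads()
                : bean.findMonitorDeadlockedThreads();
        sb.append("Deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length)
          .append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Threads shown as BLOCKED or WAITING on the same lock across several dumps are the usual place to start looking when a state never clears.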

tadscottsmith commented 1 week ago

Here's an example from my system. I've found it's pretty easy to get channels stuck in this state by mass editing aliases. If I select a couple of thousand of them and try to set them all to mute or unmute, the playlist editor will hang and cause the teardown issue.

20241028_105701_screen_capture 20241028_105708_sdrtrunk_thread_dump.log 20241028_105704_sdrtrunk_processing_diagnostic_report.log

marshyonline commented 1 week ago

I swapped my Channelizer Type from Polyphase to Heterodyne and I'm no longer seeing TEARDOWN issues. My only concern with Heterodyne is that I'm monitoring 3 sites in the same system, which can be quite busy, and I'm not sure how it will perform given the text descriptions of each Type.

DSheirer commented 1 week ago

@tadscottsmith @marshyonline I built a special version of the beta 2 release for Windows 10 x86-64 that you can access here on my Google Drive: https://drive.google.com/drive/folders/130eCl0Bv-G7RlsBMtIfiP9ZhdCmg_FpL?usp=sharing

If you don't want to use this version, you can check out branch "1992-channel-stuck-in-teardown" and build your own version via the command line (gradle runtimeZipWindows).

Can you please run this version using the Polyphase Channelizer, and run it until one or more channels get stuck in teardown and then create the diagnostic report like you just did. Please post the diagnostic report(s) and the application log.

There's a chance that you won't get it to a state where channels are stuck in TEARDOWN because I enhanced the error handling in a couple places. If those enhancements are indeed catching the errors, you should see logging to that effect in the application log. In that case, please just send the application log so that I can see where it caught the error(s).

marshyonline commented 1 week ago

@DSheirer Ran the new build; the first 2 calls I had got stuck. Logs below for you, sir: 20241029_211737_sdrtrunk_processing_diagnostic_report.log 20241029_211742_sdrtrunk_thread_dump.log sdrtrunk_app.log

EDIT: Had another crack just to make sure I didn't break anything, but same result: 20241029_212736_sdrtrunk_processing_diagnostic_report.log 20241029_212740_sdrtrunk_thread_dump.log sdrtrunk_app.log

DSheirer commented 1 week ago

@marshyonline @tadscottsmith I made more changes and posted another custom build to my google drive. Can you repeat the process again and post logs? Channels may not get stuck now.

https://drive.google.com/drive/folders/130eCl0Bv-G7RlsBMtIfiP9ZhdCmg_FpL?usp=sharing

marshyonline commented 1 week ago

Winner Winner Chicken Dinner!

So far so good, no channels stuck in TEARDOWN. I will leave it running overnight and report back in the morning. Thank you!

DSheirer commented 1 week ago

@marshyonline if you have a chance, can you post your application log. I'm trying to figure out what process upstream was failing that caused this to happen in the first place.

marshyonline commented 1 week ago

From which build: the current beta release, fix 1, or fix 2?

marshyonline commented 1 week ago

Seem to have hit a snag with fix 2. Everything has locked up, as per the screenshot: image 20241029_230517_sdrtrunk_processing_diagnostic_report.log 20241029_230522_sdrtrunk_thread_dump.log sdrtrunk_app.log

DSheirer commented 1 week ago

Can you generate a diagnostic report or is the app fully locked up?

Can you check to see if a diagnostic report was auto-generated in the Logs directory?

To answer your recent question: the app log from the latest test build.

marshyonline commented 1 week ago

The application hadn't locked up, no, but it lost all its tuners and was no longer processing any data. I was able to generate the reports and attached them below my screenshot :)

tadscottsmith commented 6 days ago

I have not been able to recreate the issue of channels stuck in TEARDOWN with your latest build. It's been running almost 6 hours with no issues.

DSheirer commented 6 days ago

@tadscottsmith can you send your app log once you're done? I added some logging and I'm curious if any of it triggered.

tadscottsmith commented 6 days ago

I am not seeing anything logged, but I'm not able to reproduce the stuck TEARDOWN state that was easily reproducible yesterday. Here are the logs from a recent run that I did alongside a CPU stress tester. I was able to get the Playlist Editor to completely hang for a long period, and could even see the waterfall stutter, but calls continued to clean themselves up and nothing got stuck in TEARDOWN.

The other instance that's been running almost 7 hours now ends with the same ComplexPolyphaseChannelizerM2 log entry.


```
2024-10-29 16:36:21.264 INFO  i.g.d.icon.IconModel - loading icons file [C:\Users\Radio\SDRTrunk\settings\icons.xml]  [14MB/256MB 5%]
2024-10-29 16:36:21.272 INFO  i.g.d.icon.IconModel - Icons file not found at [C:\Users\Radio\SDRTrunk\settings\icons.xml]  [15MB/256MB 6%]
2024-10-29 16:36:21.840 INFO  i.g.d.log.ApplicationLog - Application Log File: C:\Users\Radio\SDRTrunk\logs\sdrtrunk_app.log  [22MB/256MB 8%]
2024-10-29 16:36:21.937 INFO  i.g.d.log.ApplicationLog - Failed to find build information.  [8MB/256MB 3%]
2024-10-29 16:36:21.938 INFO  i.g.d.log.ApplicationLog -   [8MB/256MB 3%]
2024-10-29 16:36:21.938 INFO  i.g.d.log.ApplicationLog - *******************************************************************  [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO  i.g.d.log.ApplicationLog - **** sdrtrunk: a trunked radio and digital decoding application ***  [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO  i.g.d.log.ApplicationLog - ****  website: https://github.com/dsheirer/sdrtrunk             ***  [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO  i.g.d.log.ApplicationLog - *******************************************************************  [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO  i.g.d.log.ApplicationLog - Memory Logging Format: [Used/Allocated PercentUsed%]  [8MB/256MB 3%]
2024-10-29 16:36:21.940 INFO  i.g.d.log.ApplicationLog - Host OS Name:          Windows 10  [8MB/256MB 3%]
2024-10-29 16:36:21.941 INFO  i.g.d.log.ApplicationLog - Host OS Arch:          amd64  [8MB/256MB 3%]
2024-10-29 16:36:21.942 INFO  i.g.d.log.ApplicationLog - Host OS Version:       10.0  [8MB/256MB 3%]
2024-10-29 16:36:21.943 INFO  i.g.d.log.ApplicationLog - Host CPU Cores:        8  [8MB/256MB 3%]
2024-10-29 16:36:21.943 INFO  i.g.d.log.ApplicationLog - Host Max Java Memory:  3 GB  [8MB/256MB 3%]
2024-10-29 16:36:21.944 INFO  i.g.d.log.ApplicationLog - Storage Directories:  [8MB/256MB 3%]
2024-10-29 16:36:21.945 INFO  i.g.d.log.ApplicationLog -  Application Root: C:\Users\Radio\SDRTrunk  [8MB/256MB 3%]
2024-10-29 16:36:21.945 INFO  i.g.d.log.ApplicationLog -  Application Log:  C:\Users\Radio\SDRTrunk\logs  [8MB/256MB 3%]
2024-10-29 16:36:21.946 INFO  i.g.d.log.ApplicationLog -  Event Log:        C:\Users\Radio\SDRTrunk\event_logs  [8MB/256MB 3%]
2024-10-29 16:36:21.948 INFO  i.g.d.log.ApplicationLog -  Playlist:         C:\Users\Radio\SDRTrunk\playlist  [8MB/256MB 3%]
2024-10-29 16:36:21.948 INFO  i.g.d.log.ApplicationLog -  Recordings:       C:\Users\Radio\SDRTrunk\recordings  [8MB/256MB 3%]
2024-10-29 16:36:21.982 INFO  i.g.d.s.t.s.a.SDRPlayLibraryHelper - SDRPlay API native library not found at: C:\Program Files\SDRplay\API\x64\sdrplay_api  [9MB/256MB 3%]
2024-10-29 16:36:22.002 INFO  i.g.d.util.ThreadPool - Application thread pool created SCHEDULED and CACHED executors threads  [9MB/256MB 3%]
2024-10-29 16:36:22.005 INFO  i.g.d.p.SystemProperties - SystemProperties - loaded [C:\Users\Radio\SDRTrunk\SDRTrunk.properties]  [9MB/256MB 3%]
2024-10-29 16:36:22.467 INFO  i.g.d.s.t.m.TunerManager - Discovering tuners ...  [8MB/256MB 3%]
2024-10-29 16:36:22.550 INFO  i.g.d.s.t.m.TunerManager - LibUsb API Version: 1.0.262  [9MB/256MB 3%]
2024-10-29 16:36:22.551 INFO  i.g.d.s.t.m.TunerManager - LibUsb Version: 1.0.22.11312  [9MB/256MB 3%]
2024-10-29 16:36:22.584 INFO  i.g.d.s.t.m.TunerManager - LibUsb - discovered [11] potential usb devices  [9MB/256MB 3%]
2024-10-29 16:36:22.592 INFO  i.g.d.s.t.m.TunerManager - Discovered tuner at USB Bus [1] Port [1.4] Tuner Class [RTL-2832]  [9MB/256MB 3%]
2024-10-29 16:36:22.598 INFO  i.g.d.s.t.m.TunerManager - Tuner: USB Tuner - RTL-2832 USB Bus:1 Port:1.4 - Added / Starting ...  [9MB/256MB 3%]
2024-10-29 16:36:23.988 INFO  i.g.d.d.f.c.ComplexPolyphaseChannelizerM2 - Sample Rate [2400000.0] providing [96] channels at [25000.0] Hz each  [13MB/256MB 5%]
2024-10-29 16:36:24.143 INFO  i.g.d.s.t.m.TunerManager - LibUsb Hotplug event notification Is Not Supported on this platform.  [14MB/256MB 5%]
2024-10-29 16:36:24.146 INFO  i.g.d.s.t.s.api.SDRplay - API library is not available - unsupported version: 0.0  [14MB/256MB 5%]
2024-10-29 16:36:24.150 INFO  i.g.d.s.t.m.TunerManager - Discovered [2] recording tuners  [14MB/256MB 5%]
2024-10-29 16:36:24.151 INFO  i.g.d.s.t.m.TunerManager - Tuner Added: Recording [C:\Users\Radio\Downloads\sdrpp_windows_x64 (1)\sdrpp_windows_x64\recordings\baseband_852562500Hz_16-57-47_22-10-2024.wav]  [14MB/256MB 5%]
2024-10-29 16:36:24.152 INFO  i.g.d.s.t.m.TunerManager - Tuner Added: Recording [C:\Users\Radio\SDRTrunk\recordings\RTL2832_852562500_baseband_20241022_171854.wav]  [14MB/256MB 5%]
2024-10-29 16:36:24.157 INFO  i.g.d.s.SettingsManager - SettingsManager - loading settings file [C:\Users\Radio\SDRTrunk\settings\settings.xml]  [14MB/256MB 5%]
2024-10-29 16:36:24.324 INFO  i.g.d.m.DiagnosticMonitor - Diagnostic monitoring enabled running every 30 seconds  [11MB/34MB 33%]
2024-10-29 16:36:24.796 WARN  i.g.d.v.VectorUtilities - CPU supports maximum SIMD instructions of Species[float, 4, S_128_BIT]  [14MB/34MB 43%]
2024-10-29 16:36:25.896 INFO  i.g.d.p.PlaylistManager - Loading playlist [C:\Users\Radio\SDRTrunk\playlist\default.xml]  [17MB/34MB 51%]
2024-10-29 16:36:26.888 INFO  i.g.dsheirer.gui.SDRTrunk - starting main application gui  [52MB/120MB 43%]
2024-10-29 16:36:26.890 ERROR i.g.d.p.SystemProperties - Unable to load jar manifest - we're probably not running from a release jar  [52MB/120MB 43%]
2024-10-29 16:37:29.345 INFO  i.g.d.a.c.m.JmbeAudioModule - Loading JMBE library from [C:\Users\Radio\SDRTrunk\jmbe\jmbe-1.0.9.jar]  [154MB/208MB 74%]
2024-10-29 16:37:29.363 INFO  i.g.d.a.c.m.JmbeAudioModule - JMBE audio conversion library loaded: JMBE Audio Conversion Library v1.0.9  [155MB/208MB 74%]
2024-10-29 16:37:29.363 INFO  i.g.d.a.c.m.ImbeAudioModule - JMBE audio conversion library IMBE CODEC successfully loaded - P25-1 audio will be available  [155MB/208MB 74%]
2024-10-29 16:37:29.388 INFO  i.g.d.d.f.c.ComplexPolyphaseChannelizerM2 - Sample Rate [2400000.0] providing [96] channels at [25000.0] Hz each  [156MB/208MB 75%]
```

marshyonline commented 6 days ago

Hey @DSheirer, the code changes in this fix are now causing sdrtrunk to lock up and eventually run out of memory on my box. I do not have this issue on the current beta release, only after this fix. Here are the logs. I was unable to dump the diagnostic report this time around due to the hard lockup:

sdrtrunk_app.log

image

tadscottsmith commented 5 days ago

I appreciate the hard work, thank you! If there's a way to sponsor you or buy you a coffee, please let me know!

DSheirer commented 4 days ago

@marshyonline can you please test the latest nightly release again and let me know if you're still seeing the out of memory issue? I reverted one of the recent changes that might have contributed to what you're seeing. If not, we'll need to use either VisualVM or Java Flight Recorder to get a snapshot of the Java memory being used after sdrtrunk has been running for a while to see what's holding onto memory.

I just pushed the code change up to master. The nightly release should rebuild in the next couple minutes.
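Before reaching for VisualVM or Java Flight Recorder, heap growth can be watched with a periodic log line similar to the `[Used/Allocated PercentUsed%]` suffix in the application log earlier in this thread. The sketch below is an approximation under stated assumptions: it treats "allocated" as `Runtime.totalMemory()`, truncates to whole MB, and rounds the percentage, none of which is confirmed against sdrtrunk's own logging code. The class name `MemoryUsageLine` is hypothetical.

```java
/** Illustrative formatter; the rounding rules are assumptions, not sdrtrunk's. */
final class MemoryUsageLine {
    private static final long MB = 1024L * 1024L;

    /** Formats byte counts as "[<used>MB/<allocated>MB <percent>%]". */
    static String format(long usedBytes, long allocatedBytes) {
        long percent = Math.round(100.0 * usedBytes / allocatedBytes);
        return "[" + (usedBytes / MB) + "MB/" + (allocatedBytes / MB) + "MB " + percent + "%]";
    }

    /** Snapshot of the current JVM heap: allocated = total heap, used = total minus free. */
    static String current() {
        Runtime rt = Runtime.getRuntime();
        long allocated = rt.totalMemory();
        long used = allocated - rt.freeMemory();
        return format(used, allocated);
    }

    public static void main(String[] args) {
        System.out.println(current());
    }
}
```

If the used figure keeps climbing toward the maximum heap even after garbage collection, that is the point at which a heap dump becomes worth the effort.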

marshyonline commented 4 days ago

image

@DSheirer VisualVM is new to me, so I'm not sure how to track down what float that is. Any pointers?

DSheirer commented 4 days ago

Can you right-click on the heap dump in the Applications tree view and 'Save As ...' and then zip the folder and either post it here or send to me?

DSheirer commented 4 days ago

@marshyonline To clarify, this is a heap dump using the latest nightly build, or the beta 3 release?

marshyonline commented 4 days ago

Hi @DSheirer, I've sent you a file via Discord DM. This crash was on the current nightly.

marshyonline commented 4 days ago

While cleaning up this box, I noticed that the event log directory was 100G+. I removed the event logs folder, restarted the latest nightly, and have not had any of the above issues.

@DSheirer, I am wondering if the application had a fit due to the extremely large volume of logs I had. There might be a need for event log rolling?
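To make the log-rolling idea concrete, here is one possible shape for it: delete the oldest files in a directory until the total size drops under a cap. This is a sketch only. `EventLogPruner` and its `prune` method are hypothetical, it is not an existing sdrtrunk feature, and it only looks at regular files directly inside the given directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; not part of sdrtrunk. */
final class EventLogPruner {

    /**
     * Deletes the oldest regular files in dir (by last-modified time) until the
     * total size is at most maxBytes. Returns the number of files deleted.
     */
    static int prune(Path dir, long maxBytes) throws IOException {
        List<Path> files;
        try (Stream<Path> stream = Files.list(dir)) {
            files = stream
                    .filter(Files::isRegularFile)
                    .sorted(Comparator.comparingLong((Path p) -> p.toFile().lastModified()))
                    .collect(Collectors.toList());
        }

        long total = 0;
        for (Path p : files) {
            total += Files.size(p);
        }

        int deleted = 0;
        for (Path p : files) { // oldest first
            if (total <= maxBytes) {
                break;
            }
            total -= Files.size(p);
            Files.delete(p);
            deleted++;
        }
        return deleted;
    }
}
```

Something like `EventLogPruner.prune(Paths.get("C:\\Users\\Radio\\SDRTrunk\\event_logs"), 10L * 1024 * 1024 * 1024)` run at startup would cap the folder at roughly 10 GB; the path is the one shown in the application log above and the limit is an arbitrary example.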

This might also be related: https://github.com/DSheirer/sdrtrunk/issues/1350

image