tadscottsmith closed this issue 6 days ago
It appears this may be an expansion of a pretty old issue, #814. I will leave this open with the additional context that in this case calls are still being partially processed on the Teardown channel.
@tadscottsmith there is a new menu item under the 'File' menu called "Processing Diagnostic Report". This was added back in February 2024, so any nightly release or the 0.6.1-beta2 release should have the menu option.
The next time that you see a channel stuck in TEARDOWN state, can you click this menu item and then look in the Logs directory and send me the generated report?
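For readers unfamiliar with what a thread dump contains, the following is a minimal sketch using only the standard JDK management API. It is not sdrtrunk's report generator; the class name `ThreadDumpSketch` is hypothetical. It lists each live thread with its state, the lock it is waiting on (if any), and its stack, which is the kind of information that helps show where a channel's processing is blocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of sdrtrunk: a plain-JDK thread dump.
final class ThreadDumpSketch {

    /** Returns a text listing of every live thread: name, state, awaited lock, and stack frames. */
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Only request lock details when the running JVM supports them.
        ThreadInfo[] infos = bean.dumpAllThreads(
                bean.isObjectMonitorUsageSupported(),
                bean.isSynchronizerUsageSupported());

        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append('"')
              .append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Threads that stay in `BLOCKED` or `WAITING` on the same lock across two dumps taken a few seconds apart are usually the first place to look.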
I've been struggling with this issue now for some time. Attached are the two dumps.
20241029_002701_sdrtrunk_processing_diagnostic_report.log 20241029_002706_sdrtrunk_thread_dump.log
Here's an example from my system. I've found it's pretty easy to get channels stuck in this state by mass editing aliases. If I select a couple of thousand of them and try to set them all to mute or unmute, the playlist editor will hang and cause the teardown issue.
20241028_105708_sdrtrunk_thread_dump.log 20241028_105704_sdrtrunk_processing_diagnostic_report.log
I swapped my Channelizer Type from Polyphase to Heterodyne and I'm no longer seeing TEARDOWN issues. My only concern with Heterodyne is that I'm monitoring 3 sites in the same system, which can be quite busy, and I'm not sure how it will perform given the text descriptions of each Type.
@tadscottsmith @marshyonline I built a special version of the beta 2 release for Windows 10 x86-64 that you can access here on my Google Drive: https://drive.google.com/drive/folders/130eCl0Bv-G7RlsBMtIfiP9ZhdCmg_FpL?usp=sharing
If you don't want to use this version, you can check out the branch "1992-channel-stuck-in-teardown" and build your own version via the command line (gradle runtimeZipWindows).
Can you please run this version using the Polyphase Channelizer, and run it until one or more channels get stuck in teardown and then create the diagnostic report like you just did. Please post the diagnostic report(s) and the application log.
There's a chance that you won't get it to a state where channels are stuck in TEARDOWN because I enhanced the error handling in a couple places. If those enhancements are indeed catching the errors, you should see logging to that effect in the application log. In that case, please just send the application log so that I can see where it caught the error(s).
@DSheirer I ran the new build and the first 2 calls I had got stuck. Logs below for you, sir:
20241029_211737_sdrtrunk_processing_diagnostic_report.log 20241029_211742_sdrtrunk_thread_dump.log sdrtrunk_app.log
EDIT: I had another crack just to make sure I didn't break anything, but got the same result:
20241029_212736_sdrtrunk_processing_diagnostic_report.log 20241029_212740_sdrtrunk_thread_dump.log sdrtrunk_app.log
@marshyonline @tadscottsmith I made more changes and posted another custom build to my Google Drive. Can you repeat the process and post the logs? Channels may not get stuck now.
https://drive.google.com/drive/folders/130eCl0Bv-G7RlsBMtIfiP9ZhdCmg_FpL?usp=sharing
Winner Winner Chicken Dinner!
So far so good, no channels stuck in TEARDOWN. I will leave it running overnight and report back in the morning. Thank you!
@marshyonline if you have a chance, can you post your application log? I'm trying to figure out what process upstream was failing that caused this to happen in the first place.
From which build: the current beta release, Fix 1, or Fix 2?
I seem to have hit a snag with Fix 2. Everything has locked up, as per the screenshot.
20241029_230517_sdrtrunk_processing_diagnostic_report.log 20241029_230522_sdrtrunk_thread_dump.log sdrtrunk_app.log
Can you generate a diagnostic report or is the app fully locked up?
Can you check to see if a diagnostic report was auto-generated in the Logs directory?
To answer your recent question: the app log from the latest test build.
No, the application hadn't locked up, but it lost all its tuners and was no longer processing any data. I was able to generate the reports, which are attached below my screenshot :)
I have not been able to recreate the issue of channels stuck in TEARDOWN with your latest build. It's been running almost 6 hours with no issues.
@tadscottsmith can you send your app log once you're done? I added some logging and I'm curious if any of it triggered.
I am not seeing anything logged, but I'm not able to reproduce the stuck TEARDOWN state that was easily reproducible yesterday. Here are the logs from a recent run that I did alongside a CPU stress tester. I was able to get the Playlist Editor to completely hang for a long period, and could even see the waterfall stutter, but calls continued to clean themselves up and nothing got stuck in TEARDOWN.
The other instance that's been running almost 7 hours now ends with the same ComplexPolyphaseChannelizerM2 log entry.
```
2024-10-29 16:36:21.264 INFO i.g.d.icon.IconModel - loading icons file [C:\Users\Radio\SDRTrunk\settings\icons.xml] [14MB/256MB 5%]
2024-10-29 16:36:21.272 INFO i.g.d.icon.IconModel - Icons file not found at [C:\Users\Radio\SDRTrunk\settings\icons.xml] [15MB/256MB 6%]
2024-10-29 16:36:21.840 INFO i.g.d.log.ApplicationLog - Application Log File: C:\Users\Radio\SDRTrunk\logs\sdrtrunk_app.log [22MB/256MB 8%]
2024-10-29 16:36:21.937 INFO i.g.d.log.ApplicationLog - Failed to find build information. [8MB/256MB 3%]
2024-10-29 16:36:21.938 INFO i.g.d.log.ApplicationLog - [8MB/256MB 3%]
2024-10-29 16:36:21.938 INFO i.g.d.log.ApplicationLog - ******************************************************************* [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO i.g.d.log.ApplicationLog - **** sdrtrunk: a trunked radio and digital decoding application *** [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO i.g.d.log.ApplicationLog - **** website: https://github.com/dsheirer/sdrtrunk *** [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO i.g.d.log.ApplicationLog - ******************************************************************* [8MB/256MB 3%]
2024-10-29 16:36:21.939 INFO i.g.d.log.ApplicationLog - Memory Logging Format: [Used/Allocated PercentUsed%] [8MB/256MB 3%]
2024-10-29 16:36:21.940 INFO i.g.d.log.ApplicationLog - Host OS Name: Windows 10 [8MB/256MB 3%]
2024-10-29 16:36:21.941 INFO i.g.d.log.ApplicationLog - Host OS Arch: amd64 [8MB/256MB 3%]
2024-10-29 16:36:21.942 INFO i.g.d.log.ApplicationLog - Host OS Version: 10.0 [8MB/256MB 3%]
2024-10-29 16:36:21.943 INFO i.g.d.log.ApplicationLog - Host CPU Cores: 8 [8MB/256MB 3%]
2024-10-29 16:36:21.943 INFO i.g.d.log.ApplicationLog - Host Max Java Memory: 3 GB [8MB/256MB 3%]
2024-10-29 16:36:21.944 INFO i.g.d.log.ApplicationLog - Storage Directories: [8MB/256MB 3%]
2024-10-29 16:36:21.945 INFO i.g.d.log.ApplicationLog - Application Root: C:\Users\Radio\SDRTrunk [8MB/256MB 3%]
2024-10-29 16:36:21.945 INFO i.g.d.log.ApplicationLog - Application Log: C:\Users\Radio\SDRTrunk\logs [8MB/256MB 3%]
2024-10-29 16:36:21.946 INFO i.g.d.log.ApplicationLog - Event Log: C:\Users\Radio\SDRTrunk\event_logs [8MB/256MB 3%]
2024-10-29 16:36:21.948 INFO i.g.d.log.ApplicationLog - Playlist: C:\Users\Radio\SDRTrunk\playlist [8MB/256MB 3%]
2024-10-29 16:36:21.948 INFO i.g.d.log.ApplicationLog - Recordings: C:\Users\Radio\SDRTrunk\recordings [8MB/256MB 3%]
2024-10-29 16:36:21.982 INFO i.g.d.s.t.s.a.SDRPlayLibraryHelper - SDRPlay API native library not found at: C:\Program Files\SDRplay\API\x64\sdrplay_api [9MB/256MB 3%]
2024-10-29 16:36:22.002 INFO i.g.d.util.ThreadPool - Application thread pool created SCHEDULED and CACHED executors threads [9MB/256MB 3%]
2024-10-29 16:36:22.005 INFO i.g.d.p.SystemProperties - SystemProperties - loaded [C:\Users\Radio\SDRTrunk\SDRTrunk.properties] [9MB/256MB 3%]
2024-10-29 16:36:22.467 INFO i.g.d.s.t.m.TunerManager - Discovering tuners ... [8MB/256MB 3%]
2024-10-29 16:36:22.550 INFO i.g.d.s.t.m.TunerManager - LibUsb API Version: 1.0.262 [9MB/256MB 3%]
2024-10-29 16:36:22.551 INFO i.g.d.s.t.m.TunerManager - LibUsb Version: 1.0.22.11312 [9MB/256MB 3%]
2024-10-29 16:36:22.584 INFO i.g.d.s.t.m.TunerManager - LibUsb - discovered [11] potential usb devices [9MB/256MB 3%]
2024-10-29 16:36:22.592 INFO i.g.d.s.t.m.TunerManager - Discovered tuner at USB Bus [1] Port [1.4] Tuner Class [RTL-2832] [9MB/256MB 3%]
2024-10-29 16:36:22.598 INFO i.g.d.s.t.m.TunerManager - Tuner: USB Tuner - RTL-2832 USB Bus:1 Port:1.4 - Added / Starting ... [9MB/256MB 3%]
2024-10-29 16:36:23.988 INFO i.g.d.d.f.c.ComplexPolyphaseChannelizerM2 - Sample Rate [2400000.0] providing [96] channels at [25000.0] Hz each [13MB/256MB 5%]
2024-10-29 16:36:24.143 INFO i.g.d.s.t.m.TunerManager - LibUsb Hotplug event notification Is Not Supported on this platform. [14MB/256MB 5%]
2024-10-29 16:36:24.146 INFO i.g.d.s.t.s.api.SDRplay - API library is not available - unsupported version: 0.0 [14MB/256MB 5%]
2024-10-29 16:36:24.150 INFO i.g.d.s.t.m.TunerManager - Discovered [2] recording tuners [14MB/256MB 5%]
2024-10-29 16:36:24.151 INFO i.g.d.s.t.m.TunerManager - Tuner Added: Recording [C:\Users\Radio\Downloads\sdrpp_windows_x64 (1)\sdrpp_windows_x64\recordings\baseband_852562500Hz_16-57-47_22-10-2024.wav] [14MB/256MB 5%]
2024-10-29 16:36:24.152 INFO i.g.d.s.t.m.TunerManager - Tuner Added: Recording [C:\Users\Radio\SDRTrunk\recordings\RTL2832_852562500_baseband_20241022_171854.wav] [14MB/256MB 5%]
2024-10-29 16:36:24.157 INFO i.g.d.s.SettingsManager - SettingsManager - loading settings file [C:\Users\Radio\SDRTrunk\settings\settings.xml] [14MB/256MB 5%]
2024-10-29 16:36:24.324 INFO i.g.d.m.DiagnosticMonitor - Diagnostic monitoring enabled running every 30 seconds [11MB/34MB 33%]
2024-10-29 16:36:24.796 WARN i.g.d.v.VectorUtilities - CPU supports maximum SIMD instructions of Species[float, 4, S_128_BIT] [14MB/34MB 43%]
2024-10-29 16:36:25.896 INFO i.g.d.p.PlaylistManager - Loading playlist [C:\Users\Radio\SDRTrunk\playlist\default.xml] [17MB/34MB 51%]
2024-10-29 16:36:26.888 INFO i.g.dsheirer.gui.SDRTrunk - starting main application gui [52MB/120MB 43%]
2024-10-29 16:36:26.890 ERROR i.g.d.p.SystemProperties - Unable to load jar manifest - we're probably not running from a release jar [52MB/120MB 43%]
2024-10-29 16:37:29.345 INFO i.g.d.a.c.m.JmbeAudioModule - Loading JMBE library from [C:\Users\Radio\SDRTrunk\jmbe\jmbe-1.0.9.jar] [154MB/208MB 74%]
2024-10-29 16:37:29.363 INFO i.g.d.a.c.m.JmbeAudioModule - JMBE audio conversion library loaded: JMBE Audio Conversion Library v1.0.9 [155MB/208MB 74%]
2024-10-29 16:37:29.363 INFO i.g.d.a.c.m.ImbeAudioModule - JMBE audio conversion library IMBE CODEC successfully loaded - P25-1 audio will be available [155MB/208MB 74%]
2024-10-29 16:37:29.388 INFO i.g.d.d.f.c.ComplexPolyphaseChannelizerM2 - Sample Rate [2400000.0] providing [96] channels at [25000.0] Hz each [156MB/208MB 75%]```
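Each line in the log above ends with a memory suffix in the format the log itself announces: `[Used/Allocated PercentUsed%]`. When chasing a suspected memory leak, it can help to extract those values and watch how they trend over a long run. The following is a small sketch of such a parser; the class and field names are hypothetical and it is not part of sdrtrunk.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the trailing "[UsedMB/AllocatedMB Percent%]" suffix from a log line.
final class MemorySuffixParser {

    /** One memory reading taken from the end of a log line. */
    static final class MemorySample {
        final int usedMb;
        final int allocatedMb;
        final int percentUsed;

        MemorySample(int usedMb, int allocatedMb, int percentUsed) {
            this.usedMb = usedMb;
            this.allocatedMb = allocatedMb;
            this.percentUsed = percentUsed;
        }
    }

    // Anchored to the end of the line so bracketed file paths earlier in the line are ignored.
    private static final Pattern SUFFIX =
            Pattern.compile("\\[(\\d+)MB/(\\d+)MB (\\d+)%\\]\\s*$");

    /** Returns the memory reading at the end of the line, or empty if the line has none. */
    static Optional<MemorySample> parse(String logLine) {
        Matcher m = SUFFIX.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new MemorySample(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3))));
    }
}
```

Feeding every line of `sdrtrunk_app.log` through `parse` and plotting `usedMb` against the timestamp gives a quick view of whether used memory keeps climbing.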
Hey @DSheirer, the code changes in this fix are now causing sdrtrunk to lock up and eventually run out of memory on my box. I do not have this issue on the current beta release, only after this fix. Here are the logs. I was unable to generate the diagnostic report this time around due to the hard lockup:
I appreciate the hard work, thank you! If there's a way to sponsor you or buy you a coffee, please let me know!
@marshyonline can you please test the latest nightly release again and let me know if you're still seeing the out of memory issue? I reverted one of the recent changes that might have contributed to what you're seeing. If not, we'll need to use either VisualVM or Java Flight Recorder to get a snapshot of the Java memory being used after sdrtrunk has been running for a while to see what's holding onto memory.
I just pushed the code change up to master. The nightly release should rebuild in the next couple minutes.
@DSheirer VisualVM is new to me, so I'm not sure how to track down what float that is. Any pointers?
Can you right-click on the heap dump in the Applications tree view and 'Save As ...' and then zip the folder and either post it here or send to me?
@marshyonline To clarify, this is a heap dump using the latest nightly build, or the beta 3 release?
Hi @DSheirer, I've sent you a file via Discord DM. This crash was on the current nightly.
While cleaning up this box, I noticed that the event log directory was 100 GB+. I removed the event logs folder, restarted the latest nightly, and have not had any of the above issues.
@DSheirer, I am wondering if the application had a fit due to the extremely large volume of logs I had. There might be a need for event log rolling?
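As an illustration of what event log rolling could look like, here is a minimal sketch of a size-capped cleanup that deletes the oldest files first. It is an assumption-laden example, not an sdrtrunk feature: the class name `EventLogPruner`, the directory passed in, and the size cap are all hypothetical, and it only considers regular files directly inside the directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: keeps a log directory under a size cap by deleting the oldest files.
final class EventLogPruner {

    /**
     * Deletes the oldest regular files directly inside dir until their combined
     * size is at most maxBytes. Returns the number of files deleted.
     */
    static int prune(Path dir, long maxBytes) throws IOException {
        List<Path> files = new ArrayList<>();
        try (Stream<Path> stream = Files.list(dir)) {
            stream.filter(Files::isRegularFile).forEach(files::add);
        }

        // Oldest first, by last-modified time.
        files.sort(Comparator.comparingLong((Path p) -> p.toFile().lastModified()));

        long total = 0;
        for (Path p : files) {
            total += Files.size(p);
        }

        int deleted = 0;
        for (Path p : files) {
            if (total <= maxBytes) {
                break;
            }
            total -= Files.size(p);
            Files.delete(p);
            deleted++;
        }
        return deleted;
    }
}
```

A call such as `EventLogPruner.prune(Path.of("C:\\Users\\Radio\\SDRTrunk\\event_logs"), 10L * 1024 * 1024 * 1024)` would cap the directory at roughly 10 GB; the cap here is an arbitrary example value.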
This might also be related: https://github.com/DSheirer/sdrtrunk/issues/1350
sdrtrunk Version
Currently running commit 130e59f4c16189f4dbae011509108f80965f9928.
Describe the bug
Occasionally, likely due to temporary high resource utilization, a channel will get stuck in Teardown state. As far as I can tell, a channel stuck in this state will never release until the application is restarted. I have seen channels stuck in this state for hours, and even as new calls are received and processed on the channel, the state does not change. It appears that calls received on a channel in Teardown state are not recorded or broadcast.
Expected behavior
After a given time period, it would be nice if the application tried to tear down the channel again. If the application sees a new call on the channel that is in "Teardown" state, it should reset the state and begin recording the call.
Screenshots
Example of a channel in Teardown state, which is currently receiving and decoding a valid call. The call was not recorded or broadcast.
Additional context
130e59f4c16189f4dbae011509108f80965f9928
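The expected behavior described in this report can be sketched as a small state tracker. This is only an illustration of the request, not how sdrtrunk models channel state: `TeardownWatchdog`, its `State` enum, and all method names are hypothetical. Timestamps are passed in explicitly so the logic can be exercised without waiting on a real clock.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the requested behavior; none of these names come from sdrtrunk.
final class TeardownWatchdog {

    enum State { IDLE, ACTIVE, TEARDOWN }

    private final Duration teardownTimeout;
    private State state = State.IDLE;
    private Instant teardownStarted;

    TeardownWatchdog(Duration teardownTimeout) {
        this.teardownTimeout = teardownTimeout;
    }

    synchronized State state() {
        return state;
    }

    /** A new call always reactivates the channel, even if an earlier teardown never finished. */
    synchronized void onCallStart() {
        state = State.ACTIVE;
        teardownStarted = null;
    }

    /** The call ended and teardown has begun. */
    synchronized void onCallEnd(Instant now) {
        state = State.TEARDOWN;
        teardownStarted = now;
    }

    /** Normal path: teardown finished and the channel is free again. */
    synchronized void onTeardownComplete() {
        if (state == State.TEARDOWN) {
            state = State.IDLE;
            teardownStarted = null;
        }
    }

    /**
     * Periodic check. Returns true when teardown has been pending longer than the
     * timeout, meaning the caller should attempt teardown again. The timer restarts
     * so the next retry is only requested after another full timeout.
     */
    synchronized boolean shouldRetryTeardown(Instant now) {
        if (state != State.TEARDOWN) {
            return false;
        }
        if (Duration.between(teardownStarted, now).compareTo(teardownTimeout) > 0) {
            teardownStarted = now;
            return true;
        }
        return false;
    }
}
```

A scheduled task could call `shouldRetryTeardown(Instant.now())` every few seconds for each channel and re-run the teardown step whenever it returns true.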