ChristopheJacquet / RdsSurveyor

Multi-platform Radio Data System decoder
http://rds-surveyor.jacquet.xyz/
36 stars 10 forks source link

Clicking “Next station” when playing back a recorded file freezes RDS Surveyor #41

Closed mvglasow closed 3 years ago

mvglasow commented 3 years ago

Steps to reproduce:

  1. Grab any file in RDS Spy format with samples from multiple stations (or assemble one).
  2. Play back the file in RDS Surveyor (either with -infile or selecting it from the GUI)
  3. Wait for the data from the first station to be processed.
  4. Click “Next Station”.

Expected result:

The next hunk of RDS data is processed and data appears in the app window.

Actual result:

RDS Surveyor freezes completely.

Versions used:

88d0bb0, as well as an earlier one, on Ubuntu 20.04, OpenJDK 8.

Additional information:

I suspect this is a long-standing thread sync issue, which has started to manifest only now because of some external change, but has been in the code for very long.

This occurs regardless of whether “Simulate real time” is selected or not. I have only recently upgraded to Ubuntu 20.04 (from 18.04)—while I have used RDS Surveyor only infrequently since then, I have not noticed this issue on 18.04 (I did have sporadic freezes but cannot remember if they were related). On 20.04 I can reliably reproduce the bug.

The bug has so far only occurred with data read from a file (I have only tried HexFileGroupReader). When reading live input from a RTL2832U or a Si470x tuner, I can listen to the radio, view RDS data and change stations.

When RDS Surveyor is frozen, it does not seem to generate any significant processor load—this appears to be a deadlock rather than an endless loop, probably a thread sync issue.

The same thing happens when I run RDS Surveyor from inside Eclipse—either using “Run” or “Debug”. Call stack while running:

Daemon Thread [Timer-0] (Running)   
Thread [AWT-EventQueue-1] (Running) 
Daemon Thread [Thread-2] (Running)  
Daemon Thread [HSQLDB Timer @4fb64261] (Running)    
Thread [RDS-Worker] (Running)   
Thread [RDSSurveyor-DumpDisplay-updater] (Running)  
Thread [RDSSurveyor-MainWindow-updater] (Running)   
Thread [DestroyJavaVM] (Running)    

Call stack after suspending the VM:

Daemon Thread [Timer-0] (Suspended) 
    waiting for: TaskQueue  (id=58) 
    Object.wait(long) line: not available [native method]   
    TimerThread.mainLoop() line: 552    
    TimerThread.run() line: 505 
Thread [AWT-EventQueue-1] (Suspended)   
    waiting for: ODAPanel  (id=59)  
    ODAPanel.update() line: 37  
    MainWindow$3$1.run() line: 503  
    InvocationEvent.dispatch() line: 311    
    AtkWrapper$6(EventQueue).dispatchEventImpl(AWTEvent, Object) line: 758  
    EventQueue.access$500(EventQueue, AWTEvent, Object) line: 97    
    EventQueue$3.run() line: 709    
    EventQueue$3.run() line: 703    
    AccessController.doPrivileged(PrivilegedAction<T>, AccessControlContext) line: not available [native method]    
    ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(PrivilegedAction<T>, AccessControlContext, AccessControlContext) line: 74   
    AtkWrapper$6(EventQueue).dispatchEvent(AWTEvent) line: 728  
    AtkWrapper$6.dispatchEvent(AWTEvent) line: 705  
    EventDispatchThread.pumpOneEventForFilters(int) line: 205   
    EventDispatchThread.pumpEventsForFilter(int, Conditional, EventFilter) line: 116    
    EventDispatchThread.pumpEventsForHierarchy(int, Conditional, Component) line: 105   
    EventDispatchThread.pumpEvents(int, Conditional) line: 101  
    EventDispatchThread.pumpEvents(Conditional) line: 93    
    EventDispatchThread.run() line: 82  
Daemon Thread [Thread-2] (Suspended)    
    Unsafe.park(boolean, long) line: not available [native method]  
    LockSupport.park(Object) line: 175  
    FutureTask<V>.awaitDone(boolean, long) line: 429    
    FutureTask<V>.get() line: 191   
    AtkUtil.invokeInSwing(Callable<T>, T) line: 68  
    AtkText.get_character_count() line: 216 
Daemon Thread [HSQLDB Timer @4fb64261] (Suspended)  
    waiting for: HsqlTimer$TaskQueue  (id=57)   
    Object.wait(long) line: not available [native method]   
    HsqlTimer$TaskQueue.park(long) line: not available  
    HsqlTimer.nextTask() line: not available    
    HsqlTimer$TaskRunner.run() line: not available  
    Thread.run() line: 748  
Thread [RDS-Worker] (Suspended) 
Thread [RDSSurveyor-DumpDisplay-updater] (Suspended)    
    Thread.sleep(long) line: not available [native method]  
    DumpDisplay$2.run() line: 135   
Thread [RDSSurveyor-MainWindow-updater] (Suspended) 
    waiting for: MainWindow  (id=60)    
    MainWindow$3.run() line: 468    
Thread [DestroyJavaVM] (Suspended)  
Daemon System Thread [Signal Dispatcher] (Suspended)    
Daemon System Thread [Finalizer] (Suspended)    
    waiting for: ReferenceQueue$Lock  (id=62)   
    Object.wait(long) line: not available [native method]   
    ReferenceQueue<T>.remove(long) line: 144    
    ReferenceQueue<T>.remove() line: 165    
    Finalizer$FinalizerThread.run() line: 216   
Daemon System Thread [Reference Handler] (Suspended)    
    waiting for: Reference$Lock  (id=63)    
    Object.wait(long) line: not available [native method]   
    Reference$Lock(Object).wait() line: 502 
    Reference<T>.tryHandlePending(boolean) line: 191    
    Reference$ReferenceHandler.run() line: 153  
Daemon System Thread [Java2D Disposer] (Suspended)  
    waiting for: ReferenceQueue$Lock  (id=64)   
    Object.wait(long) line: not available [native method]   
    ReferenceQueue<T>.remove(long) line: 144    
    ReferenceQueue<T>.remove() line: 165    
    Disposer.run() line: 148    
    Thread.run() line: 748  
Daemon System Thread [AWT-XAWT] (Suspended) 
    XToolkit.waitForEvents(long) line: not available [native method]    
    XToolkit.run(boolean) line: 570 
    XToolkit.run() line: 534    
    Thread.run() line: 748  
System Thread [AWT-Shutdown] (Suspended)    
    waiting for: Object  (id=65)    
    Object.wait(long) line: not available [native method]   
    Object.wait() line: 502 
    AWTAutoShutdown.run() line: 295 
    Thread.run() line: 748  
Daemon System Thread [TimerQueue] (Suspended)   
    Unsafe.park(boolean, long) line: not available [native method]  
    LockSupport.park(Object) line: 175  
    AbstractQueuedSynchronizer$ConditionObject.await() line: 2039   
    DelayQueue<E>.take() line: 211  
    TimerQueue.run() line: 174  
    Thread.run() line: 748  
mvglasow commented 3 years ago

I see the following threads waiting for something:

Daemon Thread [Timer-0] (Suspended) 
    waiting for: TaskQueue  (id=58) 
Thread [AWT-EventQueue-1] (Suspended)   
    waiting for: ODAPanel  (id=59)  
    ODAPanel.update() line: 37  
    MainWindow$3$1.run() line: 503  
Daemon Thread [HSQLDB Timer @4fb64261] (Suspended)  
    waiting for: HsqlTimer$TaskQueue  (id=57)   
Thread [RDSSurveyor-MainWindow-updater] (Suspended) 
    waiting for: MainWindow  (id=60)    
    MainWindow$3.run() line: 468    
Daemon System Thread [Finalizer] (Suspended)    
    waiting for: ReferenceQueue$Lock  (id=62)   
Daemon System Thread [Reference Handler] (Suspended)    
    waiting for: Reference$Lock  (id=63)    
Daemon System Thread [Java2D Disposer] (Suspended)  
    waiting for: ReferenceQueue$Lock  (id=64)   
System Thread [AWT-Shutdown] (Suspended)    
    waiting for: Object  (id=65)    

The Daemon System Threads only showed up after suspending the VM, hence they are probably not relevant.

AWT-EventQueue-1 is stuck at the start of ODAPanel.update(), which is synchronized with the ODAPanel instance. This is part of a Runnable placed there via SwingUtilities.invokeLater(); the scheduling happens in RDSSurveyor-MainWindow-updater, inside a block synchronized to the MainWindow instance.

RDSSurveyor-MainWindow-updater is waiting for MainWindow.

Then there’s two timer threads waiting for their task queues—probably normal behavior for a timer when no task is currently running. AWT-Shutdown waiting for an object looks like AWT waiting for the main window to be closed before cleaning up and exiting—looks pretty regular to me.

On a rerun, AWT-EventQueue-1 is now waiting for MainWindow, not ODAPanel. Both are now held by the RDS-Worker thread. A call stack reveals that the worker thread invokes MainWindow#windowUpdaterVisitor(), which updates the application window with new RDS data on the calling thread. Upon visiting a StationTuned, it locks MainWindow, and in that block, it updates some data, including a call to ODAPanel#setStation() (which causes it to obtain a lock on ODAPanel as well). Similar situations could arise with RTPanel.

At this point I wondered if it makes sense to move these things into the AWT thread. I tried, and it seems to work—no more deadlocks. Wait for the PR…