eclipse-platform / eclipse.platform.runtime

Eclipse Public License 2.0
6 stars 14 forks source link

UI freezes when a ThreadJob is waiting to run in the UI Thread #40

Closed laeubi closed 2 years ago

laeubi commented 2 years ago

There is an annoying problem in m2e see https://github.com/eclipse-m2e/m2e-core/issues/123 even more worse, when this happens one could get eclipse completely stuck:

I can see that the "main" thread is waiting forever to be notified (@iloveeclipse that's what I recently noted about using simple blocking primitives) what seem to never happen (I see no other ThreadJob in the dump).

@jukzi by the way this is a good example that shows that BusyIndicator.showWhile actually is blocking the UI thread see at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:74) in below stack-trace.

I have attached a full thread dump here freeze.txt it happened at

Version: 2022-06 M1 (4.24.0 M1) Build id: 20220414-1707

"main" #1 prio=6 os_prio=0 cpu=206488,83ms elapsed=2367,80s allocated=4660M defined_classes=16162 tid=0x00007f2380025800 nid=0x1e70 in Object.wait()  [0x00007f2387bca000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(java.base@11.0.8/Native Method)
    - waiting on <no object reference available>
    at org.eclipse.core.internal.jobs.ThreadJob.waitForRun(ThreadJob.java:324)
    - waiting to re-lock in wait() <0x00000000821828f8> (a java.lang.Object)
    at org.eclipse.core.internal.jobs.ThreadJob.joinRun(ThreadJob.java:208)
    at org.eclipse.core.internal.jobs.ImplicitJobs.begin(ImplicitJobs.java:95)
    at org.eclipse.core.internal.jobs.JobManager.beginRule(JobManager.java:311)
    at org.eclipse.core.internal.resources.WorkManager.checkIn(WorkManager.java:124)
    at org.eclipse.core.internal.resources.Workspace.prepareOperation(Workspace.java:2265)
    at org.eclipse.core.internal.resources.Workspace.run(Workspace.java:2310)
    at org.eclipse.core.internal.resources.Workspace.run(Workspace.java:2335)
    at org.eclipse.ltk.internal.core.refactoring.UndoableOperation2ChangeAdapter.executeChange(UndoableOperation2ChangeAdapter.java:351)
    at org.eclipse.ltk.internal.core.refactoring.UndoableOperation2ChangeAdapter.undo(UndoableOperation2ChangeAdapter.java:196)
    at org.eclipse.core.commands.operations.TriggeredOperations.undo(TriggeredOperations.java:214)
    at org.eclipse.core.commands.operations.DefaultOperationHistory.doUndo(DefaultOperationHistory.java:407)
    at org.eclipse.core.commands.operations.DefaultOperationHistory.undo(DefaultOperationHistory.java:1149)
    at org.eclipse.ui.operations.UndoActionHandler.runCommand(UndoActionHandler.java:86)
    at org.eclipse.ui.operations.OperationHistoryActionHandler.lambda$0(OperationHistoryActionHandler.java:288)
    at org.eclipse.ui.operations.OperationHistoryActionHandler$$Lambda$1860/0x00000001015a3840.run(Unknown Source)
    at org.eclipse.jface.operation.ModalContext.runInCurrentThread(ModalContext.java:434)
    at org.eclipse.jface.operation.ModalContext.run(ModalContext.java:352)
    at org.eclipse.jface.dialogs.ProgressMonitorDialog.run(ProgressMonitorDialog.java:470)
    at org.eclipse.ui.internal.operations.TimeTriggeredProgressMonitorDialog.access$1(TimeTriggeredProgressMonitorDialog.java:1)
    at org.eclipse.ui.internal.operations.TimeTriggeredProgressMonitorDialog.lambda$0(TimeTriggeredProgressMonitorDialog.java:155)
    at org.eclipse.ui.internal.operations.TimeTriggeredProgressMonitorDialog$$Lambda$1751/0x000000010173a840.run(Unknown Source)
    at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:74)
    at org.eclipse.ui.internal.operations.TimeTriggeredProgressMonitorDialog.run(TimeTriggeredProgressMonitorDialog.java:167)
    at org.eclipse.ui.operations.OperationHistoryActionHandler.run(OperationHistoryActionHandler.java:301)
    at org.eclipse.jface.action.Action.runWithEvent(Action.java:474)
    at org.eclipse.jface.commands.ActionHandler.execute(ActionHandler.java:121)
    at org.eclipse.ui.internal.handlers.E4HandlerProxy.execute(E4HandlerProxy.java:97)
    at jdk.internal.reflect.GeneratedMethodAccessor66.invoke(Unknown Source)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.8/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(java.base@11.0.8/Method.java:566)
    at org.eclipse.e4.core.internal.di.MethodRequestor.execute(MethodRequestor.java:58)
    at org.eclipse.e4.core.internal.di.InjectorImpl.invokeUsingClass(InjectorImpl.java:317)
    at org.eclipse.e4.core.internal.di.InjectorImpl.invoke(InjectorImpl.java:251)
    at org.eclipse.e4.core.contexts.ContextInjectionFactory.invoke(ContextInjectionFactory.java:173)
    at org.eclipse.e4.core.commands.internal.HandlerServiceHandler.execute(HandlerServiceHandler.java:156)
    at org.eclipse.core.commands.Command.executeWithChecks(Command.java:488)
    at org.eclipse.core.commands.ParameterizedCommand.executeWithChecks(ParameterizedCommand.java:487)
    at org.eclipse.e4.core.commands.internal.HandlerServiceImpl.executeHandler(HandlerServiceImpl.java:213)
    at org.eclipse.e4.ui.bindings.keys.KeyBindingDispatcher.executeCommand(KeyBindingDispatcher.java:308)
    at org.eclipse.e4.ui.bindings.keys.KeyBindingDispatcher.press(KeyBindingDispatcher.java:580)
    at org.eclipse.e4.ui.bindings.keys.KeyBindingDispatcher.processKeyEvent(KeyBindingDispatcher.java:647)
    at org.eclipse.e4.ui.bindings.keys.KeyBindingDispatcher.filterKeySequenceBindings(KeyBindingDispatcher.java:439)
    at org.eclipse.e4.ui.bindings.keys.KeyBindingDispatcher$KeyDownFilter.handleEvent(KeyBindingDispatcher.java:96)
    at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:89)
    at org.eclipse.swt.widgets.Display.filterEvent(Display.java:1924)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1528)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1555)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1538)
    at org.eclipse.swt.widgets.Widget.sendKeyEvent(Widget.java:1577)
    at org.eclipse.swt.widgets.Widget.gtk_key_press_event(Widget.java:937)
    at org.eclipse.swt.widgets.Control.gtk_key_press_event(Control.java:4014)
    at org.eclipse.swt.widgets.Composite.gtk_key_press_event(Composite.java:921)
    at org.eclipse.swt.widgets.Widget.windowProc(Widget.java:2434)
    at org.eclipse.swt.widgets.Control.windowProc(Control.java:6855)
    at org.eclipse.swt.widgets.Display.windowProc(Display.java:6170)
    at org.eclipse.swt.internal.gtk3.GTK3.gtk_main_do_event(Native Method)
    at org.eclipse.swt.widgets.Display.eventProc(Display.java:1552)
    at org.eclipse.swt.internal.gtk3.GTK3.gtk_main_iteration_do(Native Method)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:4530)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine$5.run(PartRenderingEngine.java:1155)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:338)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1046)
    at org.eclipse.e4.ui.internal.workbench.E4Workbench.createAndRunUI(E4Workbench.java:155)
    at org.eclipse.ui.internal.Workbench.lambda$3(Workbench.java:644)
    at org.eclipse.ui.internal.Workbench$$Lambda$156/0x00000001003c0040.run(Unknown Source)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:338)
    at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:551)
    at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:156)
    at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:152)
    at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:203)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:136)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:402)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:255)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.8/Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.8/NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.8/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(java.base@11.0.8/Method.java:566)
    at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:659)
    at org.eclipse.equinox.launcher.Main.basicRun(Main.java:596)
    at org.eclipse.equinox.launcher.Main.run(Main.java:1467)
    at org.eclipse.equinox.launcher.Main.main(Main.java:1440)
jukzi commented 2 years ago

it's not blocking. its a wait(250) =>will update UI at least every 250ms, see also the comments 3 lines below "This thread may be interrupted via two common scenarios.."

jukzi commented 2 years ago

actually this whole refactoring issue is just a duplicate of Bug 578871

laeubi commented 2 years ago

Whatever it does, my eclipse is stuck, if I try to use any UI control my OS says the application is not responding and the only option is to forcefully kill the application, so it seems it is stuck in this block:

if (state == Job.RUNNING && canBlock) {
    blockingJob.jobStateLock.wait();
} else if (state != Job.NONE) {
    blockingJob.jobStateLock.wait(250);
}
jukzi commented 2 years ago

Whatever it does, my eclipse is stuck,

yes, that why i demand to fix it: https://github.com/eclipse-jdt/eclipse.jdt.ui/pull/19

laeubi commented 2 years ago

it's not blocking. its a wait(250) =>will update UI at least every 250ms,

I do not see any code in this while loop that actually updates the UI, that is calling Display.readAndDispatch anywhere, can you direct me to the place where this happens?

jukzi commented 2 years ago

yea, thats like magic in the blocker handler. How about you test my proposed fix for your situation?

laeubi commented 2 years ago

yea, that's like magic in the blocker handler

Can you give a code reference to this magic?

jukzi commented 2 years ago

use the patch and set a breakpoint at the method you search for.

laeubi commented 2 years ago

just a duplicate of Bug 578871

I don't think this is related here. This is not an UI freeze, it simply never finish, the Job is just waiting forever (whether its timed or not does not matter here) in ThreadJob.waitForRun as at this time there are no other jobs, for what is it waiting for?

jukzi commented 2 years ago

You complain "The Progress View freezes" and on the other hand say "This is not an UI freeze". contradictory.

Regarding the endless build loop you may want to debug why the build is triggered again. maybe this commits helps: https://github.com/eclipse-platform/eclipse.platform.resources/pull/45

laeubi commented 2 years ago

You complain "The Progress View freezes" and on the other hand say "This is not an UI freeze". contradictory.

Just keep the context and it is not that 'contradictory' at all.

Bug 578871 states that "UNDO does not show Cancel Dialog but freezes till autobuild finished" and that's what is reported by Eclipse as "UIFreeze", the UI is unresponsive for some time and then you can work again.

What I described here is that the progress view is constantly performing tasks (most likely m2e fault), and the suddenly get stuck, it shows a progress of 50%, but nothing happens anymore and the main thread is waiting forever that this Job is about to run. The application is now freezed but never get working again (or I'm not patient).

Beside that you explained that the timed wait is actually there to not freezed but at least I can't perform any actions including closing the application. So maybe there is a bug in the code (independent vom JDT) but Bug 578871 just triggers/makes this behavior visible... Don't get me wrong, maybe this could be fixed by Bug 578871 but then the general problem remains and other code can get stuck here as well...

My problem is that one can't really reliable reproduce the issue, it just "happens" from time to time (especially ofte do not happen when creating a new workspace, vanishes when you refresh a certain project or appears when switching branches) so its really hard to debug and I can only hope giving some vague hints hoping the right things get fixed to it once vanish :-)

vogella commented 2 years ago

IIRC I have seen issues with projects dependencies wrongly calculated in the past. Unfortunately I forgot which tracing option was the best to idenfy these. Maybe build/cycle or build/needbuild?

image

vogella commented 2 years ago

See also here for the information how to enable this trace flags to your regular IDE. https://www.vogella.com/tutorials/EclipsePerformance/article.html#using-the-build-in-tracing-facilities-of-eclipse I typically pipe the log output of such an Eclipse into an file so that I can look at these later if needed.

jukzi commented 2 years ago

Please separate the Problems: 1.) UI freezes 2.) autobuild is repeating.

that 1 and 2 occurs on the same time only makes the problem "only" even worse. So which of the both do you want to track with this github issue? 1) is bug 578871 - as far as its during refactor while autobuild is running and 2) is probably not a platform bug

laeubi commented 2 years ago

Please separate the Problems: 1.) UI freezes 2.) autobuild is repeating.

3.) Job is never run

one and two does not occur at the same time. 1 occurs just if 2 is spanning many new tasks no idea if its also triggering autobuilds ...

I simply don't want the Job API to freeze (whatever the UI, a job) forever regardless what code triggered a job ... so to make this more concrete:

It seems the UI freezes when a ThreadJob is waiting to run.

If you can reproduce the issue with some other code as well fine, but that's nothing to solve by m2e or jdt.

jukzi commented 2 years ago

3.) Job is never run

But the stacktrace you posted is as in bug 578871. It not a bug but a feature that a bug is never run as if the job has a conflicting IScheduleRule. during autobuild the whole workspace i locked. there will be no other job with any resource rule started until autobuild finished

vogella commented 2 years ago

Maybe the job tracing helps:

image

laeubi commented 2 years ago

It not a bug but a feature that a bug is never run as if the job has a conflicting IScheduleRule

Alright, if thats the case fine,

here will be no other job with any resource rule started until autobuild finished

but I don't see any other jobs in the Ui (might be just hidden...) and I can't work with the UI anymore, that's the bug I'd like to describe here, even if the Job takes long/forever/... the rest of the application should just run and you mentioned some "magic" that do this, but I can't see any 'magic' happen here so either the 'magic' is broken or ThreadJob is preventing the 'magic' from being happen and that should be fixed.

laeubi commented 2 years ago

The following snippet can be used to reproduce the issue:

final ISchedulingRule rule = new TestRule();
CountDownLatch countdown = new CountDownLatch(1);
Job job = Job.create("Runs until notified", new ICoreRunnable() {

    @Override
    public void run(IProgressMonitor monitor) {
        monitor.beginTask("Just waiting....", 100);
        Job.getJobManager().beginRule(rule, null);
        try {
            countdown.await();
        } catch (InterruptedException e) {
        } finally {
            Job.getJobManager().endRule(rule);
            monitor.done();
        }
    }
});
job.schedule();
Display display = new Display();
Shell shell = new Shell(display);
shell.setLayout(new FillLayout());
shell.setText("Hello World");
shell.setSize(200, 200);
Button button = new Button(shell, SWT.PUSH);
button.setText("Schedule me");
button.addSelectionListener(new SelectionListener() {

    @Override
    public void widgetSelected(SelectionEvent e) {
        button.setEnabled(false);
        Job.getJobManager().beginRule(rule, null);
        countdown.countDown();
        Job.getJobManager().endRule(rule);
        button.setText("Done!");
    }

    @Override
    public void widgetDefaultSelected(SelectionEvent e) {

    }
});
shell.open();
while (!shell.isDisposed()) {
    while (display.readAndDispatch()) {
    }
    display.sleep();
}
display.dispose();

Steps to reproduce:

  1. Start the snippet in debug mode and you will se a simple UI with a button
  2. Add breakpoint at Display#readAndDispatch and eg resize the dialog or something and you should see the method is triggered.
  3. remove the breakpoint and click the button, the UI will freeze
  4. Put a breakpoint at Display#readAndDispatch (will never trigger)

Threaddump shows:

Thread [main] (Suspended)   
    waiting for: Object  (id=92)    
    Object.wait(long) line: not available [native method]   
    Object.wait() line: 328 
    ThreadJob.waitForRun(ThreadJob, IProgressMonitor, InternalJob) line: 322    
    ThreadJob.joinRun(ThreadJob, IProgressMonitor) line: 208    
    ImplicitJobs.begin(ISchedulingRule, IProgressMonitor, boolean) line: 95 
    JobManager.beginRule(ISchedulingRule, IProgressMonitor) line: 315   
    SyncExecWhileUIThreadWaitsForRuleTest$2.widgetSelected(SelectionEvent) line: 89 
    TypedListener.handleEvent(Event) line: 252  
    EventTable.sendEvent(Event) line: 89    
    Display.sendEvent(EventTable, Event) line: 5863 
    Button(Widget).sendEvent(Event) line: 1529  
    Display.runDeferredEvents() line: 5094  
    Display.readAndDispatch() line: 4546    

So this could be reproduced without any complex setup and proofs that there is no magic at all that drives the event-queue (or it is not part of the Job API, or I just can't find it) and the job api statement is true that:

f the rule conflicts with another rule currently running in another thread, this method blocks until there are no conflicting rules.

This means, calling beginRule in the UI thread will block this thread until the rule could be acquired.

The issue why JDT triggers this is most likely in OperationHistoryActionHandler.run because that method forces to be run in the UI thread and if operation is not IAdvancedUndoableOperation2 and return runInBackground=truewill execute the whole operation in the UI thread and thus those operation must:

  1. run fast
  2. do not lock anything or fork/join anything

to be safe.

jukzi commented 2 years ago

You can not expect magic to happen when you do not install the magic UI handlers.

iloveeclipse commented 2 years ago

JobManager.beginRule

Calling this from UI thread is invitation for freezes and deadlocks. Quote from javadoc:

Begins applying this rule in the calling thread. If the rule conflicts with another rule currently running in another thread, this method blocks until there are no conflicting rules.

So the snippet shows the code works as designed :-)

laeubi commented 2 years ago

As long as no one can give a actual reference to some kind of source code I can't try to investigate that, its all open source so whats the reason not simply posting a link here?

Beside that, I get the exactly same behavior and call stack as with my inital report, so I really suspect that there is anything more happening...

Calling this from UI thread is invitation for freezes and deadlocks.

I always stated thatbut was always told there is some "magic" that prevents this, I haven't found any such magic yet and can hardly guess how it should work... please take a look at the original report you see exactly that sequence of calls:

at org.eclipse.core.internal.jobs.ThreadJob.waitForRun(ThreadJob.java:324) at org.eclipse.core.internal.jobs.ThreadJob.joinRun(ThreadJob.java:208) at org.eclipse.core.internal.jobs.ImplicitJobs.begin(ImplicitJobs.java:95) at org.eclipse.core.internal.jobs.JobManager.beginRule(JobManager.java:311)

then down the line: org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:74) and then look at how BusyIndicator.showWhile is implemented and how TimeTriggeredProgressMonitorDialog call it there is absolutely no chance that the Ui Thread is ever running as I reproduced with my example in a more simple way, and as I noted recently in a review and it was deemed invalid because the "magic"....

jukzi commented 2 years ago

As long as no one can give a actual reference to some kind of source code I can't try to investigate that, its all open source so whats the reason not simply posting a link here?

It's so complicated that i do not exactly know how it works ("magic") so i can't explain. I posted links to solutions multiple times and you did not try.

laeubi commented 2 years ago

It's so complicated that i do not exactly know how it works ("magic")

So most probably there is no such magic.

e.g. here https://github.com/jukzi/eclipse.platform.ui/blob/329dce1797998f2bca6d9951fd336b08e0e120cc/bundles/org.eclipse.ui.workbench/Eclipse%20UI/org/eclipse/ui/operations/OperationHistoryActionHandler.java#L321-L323 the begin rule is wrapped and if run in the UI thread will exactly trigger this behavior, just from the specification of the JVM there is no chance for any magic happening unless we use byte-code instrumentation in the running code to hot-replace swt classes with some magic stuff.

you did not try

Feel free to enhance my example with anything that you think is required, the links you posted just try to circumvent some specific cases but don't change how it generally works (or not) there is nothing I can run out of the box ...

laeubi commented 2 years ago

I now have added the "magic" parts to my example just insert:

IJobManager manager = Job.getJobManager();
Display display = new Display();
UILockListener listener = new UILockListener(display);
manager.setLockListener(listener);
UISynchronizer synchronizer = new UISynchronizer(display, listener);
display.setSynchronizer(synchronizer);

still this don't prevent the lock as it only scopes with sync/async exec calls from withing locks but can't interfere already running code unless the code returns to the UI-Loop what is simply not happening here.

laeubi commented 2 years ago

I have now added a testcase here https://github.com/eclipse-platform/eclipse.platform.ui/pull/38 there is one existing test, that tests the sync/async exec working and the new one showing where it fails.

laeubi commented 2 years ago

I now further debugged the reproducer:

As long as one passes a null or NullProgressMonitor to the JobManager.beginRule everything works perfectly fine, because then the JobManager calls the ProgressProvider, what is a org.eclipse.ui.internal.progress.ProgressManager what in turn creates a EventLoopProgressMonitor that then drives the event queue.

But if not (as I suspect is common) and e.g. pass a SubMonitor (many calls do that) then this is not working and the monitor itself is used and the event queue is stuck and UI freezes.

So a possible fix would be to add a new method ProgressProvider#wrapMonitor(IProgressMonitor)that is called here and wraps any monitor in a EventLoopProgressMonitor.

iloveeclipse commented 2 years ago

@laeubi : not sure where exactly do you want wrap what, but EventLoopProgressMonitor should only be used if the thread that created the progress monitor is supposed to drive UI queue (== UI thread), see for example org.eclipse.ui.internal.progress.ProgressManager.getDefaultMonitor().

In Job framework we have NO idea about SWT threading specials.

laeubi commented 2 years ago

see for example org.eclipse.ui.internal.progress.ProgressManager.getDefaultMonitor()

Exactly that method is effectively called, thus if you wait for a rule in UI thread you get the EventLoopProgressMonitor else a plain NullProgressMonitor and everything works "like magic", except you pass a non null (progres) monitor...

In Job framework we have NO idea about SWT threading specials.

Yes that's why there is org.eclipse.core.runtime.jobs.ProgressProvider what ProgressManager implements so thats the place where a #wrapMonitor could be provided (that simply returns IProgressMonitor.nullSafe(IProgressMonitor) as a default, then ProgressManager can override that method and return a EventLoopProgressMonitor if called from UI thread.