bonsai-rx / bonsai

The compiler, IDE, and standard library for the Bonsai visual programming language for reactive systems
https://bonsai-rx.org
MIT License
141 stars 29 forks source link

Exceptions thrown can sometimes result in two error dialogs #1796

Open PathogenDavid opened 6 months ago

PathogenDavid commented 6 months ago

This came up as an unrelated bug discovered via https://github.com/bonsai-rx/bonsai/pull/1787

This happens due to the two HandleWorkflowError calls below:

https://github.com/bonsai-rx/bonsai/blob/88a2412f25c467a1ef3505477413900484518e49/Bonsai.Editor/EditorForm.cs#L1229-L1241

First the call on line 1231 invokes the main thread to display the error, then the exception is rethrown on line 1233. It's then caught a second time by the onError handler on line 1241.

If I remember correctly, during a more typical exception the first call will show a dialog but the second one will only highlight the node and put the exception in the status bar.

This difference comes down to the status of building in the first branch here:

https://github.com/bonsai-rx/bonsai/blob/88a2412f25c467a1ef3505477413900484518e49/Bonsai.Editor/EditorForm.cs#L1413-L1418

It is however worth noting that two dialogs will appear no matter what when the second branch is entered. (As with the original 1787 fix.) So even if this comes down to something related to building there's still the potential for issues.

The handle call on line 1231 was added by https://github.com/bonsai-rx/bonsai/commit/724b2e06467da7a4b29de82d706cd3d43e423ceb which was trying to fix a bug introduced by https://github.com/bonsai-rx/bonsai/commit/9c8e390ed8c2b82c492ed19563311f68c8de2ed5 for https://github.com/bonsai-rx/bonsai/issues/908

We theorized that potentially this fix was actually trying to fix what was actually a race condition in reading building by HandleWorkflowError, but there's not enough context to say for sure. (See https://github.com/bonsai-rx/bonsai/pull/1795)

We agreed it didn't make much sense to have two HandleWorkflowError calls here, but because the issue supposedly fixed by https://github.com/bonsai-rx/bonsai/commit/9c8e390ed8c2b82c492ed19563311f68c8de2ed5 is pretty serious (two error dialogs is practically benign compared to completely silent errors), we don't want to risk a regression here by reverting that commit blindly and decided this issue warrants further investigation.

glopesdev commented 6 months ago

@PathogenDavid interestingly, on my build and machine I only ever see one dialog, even though both code paths do get triggered.

Even more interesting, if I comment the first call site, I don't see a popup, which was the situation presumably fixed by 724b2e06467da7a4b29de82d706cd3d43e423ceb.

Digging deeper, it looks like we are seeing the effects of capturing the building state-variable, since the second call starts with building set to true but by the time the variable is actually checked inside the BeginInvoke call, it is already set to false.

glopesdev commented 6 months ago

I have identified one possible candidate for the race condition behavior, which is the use of ScheduledDisposable in the ShutdownSequence method:

https://github.com/bonsai-rx/bonsai/blob/88a2412f25c467a1ef3505477413900484518e49/Bonsai.Editor/EditorForm.cs#L1172

Effectively this means that the call to Dispose runs inside a BeginInvoke to the main dialog. Because both HandleWorkflowError and ShutdownSequence were queueing invokes, there is a race in the message queue. In my case the race seems to be deterministic, and meant that by calling HandleWorkflowError before ShutdownSequence I was able to trust that the first call to show the dialog would always come before the shutdown.

I'm not sure how in your case you were able to get the second call scheduled before the dispose, but in any case I think using the fix proposed in #1795 together with removal of the first call to HandleWorkflowError should resolve this completely.

PathogenDavid commented 5 months ago

interestingly, on my build and machine I only ever see one dialog, even though both code paths do get triggered.

This is because of the move from InvalidOperationException to WorkflowBuildException, as that alone fixed this issue indirectly.


race condition stuff

So looking at it again, the capture in https://github.com/bonsai-rx/bonsai/pull/1795 is actually not really the right thing to do here. It actually creates a race condition rather than preventing it.

When we were in our call I overlooked the fact that ShutdownSequence was being deferred to the UI thread as well. This actually means that capturing is causing a race condition between the UI thread setting building to false and the background thread capturing building in HandleWorkflowError.

As such capturing building is not the right thing to do here.

there is a race in the message queue. In my case the race seems to be deterministic

I checked the implementation of BeginInvoke here. Multiple BeginInvoke invocations scheduled by the same background thread will always be completed in the order they were submitted.

There's no race concern here, but is there a reason we're using BeginInvoke so aggressively? IMO fire and forget should be the exception, not the rule.

Hiding things in the ScheduledDisposable that is only sometimes disposed at certain places also isn't ideal IMO. It seems like ShutdownSequence should just be a method called from a catch or finally clause as needed based on the context.


As for this issue (the two error dialogs), I think the real problem here is caused these two facts:

  1. HandleWorkflowError is called twice (and we don't know why.)
  2. HandleWorkflowError will always show a dialog for non-workflow exceptions.

These two facts combine mean that non-workflow exceptions (as with the original implementation of https://github.com/bonsai-rx/bonsai/pull/1787) will always show two dialogs.

It might be easier to just handle the second fact for now and worry about the first one for Bonsai 3.0. I think EditorForm likely gets refactored to accomplish some of the 3.0 goals and refactoring probably properly fixes things anyway.

PathogenDavid commented 4 months ago

Found another case which appears to trigger this: https://github.com/bonsai-rx/gui/issues/29