Remora / Remora.Discord

A data-oriented C# Discord library, focused on high-performance concurrency and robust design.
GNU Lesser General Public License v3.0
246 stars 44 forks source link

[Bug]: ResponderDispatchService.DisposeAsync() can deadlock #305

Closed Cyberboss closed 1 year ago

Cyberboss commented 1 year ago

Description

The _finalizer Task relies on the _dispatcher Task closing the channel to be able to complete. But it's possible for this to never happen.

The channel completion call relies on several throws possible from the CancellationToken not happening.

The deadlock is usually avoided because the _finalizer Task also usually throws out before it gets to that point

All the lines listed can throw TaskCanceledExceptions. The deadlock occurs if any of the first group throw and none of the second group throw. At that point, the app will halt forever when it awaits the Reader's completion.

Steps to Reproduce

It's very difficult to reproduce. The only example I have is a dump of https://github.com/tgstation/tgstation-server deadlocked at this point. For security reasons (bot tokens and SQL credentials) I cannot share it publicly.

Expected Behavior

ResponderDispatchService.DisposeAsync() should complete eventually.

Current Behavior

ResponderDispatchService.DisposeAsync() can deadlock waiting for a channel completion event that never comes.

Library / Runtime Information

dotnet: 6.0 Remora.Discord: 2022.48.0 tgstation-server: 5.12.4 OS: Windows Server 2022

Cyberboss commented 1 year ago

Also the .Completion event won't fire if there are items still in the channel https://stackoverflow.com/a/66521303

VelvetToroyashi commented 1 year ago

Quite the peculiar set of issues, that's for sure. I've been hoping to rewrite mass swaths of Remora's codebase[1][2], so I suppose I'll definitely tack this onto the list; rewriting dispatch would certainly be a worthwhile investment based on my investigations[3]

1 #254 2 Remora.Commands PR 3 Discussion in the Discord

Cyberboss commented 1 year ago

This hacky reflection workaround was able to solve our issue: https://github.com/tgstation/tgstation-server/pull/1509

Nihlus commented 1 year ago

Interesting - thank you for the in-depth analysis. I've pushed a suggested fix to the deadlock-fix branch, which should alleviate the problem and fix another issue I uncovered. Please take a look when you have time.

Cyberboss commented 11 months ago

I've integrated this in my software and haven't heard the same bug report, so it looks to be working

Nihlus commented 11 months ago

Excellent!