Open bubenheimer opened 9 months ago
Same problem with a (filled) default buffered Channel, or with `runBlocking()` + `send()`.
The thread was still hanging after 5 minutes.
I assume the issue occurs whenever the sender blocks because a channel `send()` suspends, and the channel is subsequently closed.
`Channel.cancel()` instead of `Channel.close()` appears to work around the issue.
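The `cancel()` workaround can be seen in a small standalone sketch (my own illustration, not code from this issue): a sender suspended in `send()` is resumed with a `CancellationException` when the channel is cancelled, whereas `close()` would have left it suspended.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.*

fun main() = runBlocking {
    val ch = Channel<Unit>() // rendezvous channel, no buffer
    val sender = launch {
        try {
            ch.send(Unit) // suspends: there is no receiver
        } catch (e: CancellationException) {
            println("send cancelled")
        }
    }
    yield()     // let the sender suspend inside send()
    ch.cancel() // unlike close(), cancel() fails the suspended sender
    sender.join()
    println("done")
}
```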
This is by-design behaviour that can be tricky to spot behind all the other machinery.
Basically, it boils down to the following example:
```kotlin
runBlocking {
    val ch = Channel<Unit>()
    launch {
        println("Before")
        ch.send(Unit)
        println("After")
    }
    yield()
    println("Closing")
    ch.close()
    println("Closed")
}
```
The `close` contract states:

> Immediately after invocation of this function, `isClosedForSend` starts returning `true`. However, `isClosedForReceive` on the side of `ReceiveChannel` starts returning `true` only after all previously sent elements are received.
In fact, we spent quite some time figuring out whether this behaviour should be the default one. These are the key takeaways:

- A suspended `send` coroutine does not differ from an element in the buffer -- it is just applied backpressure. Thus, `close` should not fail a coroutine that is suspended in `send`: the element still has to be received, and the caller of `close` only indicated that no new elements will be sent and no new work will be done.
- The approach has intrinsic downsides as well (e.g. code that `send`s concurrently and is not properly synchronized with `close` is inherently racy, `sendBlocking` might deadlock, and there are scenarios where one expects `send` to fail immediately), but in the current state of the library this behaviour cannot be changed, as that would constitute a major breaking change with vaguely defined impact (on a scale from "nobody notices" to "end-user applications start crashing at runtime").
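The first takeaway can be illustrated with a standalone sketch (my own, not code from this thread): after `close()`, a suspended sender behaves like a buffered element, `isClosedForReceive` stays `false` while its element is pending, and the sender only resumes once that element is actually received.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.*

fun main() = runBlocking {
    val ch = Channel<Int>() // rendezvous channel
    launch {
        ch.send(42) // suspends until the element is received
        println("send completed")
    }
    yield()    // let the sender suspend inside send()
    ch.close() // does NOT fail the suspended sender
    println("isClosedForSend=${ch.isClosedForSend}")
    println("isClosedForReceive=${ch.isClosedForReceive}") // false: an element is still pending
    println("received=${ch.receive()}") // receiving resumes the sender
}
```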
Regarding the particular `callbackFlow` use-case, I'd recommend following the advice from its documentation:

> Using [awaitClose] is mandatory in order to prevent memory leaks when the flow collection is cancelled, otherwise the callback may keep running even when the flow collector is already completed.
In your scenario, the machinery might look like this:

```kotlin
var callback: ((Int) -> Boolean)? = null // isDone
val thread = Thread {
    repeat(Int.MAX_VALUE) {
        if (callback!!(it)) {
            return@Thread
        }
    }
}
val flow = callbackFlow {
    callback = {
        println("Sending $it")
        try {
            trySendBlocking(it).exceptionOrNull()?.let { println(it) }
            println("Sent")
            false
        } catch (e: InterruptedException) {
            Thread.interrupted()
            println("Callback terminated")
            true
        } catch (t: Throwable) {
            println("Send fail: $t")
            throw t
        }
    }
    thread.start()
    awaitClose {
        thread.interrupt()
        thread.join()
    }
}
```
For more sophisticated use-cases we also have the `runInterruptible` function.
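As a sketch of how `runInterruptible` bridges coroutine cancellation to thread interruption (my own illustration; `Thread.sleep` stands in for whatever blocking call the real code would make):

```kotlin
import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch(Dispatchers.IO) {
        try {
            runInterruptible {
                // A blocking call; cancelling the coroutine
                // interrupts the thread executing this block.
                Thread.sleep(60_000)
            }
        } catch (e: CancellationException) {
            println("blocking call interrupted via cancellation")
        }
    }
    delay(100)          // let the blocking call start
    job.cancelAndJoin() // cancels the job, which interrupts the sleep
    println("done")
}
```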
@qwwdfsad I don't think this issue should be closed. Thank you for digging into the technical implementation details, but I don't think it addresses the usability issue of the API.
Perhaps I have not made the full extent of the problem clear. The current implementation of `callbackFlow` looks exceedingly dangerous in general, as it is likely to cause intermittent hangs, and it does not carry warnings to this end. A crash here would be great by comparison, but all it does is hang, making the problem invisible, which helps explain why it was not reported before.
The key ingredients necessary to create this problem are `callbackFlow()` with a buffer of finite size and `trySendBlocking()`. This is a common combination.
Cancelling the channel instead of closing it does not seem to cause this issue. There may be reasons not to do this; I'm not sure. Otherwise it would seem appropriate to replace the current `callbackFlow()` `channel.close()` approach with something less dangerous and deprecate `callbackFlow()`.
Regarding the `Thread.interrupt()` workaround: isn't this possible here only because the caller has a great amount of knowledge about, and control over, the other thread? That seems rare when using `callbackFlow()`, which is meant for callbacks from legacy code. I generally use `callbackFlow()` with Android framework callbacks. Interrupting Android framework threads, or handling their interrupts myself, seems like a bad idea that is likely to cause other problems. In many cases I have no control over those threads at all.
I don't see how `runInterruptible()` would help. The typical use of `callbackFlow` just passes a value from a legacy callback to a channel.
In my actual use case I use `trySendBlocking()` and a small buffer to create backpressure, which is why I was able to isolate the problem. In other cases the result would be intermittent, random hangs and developer frustration.
@qwwdfsad one question about another possible workaround, based on your technical analysis: it sounds like it would help to retrieve all remaining elements from the channel in `awaitClose()` until I see `isClosedForReceive`. Would this be guaranteed to unblock the sender? If so, it may be the best workaround in my case.
Edit: I see that I only get a `SendChannel` in `callbackFlow`, so this is not possible to do as an API user, and I cannot cancel the channel either.
@qwwdfsad just to be clear, in actual usage I deregister the callback in `awaitClose()`. This does not typically imply that the underlying framework will attempt to interrupt blocked callback threads; it usually just discontinues future use of the callback.
Thanks for pursuing this. I'm reopening this and will answer later
Describe the bug
`SendChannel.trySendBlocking()` hangs the calling thread indefinitely when an unbuffered `callbackFlow` channel closes. Instead, I expect `trySendBlocking()` to fail with an appropriate channel-related exception in a timely fashion once the channel closes.
Coroutines version: 1.7.3
Provide a Reproducer
Output below:
The output shows that the callback thread hangs after the channel closes until it is forcibly interrupted from another thread.
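The hang described above can also be reproduced outside `callbackFlow` with a bare rendezvous `Channel`; the following is my own minimal approximation of the reported scenario, not the reporter's actual code: the thread blocked in `trySendBlocking()` stays blocked after `close()` and is freed only by an interrupt.

```kotlin
import kotlinx.coroutines.channels.*

fun main() {
    val ch = Channel<Int>() // rendezvous, like an unbuffered callbackFlow channel
    val sender = Thread {
        try {
            ch.trySendBlocking(1) // blocks: no receiver ever arrives
            println("send returned")
        } catch (e: InterruptedException) {
            println("sender interrupted")
        }
    }
    sender.start()
    Thread.sleep(200) // let the sender block inside trySendBlocking()
    ch.close()        // close() leaves the blocked sender hanging
    Thread.sleep(200)
    println("alive after close: ${sender.isAlive}")
    sender.interrupt() // only a forcible interrupt frees the thread
    sender.join()
    println("alive after interrupt: ${sender.isAlive}")
}
```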