Closed whyrusleeping closed 2 years ago
Alternatively, we can just add the streams session shutdown channel to the wait loop select.
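For illustration, here is a minimal sketch of what adding the shutdown channel to the wait loop's select could look like. The `stream` type, the `waitForWindow` helper, and the error values are simplified stand-ins made up for this example, not yamux's actual code; only the idea of selecting on a session shutdown channel comes from the comment above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical error values for this sketch.
var (
	ErrSessionShutdown = errors.New("session shutdown")
	ErrTimeout         = errors.New("timeout")
)

// stream is a simplified stand-in for a multiplexed stream.
type stream struct {
	sendNotifyCh chan struct{} // signaled when the writer may retry
	shutdownCh   chan struct{} // closed when the owning session shuts down
}

// waitForWindow blocks until the stream is notified, the session shuts
// down, or the timeout expires, whichever happens first.
func (s *stream) waitForWindow(timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-s.sendNotifyCh:
		return nil
	case <-s.shutdownCh:
		// Without this case, a writer whose notification was missed
		// could only be released by the timeout, if there is one.
		return ErrSessionShutdown
	case <-timer.C:
		return ErrTimeout
	}
}

func main() {
	s := &stream{
		sendNotifyCh: make(chan struct{}, 1),
		shutdownCh:   make(chan struct{}),
	}
	close(s.shutdownCh) // simulate the session dying
	fmt.Println(s.waitForWindow(time.Second)) // prints: session shutdown
}
```

Because a closed channel is always ready to receive, every writer parked in the select is released as soon as the session shuts down, regardless of whether it ever gets a notification.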
I'm more convinced that this is an issue with a node's internet connection dropping. When I see this failure, multiple different yamux connections hang at the same time, which suggests to me that some external trigger is causing the issue.
Any updates on this? Why is this not merged?
@r0l1 It's unclear whether it solves any problem.
Given the age of this issue, I'm going to go ahead and close it. #46 attempts to handle a situation which is already handled, namely that a session that dies force-closes all its streams, so there shouldn't be a hang from that.
I've been having some issues with yamux writes hanging lately. See goroutine 14008165 in: https://ipfs.io/ipfs/QmQn8YDTHWJzF7GVKUKanBzoxbzd5jQpwsgM2PfFVTKuGR
I think it's also correlated with the internet connection of the machine it's running on dropping during a write. Here's what I suspect is happening:
WAIT code (where we see the write being stuck in the linked stack trace)
I'm not sure if the above is exactly what's happening, but I'm quite confident that if we somehow ended up in the write wait loop after the stream has been closed, it's possible that the sendNotifyCh signal got missed and we will block forever. To address that possibility, I think that we should close the notify channels when the streams get closed, so that they are always ready to receive on.
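As a rough sketch of that proposal, the example below closes the notify channel when the stream closes, so a waiter that missed its signal still returns. The `stream` type and its `notify`, `close`, and `wait` methods are hypothetical and simplified, not yamux's real implementation. The mutex and the `closed` flag are there because sending on a closed channel panics in Go.

```go
package main

import (
	"fmt"
	"sync"
)

// stream is a simplified, hypothetical stand-in for a multiplexed stream.
type stream struct {
	mu       sync.Mutex
	closed   bool
	notifyCh chan struct{} // buffered with capacity 1, closed with the stream
}

func newStream() *stream {
	return &stream{notifyCh: make(chan struct{}, 1)}
}

// notify wakes a waiting writer without blocking. It does nothing once
// the stream is closed, since sending on a closed channel would panic.
func (s *stream) notify() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return
	}
	select {
	case s.notifyCh <- struct{}{}:
	default: // a notification is already pending
	}
}

// close marks the stream closed and closes the notify channel so that
// every current and future receive on it returns immediately.
func (s *stream) close() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return
	}
	s.closed = true
	close(s.notifyCh)
}

// wait blocks until a notification arrives or the stream is closed.
// It returns false when it was woken by the close. A notification that
// was buffered before the close is still delivered first.
func (s *stream) wait() bool {
	_, ok := <-s.notifyCh
	return ok
}

func main() {
	s := newStream()
	s.notify()
	fmt.Println(s.wait()) // prints: true
	s.close()
	s.notify()            // safe no-op after close
	fmt.Println(s.wait()) // prints: false
}
```

With this shape, a writer that enters the wait loop after the stream has already been closed cannot block forever, because the receive on the closed channel succeeds immediately.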
cc @slackpad