Closed jazg closed 3 years ago
@jazg do you have an example demonstrating the problem? I understand this method needs some changes, but it would be great if I can see the exact problem first
Nvm, reproduced the exact error, will push fix
Ok, this was a tricky one. Two issues,
a) We never signaled the cond
b) The map we were using was keeping copies of the pending
struct rather than a pointer
-- Edit
To clarify, even after adding pending.cond.Broadcast()
to the signal
method, I noticed that even though pending.content
was being set, it was still nil
in the wait
method. Which makes sense, since pending
is a struct and pending.content
(byte slice) is a struct too, which means we were only assigning to a field of a copy and the goroutines were waiting on another copy where they never got to access the information. However, all goroutines waiting on cond
would still wake up on calling pending.cond.Broadcast()
because it is a pointer, and both structs point to the same sync.Cond
variable
@rahulghangas Fix looks pretty good to me (although I still need to test it out)! I think the issue with the context still exists – I can push a quick fix for this though (edit: pushed https://github.com/renproject/aw/commit/856a76f5b44b83ec08516227b2e0af9477f56b6f).
There seem to be at least two issues in the syncer:
https://github.com/renproject/aw/blob/ac3835b6789ad7260e74dce6be88cf9bdedbb2db/peer/sync.go#L110-L117
DidReceiveMessage
function is called and the pending content flag is signalled. Is there a reason we can't have something as simple as the following?