Closed quixoten closed 7 years ago
I've updated the implementation to remove the fail/rescue from ack timeout and nack handling. I also modified the behavior slightly so that we return immediately if the response is received before the ack. I don't think we care what happens to the ack when the response is delivered first.
Released as version 0.6.0.pre1
.
I was initially against adding a nack, but I think there are some benefits to it. Using a nack allows the client to backoff more intelligently and with smaller intervals. This PR adds support to the client, but does not add support to the server. Once all clients have been upgraded to this version, support for the server side can be added.
I'm wondering if it might make sense to remove ACK completely and just rely on NACKs and a single timeout. Doing so would remove remove the possibility for a request to be duplicated at the rpc layer. It would also remove the need to receive two responses for every request. On the other hand, it would no longer cover the case that all servers stop responding (if they're all OOM killed in close proximity for example). The ACK allows us to fail in 15 seconds rather than 60 seconds in that scenario.
RFC @abrandoned @film42