Closed (linchuan4028 closed this 3 years ago)
I think this code might be the root cause.
If h.data is nil, the function returns early and h.mutex.Unlock() is never called:
func (h *singleConnectionHeap) Poll() (res *Connection) {
	h.mutex.Lock()
	// the heap has been cleaned up
	if h.data == nil {
		// problem: this early return leaves h.mutex locked
		return nil
	}
	// if heap is not empty
	if (h.tail != h.head) || h.full {
		res = h.data[h.head]
		h.data[h.head] = nil
		h.full = false
		if h.head == 0 {
			h.head = h.size - 1
		} else {
			h.head--
		}
	}
	h.mutex.Unlock()
	return res
}
Thanks for your PR. I will reject it because we have been deliberately avoiding defer in the hot path, but I will include the fix and release it today.
The fix was released in v3.1.1 today.
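For readers following along, here is a minimal sketch of how the early return in the Poll snippet quoted above could release the lock without using defer. It is not necessarily the exact change shipped in v3.1.1. The Connection type and the field types of singleConnectionHeap are simplified stand-ins assumed for illustration, and main builds the heap state by hand rather than through the real client API.

```go
package main

import (
	"fmt"
	"sync"
)

// Connection is a simplified stand-in for the client's connection type (assumption).
type Connection struct{ id int }

// singleConnectionHeap mirrors only the fields that Poll touches;
// the field types are assumed, not copied from the library.
type singleConnectionHeap struct {
	mutex sync.Mutex
	data  []*Connection
	head  int
	tail  int
	size  int
	full  bool
}

// Poll unlocks explicitly on every return path instead of using defer.
func (h *singleConnectionHeap) Poll() (res *Connection) {
	h.mutex.Lock()
	// the heap has been cleaned up
	if h.data == nil {
		h.mutex.Unlock() // release the lock before the early return
		return nil
	}
	// if heap is not empty
	if (h.tail != h.head) || h.full {
		res = h.data[h.head]
		h.data[h.head] = nil
		h.full = false
		if h.head == 0 {
			h.head = h.size - 1
		} else {
			h.head--
		}
	}
	h.mutex.Unlock()
	return res
}

func main() {
	c := &Connection{id: 1}
	// Hand-built state: one connection stored at index 1.
	h := &singleConnectionHeap{data: []*Connection{nil, c}, head: 1, tail: 0, size: 2}

	fmt.Println("got connection:", h.Poll() == c)
	fmt.Println("heap now empty:", h.Poll() == nil)

	h.data = nil // simulate the heap having been cleaned up
	fmt.Println("first poll after cleanup is nil:", h.Poll() == nil)
	// With the original code this second call would block forever on h.mutex.
	fmt.Println("second poll after cleanup is nil:", h.Poll() == nil)
}
```

The second Poll call after cleanup is the interesting one: with the original early return, it would wait on a mutex that nobody will ever unlock.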
Hi @khaf, I'm curious why defer is avoided here. What is the benefit? It seems that using defer would not increase how long the lock is held in this function.
Thanks in advance.
Defer has its own overhead, which is why I've avoided it in the hot path. As you can see, I have used it in other parts of the code. Keep in mind that some of this overhead may not show up on consumer CPUs. Our benchmarks were run on a dual-processor system with high core count (64) CPUs.
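As a rough way to check the defer overhead claim on your own hardware, the sketch below times a lock-protected increment with and without defer. It is only an illustration: the counter type and method names are hypothetical, the loop is single-threaded and uncontended, and the measured gap depends heavily on the Go version and the machine. Since Go 1.14, most uses of defer cost very little, so the difference may be small or invisible on current toolchains.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// counter is a hypothetical type used only for this measurement.
type counter struct {
	mu sync.Mutex
	n  int
}

// incExplicit unlocks with a direct call.
func (c *counter) incExplicit() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// incDefer unlocks through defer.
func (c *counter) incDefer() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

func main() {
	var a, d counter

	// testing.Benchmark picks the iteration count and reports ns/op.
	explicit := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			a.incExplicit()
		}
	})
	deferred := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			d.incDefer()
		}
	})

	fmt.Println("explicit unlock:", explicit)
	fmt.Println("deferred unlock:", deferred)
}
```

To get closer to the conditions described above, the same two methods would need to be measured under contention from many goroutines on a many-core machine.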
Below is the stack trace; it shows a goroutine blocked for 361 minutes. The issue was found in production when one Aerospike node went down. It did not recover even after the node came back.