Closed buritos closed 4 years ago
Turns out that we are hitting the bug that was discovered: https://travis-ci.org/vlingo/vlingo-wire/builds/635019329#L2492
So happy we found this!
Thanks for assigning the task @VaughnVernon ❤️
@buritos In general this looks like what we need. I will await your clarification on the
put(...)
issues as you see them.
@VaughnVernon, my theory is that all the copying can be eliminated with some rearrangement of the code that performs the release. My gut feeling is that the code is structured in this way (eager to release) because of the limitations of the previous pool implementation (failure to meet demand). Given the scalable nature of the new implementation (we pay the price for it), it is ok to hold on to a buffer instance for longer, meaning that it is ok to release it at the final consumer, as long as the producers do not attempt to reuse it in any way (they should always acquire another buffer from the pool).
@alexguzun confirms (although a bit concerned about the rippling effects of changing the API for ManagedOutboundChannel::write
). I believe we can work around that, preserving semantics of write(ByteBuffer)
by overloading write
to also accept ConusmerByteBuffer
, then implement write(ByteBuffer)
in terms of write(ConsumerByteBuffer)
, all of which are very local to wire.
Let me know if you think that the performance gains from such change are worth it to prioritize the changes now, or something that you would like to see in the future, or not at all :)
With the reverted test enablement is the bug is not fix, or you are working on this further?
Also, we definitely need to fix this. The copying is expensive and in this wire API is used in multiple components, or could be. I think the current changes and bug fixes you have made and that we benefit from immediately should be merged today. The tuning should be entered as an issue and fixed soon, but not immediately.
With the reverted test enablement is the bug is not fix, or you are working on this further?
I would like to try the tests again after eliminating catch(Throwable)
from all the places.
Also, we definitely need to fix this. The copying is expensive and in this wire API is used in multiple components, or could be. I think the current changes and bug fixes you have made and that we benefit from immediately should be merged today. The tuning should be entered as an issue and fixed soon, but not immediately.
I've made an issue. You are welcome to merge this in its current state. I haven't checked upstream for buffer release in all the places I removed it from here. However, it is low risk due to the nature of the new pool (allocates as necessary). The only issue is that leaks interfere with scaling down the pool after bursts.
The only issue is that leaks interfere with scaling down the pool after bursts.
Merge done. Could you elaborate why/how this happens?
The InUse counter is not decremented, tricking the pool to think that there is more demand than there actually is (leaked buffers are garbage collected).
Got it, thanks.
ByteBufferPool
;ConsumerByteBuffer
more than once.PoolAwareConsumerByteBuffer::release
is made idempotent and logs warning when it is called on the same instance more than once;ConsumerByteBuffer
when it is handed over to an async operation. The buffer is released on caught exceptions but not in afinally
block, since that would most likely run in parallel with the async operation.Possible issues with merging this PR for upstream:
ConsumerByteBuffer
leaks from the pool due to not being properly released in async operations implemented upstream. This is not too big of a deal since the new pool is unbounded and therefore leaked buffers will not cause an availability issue. However, leaking buffers interferes with the pool's ability to scale down and failure to release buffers will eventually cause memory issues (large footprint, more frequent GC). See inline comments for places where we should investigate.