Closed spkrka closed 4 years ago
This has been discussed in #65. As stated there, I am reluctant to remove the synchronize given that duplicate
is not documented as thread safe.
Is this actually causing issues for you in real world usage? The synchronize is very limited in scope.
Yes, we noticed significant blocks on this synchronized using JFR during heavy load.
It does not really matter if duplicate is thread-safe or not, since the state of the bytebuffer is effectively immutable.
I also created some benchmarks using JMH to compare sync vs non-sync solution: I have both the code and the results in this commit: https://github.com/spkrka/MaxMind-DB-Reader-java/commit/76b3be0f26f4ea8cb3b9a092d294d2281b0b4b74
Benchmarks ran on a 32-core machine with Java 11
Summary of benchmark data:
* Performance is the same when threads <= 4
* Sync version throughput goes down as number of concurrent threads increase >= 32
* Non-sync version throughput stabilizes when increasing threads beyond cores.
* At 32 concurrent threads, non-sync version supports about 40% more throughput
* At 32 concurrent threads, the 95-percentile of latency is:
60 microseconds per op for non-sync and
580 microseconds per op for sync
duplicate
makes no guarantees that it will not mutate the state of the original buffer in some way as it runs. This is what I mean by not being documented thread-safe. That said, it seems hard to imagine a reason why any possible implementation would do this.
Given that you see significant lock contention at just 32 threads, perhaps the very small risk of some potential implementation introducing a data race is worthwhile. In the linked issue, the person was using 5000 threads.
I suppose if you have a read-only-buffer, the java implementation must not be allowed to mutate the content of the buffer - but as you say, that would be a weird thing for the java implementation to do regardless of whether it is read-only or not.
My concern isn't about the contents of the backing buffer, which will not change. It is about the internal state of the ByteBuffer
itself. This state is mutable even on a read-only buffer.
Yes, I agree on that. I was mostly trying to eliminate the buffer content as a variable in all this.
We also know that once the ByteBuffer is created in the BufferHolder class, its state is never mutated.
I suppose if you want to be extra sure the runtime itself won't mutate it, you could explicitly clean the duplicate like this:
duplicate = buffer.duplicate();
duplicate.clear();
return duplicate;
(I think that's unnecessary though)
What do you think? Do you think this can be merged as is or would you prefer something that explicltly clears it too? I make another PR for that
See alternative PR: https://github.com/maxmind/MaxMind-DB-Reader-java/pull/72
I like the other version, it should be safe even if a future version mutates the state in the ByteBuffer.
Any chance one of these changes could be merged (and released, perhaps with patch version bump)?
That would help us avoid setting up an internal fork and I think other users could also benefit from it.
We will likely merge this or some other change to avoid the lock, but it might be a little while before we merge and release. At the very least, I'd like to review the implementation for the various possible ByteBuffer
types we could be holding (as it is an abstract class) to ensure that none of the current implementations do anything concerning.
I looked through the code and found the following: If you're using memory mode, the ByteBuffer will be of type HeapByteBufferR If you're using memory mapped mode, the ByteBuffer will be of type DirectByteBufferR
The implementation of duplicate() in HeapByteBufferR:
public ByteBuffer asReadOnlyBuffer() {
return new HeapByteBufferR(hb,
this.markValue(),
this.position(),
this.limit(),
this.capacity(),
offset);
}
So all it does is create a new object, referencing the same byte array and the same offset and copying all those int fields (mark, position, limit, capacity). Nothing is mutating the original ByteBuffer
For DirectByteBufferR it is slightly different - duplicate is only defined in DirectByteBuffer, so it will return a DirectByteBuffer object instead.
It looks like this:
public ByteBuffer duplicate() {
return new DirectByteBuffer(this,
this.markValue(),
this.position(),
this.limit(),
this.capacity(),
0);
}
// And the constructor
DirectByteBuffer(DirectBuffer db, // package-private
int mark, int pos, int lim, int cap,
int off)
{
super(mark, pos, lim, cap);
address = db.address() + off;
cleaner = null;
att = db;
}
There is nothing here that mutates the original ByteBuffer in any way.
1.4.0 has been released with the synchronized
removed. Thanks!
Thank you!
This synchronized call can cause contention. There is no other code that is synchronized on the BufferHolder The internal buffer never mutates its state (position, limit, and mark)
Thus it should be safe to remove this synchronized call.