Move from ByteBuffer to Memory

leventov commented 7 years ago

The goal of this issue is to agree on basic design principles and expectations from this refactoring.

Goals

Speed up critical places in query processing by skipping bound checks, which ByteBuffer always performs when making any read or write. Bound checks could be hoisted outside the hot loops manually or enabled via assertions/static final boolean flag, that is effectively free with some precautions.
Get rid or 2 GB limitation, use long indexes. See #3743.
Don't make a lot of Object copies when a Memory object has to be "sliced", "converted to read-only", etc. See https://github.com/druid-io/druid/pull/3716#issuecomment-274658045.
+ Eliminate current Endian craziness in Druid by standardizing on native-endian. https://github.com/druid-io/druid/issues/3892#issuecomment-276235357

Design

Based on DataSketches's Memory.
~~It is moved to Druid source tree, not used as a dependency.~~ See https://github.com/druid-io/druid/issues/3892#issuecomment-276589924
Memory object is immutable. "position and limit", if needed, are passed along with Memory as two primitive longs.
Upon creation a mutable Memory object has a cached immutable "view" object (which implements read methods and throws exception from write methods), this object is always returned when "as read only" is needed.
Explicit close() or free() on Memory is possible, but not strictly required, there is a sun.misc.Cleaner safety net, as well as in DirectByteBuffer.
While transition from ByteBuffer to Memory is in progress (~~not expected to be done all at once, but subsystem by subsystem, class by class~~ see https://github.com/druid-io/druid/issues/3892#issuecomment-276548114), and also when we need to interop with external libraries which require ByteBuffer, conversion from Memory to ByteBuffer and vice versa is possible. Likely it requires to
- Use DirectByteBuffer-compatible format of Cleaner inside Memory
- Be able to access private DirectByteBuffer constructors via MagicAccessorImpl.
+ "Global" (immutable) Memory's bounds are checked optionally via assertions/guarded by static final boolean flag, "local" position and limit are check explicitly manually, with helper methods or versions of read and write methods of Memory, which accept read/write position and "local" limits. https://github.com/druid-io/druid/issues/3892#issuecomment-276185704

Objections, additions, corrections, questions are welcome. @leerho @cheddar @weijietong @akashdw @himanshug @fjy @niketh

leerho commented 7 years ago

How about copyTo. In BB get() is used for primitives and getXXXArray() for getting into your local context. Neither get() or put() make sense here as you could be copying bytes out of the JVM entirely.

Buffers and endianness will be quite a bit of work. Ultimately, I'd like to create a set of pragmas and a generator so that they can all be created automatically. If I only start with a few methods, could you give me a list of which ones to start with?

leventov commented 7 years ago

copyTo() is good.

Actually for starting refactoring implementation is not needed, so endianness could be delayed, just get/put methods should be present (for Memory and Buffer), maybe with UnsupportedOperationException impl.

leerho commented 7 years ago

@leventov

wrap(ByteBuffer) should inherit endianness from the buffer.

Until I figure out how exactly I want to do endianness, the current Memory is only NE. So, for now, attempting to wrap a BE BB is an error, which it now checks.

leerho commented 7 years ago

@leventov @niketh

Consolidating Memory / MemoryImpl, WritableMemory / WritableMemoryImpl.

I haven't had any real reason to do this until now. However, after discussing with @niketh some of the issues he had to address when trying to implement the current DataSketches Memory into Druid I learned about some additional capabilities that would have been very helpful. One very important capability was:

A byte-level compareTo(other) or static compare( a, b ) for entire blocks of memory.

This enables a byte ordering on objects independent of datatype, which Druid uses a lot. Because of our multiple Impls, I don't want to have to generate all the combinations of compare(Memory, Memory), compare(Memory, WritableMemory), etc.

So here returns a root class that only knows about bytes, call it BaseBytes. It would have one field, MemoryState (which we could rename as BaseState). Both Memory, WritableMemory, Buffer, WritableBuffer (which now may as well be impls) would extend BaseBytes.

BaseBytes would have static methods that would just do byte operations, such as compare(BaseBytes a, BaseBytes b), copy( a, b), or even the possibility of Transform(a, b) ... as long as the operation doesn't need any information about the structure or type of data. Because BaseState would also be at that level, it can check for RO state and would know the base offsets, etc. All read-only, strictly byte oriented methods could also be moved to BaseBytes.

Even though BaseBytes is a common root class, it is not possible to cast from Memory to WritableMemory via BaseBytes. This is not caught at compile time, but it is caught at runtime.

Thoughts?

leventov commented 7 years ago

@leerho What about Memory.compareTo(offset, len, Memory other, otherOffset, otherLen)? You don't need any special API method or implementation compare(Memory, WritableMemory), because WritableMemory extends Memory, so you can always use the same method.

leerho commented 7 years ago

@leventov Look again. WritableMemory does not extend Memory. This is the two impl model. Also, in your snippet you only need one len.

leventov commented 7 years ago

@leerho it means that new objects are required to be created where WritableMemory is passed to a method, accepting read-only Memory, that makes the situation with the third goal in the very first message in this thread even worse that it used to be with ByteBuffers. With ByteBuffers, API encourages to create asReadOnly() copies "out of fear", but it was not required. With what you propose, it is simply required. I disagree with this.

Actually I didn't notice in your proposition in this message: https://github.com/druid-io/druid/issues/3892#issuecomment-284964809 that WritableMemory doesn't extend Memory. I disagree with this.

When WritableMemory extends Memory and all methods, that are not supposed to write, accept Memory, it's impossible to accidentally violate read/write satefy, you should intentionally cast Memory to WritableMemory (and even this could be hardly prohibited with a simple Checkstyle rule). On the contrary, it's super-easy to violate bounds safety (off-by-ones, wrong primitive argument, etc.) And yet we agree to not make bound checks by default (only with assertions enabled).

Read/write safety IMO is not a problem at all, as soon as there is a read-only superclass Memory, that ByteBuffer API lacks. Making the system even "more read/write safe" doesn't deserve even little sacrifices.

Not to mention that "WritableMemory not extending Memory" creates a lot of problems with code sharing, starting from the method that we are discussing, compareTo(). And a lot more methods: copyTo(), hash code computation, compression, object deserialization, etc.

leventov commented 7 years ago

Also, in your snippet you only need one len.

Sometimes you want to compare byte sequences of different lengths, as well as it's not prohibited to compare Strings of different lengths.

leerho commented 7 years ago

@leventov @niketh

All fixed. One impl. Cast to Memory from WritableMemory works. Both compareTo and copyTo have been implemented. @niketh is working on the Buffer / WritableBuffer impls.

leventov commented 7 years ago

@leerho thanks!

I see you decided to name WritableMemory's static factory methods with "writable" prefix. This is because you are concerned about overloading of Memory's methods? In this case I suggest to move them to Memory, because WritableMemory.writableMap() is a needless repetition. It could be Memory.writableMap().

leerho commented 7 years ago

@leventov

Yes, I was getting overloading errors, but the reason was because I still had WritableResouceHandler and ResourceHandler as separate classes from the previous scheme. By making WritableResourceHandler extend ResourceHandler (parallel to WritableMemory extend Memory) it fixed the overloading problem.

I have removed all the prefixes of writable except for one: writableRegion(offset, capacity). This method works off of an instance instead of the class. The call is myMem.writableRegion(...), so there is no repetition.

We could also make this a static method and then the calls would be WritableMemory.region(WritableMemory mem, long offset, long capacity) and Memory.region(Memory mem, long offset, long capacity). Then there would be no "writable" prefixes on method names.

This way of creating a region would then be virtually the same as if you just passed (Memory, offset, capacity) to a client, and let them do their own positioning. The latter does not create a new object but the client has a view of the total parent capacity. The former creates a new object wrapper, but limits the client to what they can see. There are use cases for both.

I do prefer accessing the Writable "constructor-type" methods from the WritableMemory class.

leventov commented 7 years ago

@leerho Thanks.

The current form: mem.writableRegion() is OK to me.

leerho commented 7 years ago

@leventov @niketh @cheddar @gianm @weijietong @AshwinJay

Development of the new memory architecture has been migrated from experimental/memory4 to its own, more visible repository memory in DataSketches.

I have completed the central Memory and WritableMemory implementation and first wave of unit tests, with coverage at about 95%. I think (hope) the API is fairly stable. I will try to put together a set of bullet points summarizing the features of the API and why some of the choices were made. Meanwhile, I look forward to any comments or suggestions you have.

@niketh and I will soon be focusing on a positional extension to this work.

I want to thank @leventov for his thoughtful contributions for much of this design.

leerho commented 7 years ago

@leventov @niketh @cheddar @gianm @weijietong @AshwinJay

The Memory and Buffer hierarchies are checked in to master. @niketh is working on more unit tests, especially for the Buffer hierarchy. Hopefully we can have a release to Maven Central this week. Please look it over.

leventov commented 7 years ago

@leerho you mean here: https://github.com/DataSketches/memory?

I think we completely agree on API, except that it misses (?) byteOrdering functionality, that is needed for making refactoring of Druid. Because we need to support many old formats which are big-endian.

I didn't review internal implementation details because actual refactoring of Druid and/or DataSketches with the new API may demonstrate that the new API is problematic in some ways and needs to be reworked. So I'm going to review implementation details of Memory after Druid refactoring PR.

leerho commented 7 years ago

you mean here: https://github.com/DataSketches/memory?

Yes.

It was recommended by both @cheddar and @niketh that the byte-ordering functionality is not essential, and that it was more important to get this package out, so that folks can start working with it. I have no plans to implement byte-ordering.

@niketh already has submitted a PR based on the original memory API and has a real good understanding of the implementation issues, and will be the one using this new API to resubmit a new PR based on it. Certainly if he runs into issues with the API we will make adjustments.

leventov commented 7 years ago

It was recommended by both @cheddar and @niketh that the byte-ordering functionality is not essential, and that it was more important to get this package out, so that folks can start working with it. I have no plans to implement byte-ordering.

This is one of the things that actual attempt of refactoring of Druid should verify. So yes, we can try to start refactoring without byteOrdering functionality and see if it works well.

leventov commented 7 years ago

@leerho BTW since Druid has now officially moved to Java 8, should Memory still support Java 7? I see https://github.com/DataSketches/sketches-core/ is also Java 8.

If it shouldn't, Memory and WritableMemory could be made interfaces, because interfaces support static methods in Java 8. If you want. However should be checked that performance is the same.

leerho commented 7 years ago

@leventov Performance degrades quite a bit with interfaces, unfortunately. Now that I have it working as abstract hierarchies, I'm not sure I want to change it.

Buffer, etc. is now working and with unit test coverage at 96%.

leventov commented 7 years ago

Netty found the same issue with interfaces: http://netty.io/wiki/new-and-noteworthy-in-4.0.html#bytebuf-is-not-an-interface-but-an-abstract-class

leventov commented 7 years ago

@cheddar @leerho @niketh as far as I can reason from public sources, this project is under development. According to https://github.com/druid-io/druid/issues/3892#issuecomment-276548114, could the query processing part be migrated first, and then the serde part? The processing part blocks #4422.

b-slim commented 6 years ago

My 2cents concerns

leerho commented 6 years ago

@b-slim

The facts are as follows:

The current internal implementation of Memory is heavily dependent on the Unsafe class, as many high-performance libraries do. However, the architecture of Memory has been designed so that a non-Unsafe implementation could be created without impacting the API.

Druid has not moved to JDK 9 yet, nor have many of the other systems that currently use the library. So there hasn't been a great deal of pressure to move to JDK 9, yet. Nonetheless, when the time comes we will move to 9, 10, 11 or whatever.

My comment in the memory docs that you highlighted is simply the truth. We have not had the time, resources or the requirement to move to JDK 9 or 10. So it obviously hasn't been tested to work against JDK 9 or 10 either. I'm clearly not going to guarantee code that hasn't been tested. And I don't plan to start extensive testing until it becomes a requirement.

There have been only 2 people heavily involved in the design and implementation of the Memory repository in DataSketches, @leventov and myself. And both of us are very busy people.

If you understand the value in the Memory API as @leventov and I do, then how about contributing a helping hand?

b-slim commented 6 years ago

how about contributing a helping hand?

@leerho it will be a great honor and learning experience to help. Am wondering by help you are referring to move Memory to be jdk9 compatible or refactor Druid code base to start using The Memory lib?

leerho commented 6 years ago

You could help by start doing some testing w/ Memory:

1) Do some testing w/ JDK9. What are the blockers? My understanding is that JDK9 allows access to Unsafe, but we have to add some code to access it. How and where do changes need to be made? My concern is that we will have to have a special code base for JDK9 that cannot be used with JDK8. Please investigate. You will need to create your own jars from master as the latest code has not been released to Maven Central yet. Although I hope it will be soon. 2) What about JDK10? Same questions. 3) Once you have some answers as to where it breaks and what we need to do, we can strategize on the best way to go forward. Don't submit any PR's, it is too early. If you want to show us code we can look at code on your own repo.

As a longer range contribution, you could investigate VarHandles and MethodHandles. Will they help at all? I'm not convinced from what I have read, but I have not played with them yet. If they look promising, you could do some detailed timing characterization and find out how they perform.

Then there is the OpenJDK Panama Project and JEP 191. These have been on the sideline for years, but if it they were ever adopted it would make life so much simpler for us. Do some digging and find out where they are headed and when! Contact John Rose and Charles Nutter... ask them!

You could become our migration expert !! :)

stale[bot] commented 5 years ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

stale[bot] commented 5 years ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

stale[bot] commented 5 years ago

This issue is no longer marked as stale.

stale[bot] commented 4 years ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

gianm commented 4 years ago

Let's keep this open, it's still interesting and relevant. Some recent work includes #9308 and #9314. IMO, as a next step, it'd be interesting to look at switching VectorAggregators and their callers to a Memory-based API.

stale[bot] commented 4 years ago

This issue is no longer marked as stale.

github-actions[bot] commented 1 year ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

apache / druid

Move from ByteBuffer to Memory #3892