apache / datasketches-memory

High performance native memory access for Java.
https://datasketches.apache.org
Apache License 2.0
114 stars 27 forks source link

Thoughts on "originMemory" #58

Closed leerho closed 6 years ago

leerho commented 6 years ago

Currently when asBuffer() is called it keeps a copy of the calling WritableMemoryImpl in the target WritableBufferImpl as a BaseWritableMemoryImpl originMemory. This seems a bit odd.

Thoughts?

leventov commented 6 years ago

It's impossible, because asBuffer() must create a new Buffer.

Buffer one = mem.asBuffer();
Buffer two = mem.asBuffer();

Must be two independent objects with independent positions and limits. Javadoc of asBuffer() specifies that.

This is why I don't quite like the name "asBuffer", because according to the Java method naming conventions it indeed gives an impression that the object, returned from this method, could be cached. It could be a source of confusion and programming mistakes. That is why I would prefer if this method was called "newBuffer", or "makeBuffer", or "createBuffer", or something like that.

leerho commented 6 years ago

How is it different from the ByteBuffer.asX() methods?

leventov commented 6 years ago

It's not different, but it is a defect of ByteBuffer's API as well. There are a lot of them in ByteBuffers, that was largely the motivation of creating the Memory library in the first place :)

leerho commented 6 years ago

How about "newBufferView()" or "createBufferView()" ?

What I don't like about newBuffer() is the association with new Buffer() which implies a new, virgin instance.

We would have to think about the reverse direction also: instead of "asMemory()", how about "getMemoryView()" ? This would especially make sense if we eliminated the Buffer.wrap(BB) static methods. Then the only way to get a Buffer would be to create a Memory first.

leventov commented 6 years ago

"newBufferView()" or "createBufferView()" sound OK to me. Another option is "asNewBuffer()".

I'm not sure there is any issue with "asMemory" because it in fact conforms to the Java naming conventions, it doesn't create (or caches) the Memory object.

leerho commented 6 years ago

... if we eliminated the Buffer.wrap(BB) static methods. Then the only way to get a Buffer would be to create a Memory first.

Any concerns about this?

leventov commented 6 years ago

I would keep those methods, because they save a fair bit of boilerplate for users, when they need to interop between ByteBuffer and Memory API. AFAIR I already used this method in Druid once.

leventov commented 6 years ago

What's the resolution about asBuffer naming?

leerho commented 6 years ago

I'm not sure there is any issue with "asMemory" because it in fact conforms to the Java naming conventions, it doesn't create (or caches) the Memory object.

If you directly create a Buffer via wrap(BB), then the method asMemory() will create a new Memory object with a view of this Buffer.

I have already removed all the other static methods of creating a Buffer. The only one remaining is wrap(BB). The "cost" would be to write Memory.wrap(BB).asBuffer() instead of Buffer.wrap(BB) (I am putting aside the exact naming issues for the moment).

Once we do that, then I would feel comfortable in removing all the fixed offset methods from Buffer, since the user would only have to write asMemory() to get access to the entire fixed offset API. This also means that the Buffer class would always have a valid reference to the originMemory.

Once we have done that, then the following naming (or similar) would be much clearer:

leventov commented 6 years ago

I'm OK with "asNewBufferView" and "getMemoryView", however I would prefer slightly asMemory(), and also I don't see the point of "View" suffixes.

Then, I would postpone non-native PR until positional get/put methods are removed in Buffer, because otherwise I need to implement them, although they are going to be removed soon.

leerho commented 6 years ago

It is not the relative “positional” methods that I would like to remove from Buffer, it the opposite, the fixed offset methods, because they are duplicates of what is offered in Memory.

I don't use Buffer much, but if you do and feel it is important to retain the duplicate functionality with Memory, speak your case. You still haven't said it is OK with you to remove the last static factory method: Buffer.map(BB).

leventov commented 6 years ago

I confused terms, by "positional" I meant "accepting a position", ie offset.

I suggested to keep wrap(): https://github.com/DataSketches/memory/issues/58#issuecomment-370186056. It's implementation could create an originMemory from the beginning and thus allow to make this field final.

leerho commented 6 years ago

I had a thought (a dangerous thing!) about a different way to shuffle the architecture of all these classes that may make some things easier: (This is just an idea and would need to be flushed out more!)

altmemoryhierarchy

On the left half we have our current simplified hierarchy. On the right half an alternate proposal. P = public, A = abstract, Itf = interface.

The significant change is making BaseBuffer, Buffer and WritableBuffer interfaces and allowing the Buffer Impl classes to directly inherit all the Memory methods.

The pros/cons of this new scheme would be:

OK you can tear it up :)

leerho commented 6 years ago

Then, I would postpone non-native PR until positional get/put methods are removed in Buffer, because otherwise I need to implement them, although they are going to be removed soon.

I have decided not to remove them now, so please go ahead with your PR.

If this new architecture can be made to work then all the fixed offset methods would be directly available anyway. When we do that, I will go through and remove what would be duplicate Overrides.

leventov commented 6 years ago

There are more problems this new architecture:

So I'm against this.

Although I don't like keeping absolute methods in Buffer, it's tolerable.