xerial / larray

Large off-heap arrays and mmap files for Scala and Java
Apache License 2.0

Support Java 17 #75

Closed gortiz closed 1 year ago

gortiz commented 1 year ago

Apache Pinot uses this library to map large files into memory. Because LArray doesn't support Java 17, Apache Pinot cannot run on modern JVMs.

We have been thinking about moving to a more modern implementation of 64-bit buffers (like Chronicle Bytes) or waiting until Project Panama is fully supported, but neither alternative is ideal.

As far as I know, the reason this library cannot be used with Java 17 is that the DirectByteBuffer constructor has changed (a new parameter related to Project Panama was added). I recently found that @xerial already updated similar code in another library (link).

Would it be possible to apply the same trick here and release a new version of LArray?

See https://github.com/apache/pinot/issues/9162
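
To check which constructor signatures the running JVM actually exposes (the change described above), a small reflective probe can help. This is only a sketch: `DirectBufferCtorProbe` and `signatures` are hypothetical names, not part of LArray or msgpack-java. It only lists the declared constructors of `java.nio.DirectByteBuffer` and never invokes them, so it should not need any `--add-opens` flag.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class DirectBufferCtorProbe {
    private DirectBufferCtorProbe() {}

    // Returns one string per declared constructor of java.nio.DirectByteBuffer,
    // or an empty list if the class cannot be found on this JVM.
    static List<String> signatures() {
        List<String> out = new ArrayList<>();
        try {
            Class<?> cls = Class.forName("java.nio.DirectByteBuffer");
            for (Constructor<?> c : cls.getDeclaredConstructors()) {
                out.add(c.toString());
            }
        } catch (ClassNotFoundException e) {
            // Leave the list empty; callers can treat that as "unknown JVM layout".
        }
        return out;
    }

    public static void main(String[] args) {
        // Print the signatures so they can be compared across JVM versions.
        signatures().forEach(System.out::println);
    }
}
```

Running it on two different JDKs and comparing the output shows whether a hard-coded constructor lookup would still match.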

xerial commented 1 year ago

I didn't know LArray is used in Pinot. Thanks for the information. I'll take a look at #76.

xerial commented 1 year ago

FYI https://github.com/wvlet/airframe/tree/master/airframe-canvas is supposed to be a successor of LArray, but it still doesn't support wrapping DirectByteBuffer. As you found in my msgpack-java workaround, accessing native memory inside DirectBuffer will be tricky on Java 17.

To make a new release of LArray possible, I enabled Scala Steward for this repository to bring the project up to date.

gortiz commented 1 year ago

Pinot needs the ability to create ByteBuffer views on memory owned by LArray because we pass those views to other libraries (mainly RoaringBitmaps). The alternative would be either to copy large amounts of memory or to modify RoaringBitmaps to use an interface instead of ByteBuffer.
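
To illustrate the difference between a view and a copy, here is a minimal sketch that uses only standard `java.nio` (it does not touch LArray, and `BufferViewSketch` and `view` are hypothetical names). The returned buffer shares memory with its source, so writes through the view are visible to the owner and nothing is copied.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class BufferViewSketch {
    private BufferViewSketch() {}

    // Returns a zero-copy view of source[offset, offset + length).
    // The source buffer's own position and limit are left untouched.
    static ByteBuffer view(ByteBuffer source, int offset, int length) {
        ByteBuffer dup = source.duplicate(); // shares content, independent position/limit
        dup.position(offset);
        dup.limit(offset + length);
        return dup.slice();                  // index 0 of the slice maps to source[offset]
    }

    public static void main(String[] args) {
        ByteBuffer owner = ByteBuffer.allocateDirect(16);
        ByteBuffer v = view(owner, 4, 8);
        v.put(0, (byte) 42);                 // write through the view
        System.out.println(owner.get(4));    // read the same byte through the owner
    }
}
```

This only covers memory that is already a ByteBuffer; wrapping a raw native address is the part that depends on the DirectByteBuffer constructor.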

What we may be able to do is use another library (I was trying Chronicle Bytes) and then create these views with our own code. We cannot do that with LArray right now because UnsafeUtil holds both the reference to the sun.misc.Unsafe object and the DirectByteBuffer constructor, and it resolves both when the class is loaded. Since most LArray methods load UnsafeUtil to use the Unsafe object, they also trigger the lookup of the DirectByteBuffer constructor. As it cannot be found, loading UnsafeUtil fails, and that is a blocker even if we never actually create DirectByteBuffer views with LArray.
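
One way to decouple the two lookups is the lazy holder idiom, sketched below. This is not LArray's actual UnsafeUtil: `LazyUnsafeAccess` and its members are hypothetical, the Unsafe instance is fetched reflectively and typed as `Object` so the sketch compiles without referencing `sun.misc` types, and the constructor signature `(long, int, Object)` is an assumption about older JVMs. Each nested holder class is initialized only on first use, so a failed constructor lookup no longer prevents access to Unsafe.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

// Hypothetical sketch; this is NOT LArray's actual UnsafeUtil.
final class LazyUnsafeAccess {
    private LazyUnsafeAccess() {}

    // Initialized on the first call to unsafe(), independently of BufferCtorHolder.
    private static final class UnsafeHolder {
        static final Object UNSAFE = loadUnsafe();
    }

    // Initialized on the first call to canWrapAddress(). A failed lookup
    // yields null instead of an error during class initialization.
    private static final class BufferCtorHolder {
        static final Constructor<?> CTOR = findBufferCtor();
    }

    static Object unsafe() {
        return UnsafeHolder.UNSAFE;
    }

    static boolean canWrapAddress() {
        return BufferCtorHolder.CTOR != null;
    }

    private static Object loadUnsafe() {
        try {
            Field f = Class.forName("sun.misc.Unsafe").getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    private static Constructor<?> findBufferCtor() {
        try {
            // Assumption: the older (address, capacity, attachment) signature.
            Constructor<?> c = Class.forName("java.nio.DirectByteBuffer")
                    .getDeclaredConstructor(long.class, int.class, Object.class);
            c.setAccessible(true);
            return c;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null; // not found or not accessible on this JVM
        }
    }
}
```

With this layout, code that only needs `unsafe()` keeps working on a JVM where `canWrapAddress()` returns false.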

gortiz commented 1 year ago

BTW, I've opened #76 as a way to solve the problem. If you feel more comfortable using the code you included in msgpack-java, that sounds good to me. I just didn't want to copy and paste your code there myself as if it were mine ;)

xerial commented 1 year ago

While working on #77, I also found the memory-mapped file implementation needs to be updated to support Java 17.

gortiz commented 1 year ago

Do you think you will be able to generate a new version compatible with Java 17 or do you feel that it isn't going to be worth your time? Can I help you with something?

xerial commented 1 year ago

@gortiz If it's ok to remove mmap support, I can finish #77. Otherwise, we need a workaround for these mmap errors:

[error] /Users/leo/work/git/larray/larray-mmap/src/main/java/xerial/larray/mmap/MMapBuffer.java:98:1: cannot find symbol
[error]   symbol:   class JavaIOFileDescriptorAccess
[error]   location: package sun.misc
[error] sun.misc.JavaIOFileDescriptorAccess
[error] /Users/leo/work/git/larray/larray-mmap/src/main/java/xerial/larray/mmap/MMapBuffer.java:98:1: cannot find symbol
[error]   symbol:   class SharedSecrets
[error]   location: package sun.misc
[error] sun.misc.SharedSecrets
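
For comparison, a file can be mapped without any JDK-internal classes through the standard `FileChannel.map` API. The sketch below is not a replacement for LArray's MMapBuffer: `StandardMmapSketch`, `mapReadWrite` and `roundTrip` are hypothetical names, and a single standard mapping is limited to `Integer.MAX_VALUE` bytes, so larger files would need several mappings.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only.
final class StandardMmapSketch {
    private StandardMmapSketch() {}

    // Maps `length` bytes of `file` starting at `offset` in read-write mode.
    // The mapping stays valid after the channel is closed.
    static MappedByteBuffer mapReadWrite(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, offset, length);
        }
    }

    // Writes one byte through a mapping of a temp file and reads it back from disk.
    static byte roundTrip(byte value) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            MappedByteBuffer buf = mapReadWrite(file, 0L, 64);
            buf.put(0, value);
            buf.force(); // flush the mapped region to the file
            return Files.readAllBytes(file)[0];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The `roundTrip` helper doubles as a usage example: it maps a temporary file, writes through the mapping, and confirms the byte reached the file.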

gortiz commented 1 year ago

IIRC, we use xerial to do mmaps, so removing it would be dangerous. In any case, since other projects may use larray too, I guess it would be a bad idea to release a new version that breaks compatibility.

xerial commented 1 year ago

Ok. Thanks for the clarification.

So the development order should be:

gortiz commented 1 year ago

Don't worry. This is not a priority for us. I hope Project Panama will be stable in upcoming releases, so we will be able to use it as an official solution instead of hacking with Unsafe and method visibility.