jpype-project / jpype

JPype is cross language bridge to allow Python programs full access to Java class libraries.
http://www.jpype.org
Apache License 2.0

Reading from Java byte array with np.frombuffer #133

Closed caspervdw closed 4 years ago

caspervdw commented 9 years ago

I am trying to interface bioformats from python, as part of the PIMS project (https://github.com/soft-matter/pims/pull/144). My experience with JPype up to now is very good, but there is one issue that I hope to optimize.

Bioformats reads images into a JByte array, and separately I can read out the datatype, which can be any numeric type. Because we potentially have to read out a lot of them, I want this step to be as fast as possible.

So I read https://github.com/originell/jpype/issues/71 and https://github.com/originell/jpype/pull/73 and came up with a few options:

from __future__ import division
import numpy as np
import jpype
jpype.startJVM(jpype.getDefaultJVMPath())

nbytes = 2*1024**2
a = jpype.java.nio.ByteBuffer.allocate(nbytes).array()
_source_dtype = '>u4'

def convert_JArray1(a):
    arr = np.array(np.frombuffer(unicode(a), dtype=np.uint16), dtype=np.uint8)
    return np.frombuffer(buffer(arr), dtype=_source_dtype)

def convert_JArray2(a):    
    java_str = jpype.java.lang.String(a, 'ISO-8859-1')
    arr = np.array(np.frombuffer(java_str.toString(), dtype=np.uint16), dtype=np.byte)
    return np.frombuffer(buffer(arr), dtype=_source_dtype)

def convert_JArray3(a):
    return np.frombuffer(np.array(a, dtype=np.byte), dtype=_source_dtype)

%timeit convert_JArray1(a) # 100 loops, best of 3: 9.14 ms per loop
%timeit convert_JArray2(a) # 100 loops, best of 3: 7.9 ms per loop
%timeit convert_JArray3(a) # 1 loops, best of 3: 14 s per loop

My questions are:

I am using jpype 0.5.7 on py2.7. Thanks in advance for your response!

marscher commented 9 years ago

If you use the numpy interface to get buffers (e.g. byte[] arrays from the JVM), it should be fast.

In [4]: a = jpype.java.nio.ByteBuffer.allocate(2*1024**2).array()

In [5]: a
Out[5]: <jpype._jarray.byte[] at 0x7f50f00c4e10>

In [6]: %timeit a[:]
1000 loops, best of 3: 627 µs per loop

marscher commented 9 years ago

The third option is so prohibitively slow because a function is being called for every element in the buffer.
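
This effect can be sketched in pure numpy (no JVM required; the array size here is arbitrary): per-element access costs a Python-level call for every element, while a slice copy is a single bulk operation.

```python
import numpy as np

src = np.arange(1024, dtype=np.uint8)

def per_element(a):
    # one Python-level call per element: this is the slow path
    out = np.empty(len(a), dtype=a.dtype)
    for i in range(len(a)):
        out[i] = a[i]
    return out

def bulk(a):
    # a single C-level copy: this is the fast path
    return a[:].copy()

assert np.array_equal(per_element(src), bulk(src))
```

The results are identical; only the number of Python-level calls differs.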

caspervdw commented 9 years ago

Thanks for the reply! Apparently I didn't understand how the numpy interface works. However, on my installation a[:] gives a list and not a numpy array. If I do np.array(a[:]), there is a large speed increase (14 s to 0.4 s), but it is still much slower than the sub-millisecond speed you report.

I just got the new version 0.6.0 from pypi, still with the same results. Could this have to do something with my C compiler (msvc) or platform (win64)?

marscher commented 9 years ago

If numpy was installed when you set up JPype, the numpy extension, which gives the speed advantage, was compiled in. Otherwise it falls back to lists, which are painfully slow.

In [3]: a = jpype.java.nio.ByteBuffer.allocate(2*1024**2).array()

In [4]: a
Out[4]: <jpype._jarray.byte[] at 0x7f140a4b7110>

In [5]: type(a[:])
Out[5]: numpy.ndarray

caspervdw commented 9 years ago

So there is something wrong with my installation. I did get the notification:

Turned ON Numpy support for fast Java array access

I just rebuilt it from the 0.6.0 source. The unit tests do not work on my system; I am missing some jars... is there anything else I could check?

marscher commented 9 years ago

If you get this message during setup, you should be fine.

Is the type of a[:] really list?

caspervdw commented 9 years ago

Yes it is really a list, type(a[:]) returns list and not numpy.ndarray. Is there some flag I can set somewhere to get numpy behaviour?

marscher commented 9 years ago

Can you try downgrading to 0.5.7?

caspervdw commented 9 years ago

Still same issue

marscher commented 9 years ago

very strange. Which python/numpy version are you using?

caspervdw commented 9 years ago

Currently python 2.7.8 with numpy 1.9.1.

sys.version
> '2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Jul  2 2014, 15:12:11) [MSC v.1500 64 bit (AMD64)]'

caspervdw commented 9 years ago

And now also tested with numpy 1.9.2, same issue.

marscher commented 9 years ago

Seems like you have discovered a bug! Sorry! In a few days I will have time to investigate.

caspervdw commented 9 years ago

No problem, thanks for the help :) If you need anything please let me know, although I cannot be of great help on the C or java side.

caspervdw commented 9 years ago

I just tried it on Python 3.4, win64 and got the same issue. It really seems to have to do with the platform.

I did develop a workaround that is compatible with both Python 2.7 and 3.4:

def convert_jbyte_stringbuffer(a):
    # round-trip through a Java String to pull the bytes out in bulk
    java_str = jpype.java.lang.String(a, 'ISO-8859-1').toString().encode('UTF-16LE')
    bytearr = np.array(np.frombuffer(java_str, dtype='<u2'), dtype=np.byte)
    return np.frombuffer(bytearr, dtype=_source_dtype)

marscher commented 9 years ago

Actually, I have not tested this feature on Windows (apart from on the AppVeyor cloud). There is a bug in byte conversion: https://ci.appveyor.com/project/marscher/jpype-555/build/tests

Will have a look soon at why this is failing.

marscher commented 9 years ago

Maybe the byte datatype on Windows differs from the one used by numpy/Java.

michael-betz commented 9 years ago

Hi, this might or might not be a related topic: when I try to convert a 2D JArray to numpy, I also get a list() and the conversion is several orders of magnitude slower than for a 1D array of the same size. I work on linux 64 bit and Python 3.4.3 :: Anaconda 2.1.0 (64-bit)

Here is an example

import numpy as np
from jpype import JArray, JDouble

# 1D case is fast
sourceDat = np.arange(1000000, dtype=np.double)
a = JArray(JDouble, sourceDat.ndim)(sourceDat.tolist())
%timeit np.array(a[:])
# 100 loops, best of 3: 4.24 ms per loop

# 2D case is very, very slow
sourceDat = np.arange(1000000, dtype=np.double).reshape((1000, -1))
a = JArray(JDouble, sourceDat.ndim)(sourceDat.tolist())
%timeit np.array(a[:])
# 1 loops, best of 3: 18.9 s per loop

Note that if I use a[:] on the 2D array, I get a list with 1000 jpype._jarray.double[] elements.

I would love to have the [:] shortcut extended to 2D arrays :)

Cheers Michael

marscher commented 9 years ago

In Java, an n-d array with n>1 is an object array on all but the last dimension. So the code handling the conversion for objects is triggered, which is very slow.
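
A pure-numpy analogy of this layout (illustrative only, no JPype involved): a Java double[][] behaves like an object array whose elements are separate 1-D arrays, not one flat 2-D block of primitives.

```python
import numpy as np

# emulate the Java layout: the outer level holds object references,
# only the innermost level is primitive data
rows = np.empty(3, dtype=object)
for i in range(3):
    rows[i] = np.zeros(4)

assert rows.shape == (3,)       # outer level: object references
assert rows[0].shape == (4,)    # inner level: real primitive data
```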

If we put effort into this, I would like to implement it for n-d case. We would then unwrap until we have 1d arrays and assign them to a result array of the right type and dimension.

The same thing applies for assignments.

michael-betz commented 9 years ago

Hi,

thanks, your explanation makes complete sense! I also agree that the [:] conversion should work for any kind of n-d array.

Your idea inspired me to do a really nasty hack for 2D arrays, which seems to improve performance quite a bit:

import numpy as np
from jpype import JArray, JDouble

def convert2DJarrayToNumpy(jArr):
    arrShape = (len(jArr), len(jArr[0]))
    arrType = type(jArr[0][0])
    resultArray = np.empty(arrShape, dtype=arrType)
    for i, cols in enumerate(jArr[:]):
        resultArray[i, :] = cols[:]
    return resultArray

# 2D case, not so slow
sourceDat = np.arange(1000000, dtype=np.double).reshape((1000, -1))
a = JArray(JDouble, sourceDat.ndim)(sourceDat.tolist())
%timeit convert2DJarrayToNumpy(a)
# 10 loops, best of 3: 33.4 ms per loop

caspervdw commented 8 years ago

I checked out #167 and can confirm that the conversion now works for n-d arrays (Py3.4, 64-bit). Thanks a lot!

My working code is now:

im = np.frombuffer(java_array[:], dtype=pixel_type)
im.shape = frame_shape

Closing as the issue has been solved.

edit: correction, I meant PR #164 instead of #167

Kjos commented 5 years ago

Seems the issue is still present with returned values.

    public static byte[] create(float multiplier, int frames, byte[] img, byte[] imd, int w, int h) {
        return new byte[frames*h*w*3];
    }

Below is very quick:

start_time = time.time()
package = JPackage('net').kajos.test
da = JArray(JByte,1)(rimd.flatten().tolist())
ia = JArray(JByte,1)(rimg.flatten().tolist())
result = YourClass.create(0.2, frames, ia, da, shape[1], shape[0])
print("Java --- %s seconds ---" % (time.time() - start_time))

Concatenate that code with the following, and this part takes forever:

im = np.zeros((len(result),))
for i in range(len(result)):
    im[i] = result[i]
im.shape = (frames, h, w, 3)

print("Java --- %s seconds ---" % (time.time() - start_time))

I'm sorry, apparently I was on another branch of JPype. I switched to this branch and all seems to work fine from what I can tell.

vwxyzjn commented 4 years ago

Please consider reopening the issue; the problem is still here with the latest version:

data = np.random.rand(10,10,10)
a = JArray(JDouble, data.ndim)(data.tolist())
def convert3DJarrayToNumpy(jArray):
    # get shape
    arr_shape = (len(jArray),)
    temp_array = jArray[0]
    while hasattr(temp_array, '__len__'):
        arr_shape += (len(temp_array),)
        temp_array = temp_array[0]
    arr_type = type(temp_array)
    # transfer data
    resultArray = np.empty(arr_shape, dtype=arr_type)
    for ix in range(arr_shape[0]):
        for i,cols in enumerate(jArray[ix][:]):
            resultArray[ix][i,:] = cols[:]
    return resultArray, arr_shape, arr_type

%timeit convert3DJarrayToNumpy(a)
# 742 µs ± 2.39 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.array(a[:])
# 5.11 ms ± 103 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
print(jpype.__version__)
# 0.7.1

Thrameos commented 4 years ago

Done. I have been making a lot of speed improvements, but those are targeted for the JPype 0.8 series, not JPype 0.7.x. I do not believe I have gotten to the numpy integration yet.

vwxyzjn commented 4 years ago

@Thrameos Wow that was quick. Do you have the branch name for me to try on maybe?

Thrameos commented 4 years ago

Please look over #528 and #502 for the speed improvements we have made recently. I am not sure they cover this current request, but they gave a speed improvement of about a factor of 3-4 on most of the primary paths for calls and returns. I have plans to revisit numpy as part of 0.8, but there are other structural tasks ahead of that.

Thrameos commented 4 years ago

I have analyzed the code you sent. The problem is that np.array is treating the Java array as a sequence, so it makes a huge number of calls to __next__ and __getitem__. The issue isn't really on the JPype side (though there can be some improvement there).

So I will have to do some research to see what other APIs np.array supports. If I implement a more direct API, then we should achieve greater speed.
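
The sequence-API cost can be made visible with a hypothetical counting wrapper (pure Python/numpy, not JPype code): for a plain sequence, np.array makes at least one __getitem__ call per element.

```python
import numpy as np

class CountingSeq:
    """A sequence that counts __getitem__ calls (illustrative only)."""
    def __init__(self, data):
        self.data = data
        self.calls = 0

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        self.calls += 1
        return self.data[i]

seq = CountingSeq(list(range(100)))
arr = np.array(seq)          # falls back to the sequence protocol
assert arr.shape == (100,)
assert seq.calls >= 100      # at least one Python-level call per element
```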

Thrameos commented 4 years ago

The relevant API is PyArray_GetArrayParamsFromObject

https://stackoverflow.com/questions/40378427/numpy-formal-definition-of-array-like-objects
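
A minimal sketch of that protocol with a hypothetical wrapper class (illustrative, not the proposed JPype implementation): publishing __array_interface__ lets np.array grab the data in bulk instead of walking the sequence API.

```python
import numpy as np

class ArrayLike:
    """Hypothetical array-like: exposes the numpy array interface."""
    def __init__(self, data):
        self._data = np.asarray(data, dtype=np.float64)

    @property
    def __array_interface__(self):
        # delegate to the backing array's own interface dict
        return self._data.__array_interface__

arr = np.array(ArrayLike([1.0, 2.0, 3.0]))
assert arr.shape == (3,)
assert arr.dtype == np.float64
```

The backing buffer must stay alive while the interface is consumed; here the wrapper holds a reference to it.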

vwxyzjn commented 4 years ago

@Thrameos thanks for the incredible work. Does that API mean the improvement has to be done on the Java or C++ side?

Thrameos commented 4 years ago

It will require modifications to the C++ side (and minor modifications to the Python wrappers). It may be possible in JPype 0.7, though likely easier in JPype 0.8. In JPype 0.8, all types derive from C types directly, so they are much faster, rather than indirect like the current JPype objects.

The options are the Python memoryview API, the __array_interface__ API, the __array__ API, or directly deriving JPype types from the numpy array type. I have to figure out which is the most straightforward to work with, and then I can give a time estimate.

vwxyzjn commented 4 years ago

Thanks a lot. Looking forward to it :)

Thrameos commented 4 years ago

I am unsure if any of them will give you a usable API for what you are looking for (at least without a bit of additional code). You want a multidimensional Java array of primitives to automatically convert to an appropriate numpy array of primitives.

The first big challenge is that Java arrays are not guaranteed to be square; multidimensional arrays are technically just Object[], where the Object type changes with the type/dimensions. Pure Object types in numpy tend to do shallow copies of all but square arrays, at least in the testing I have conducted thus far. I could in principle have it automatically look at the array and determine if it could be made into an n-dimensional array and select the result appropriately. But that would mean a huge inconsistency, as I would have to test every dimension in order to determine the type.

Of the APIs available: memoryview offers nothing but pure data types and thus could never represent a non-square Java array; __array_struct__ is strictly a capsule-type interface; and thus __array_interface__ appears to be the only workable one, but it produces shallow copies unless I do something tricky in arbitrarily deciding when it should be Object[] or be flattened to a rectangular data structure. I can, with a little effort, expose all primitive types as simple types and multidimensional arrays as type "|O". But that means you would still need a special copy in order to actually pull all the data over.

I still need some research before I understand all the issues.

Thrameos commented 4 years ago

Of course it is not like numpy does not already have these ambiguities.

import numpy as np
import jpype as jp

jp.startJVM()

jd = jp.JArray(jp.JDouble,2)(4)
jd[0] = jp.JArray(jp.JDouble)([1])
jd[1] = jp.JArray(jp.JDouble)([1,2])
jd[2] = jp.JArray(jp.JDouble)([1])
jd[3] = jp.JArray(jp.JDouble)([1,2,3])

print(np.array(jd[:]))

jd = jp.JArray(jp.JDouble,2)(4)
jd[0] = jp.JArray(jp.JDouble)([1,1])
jd[1] = jp.JArray(jp.JDouble)([1,2])
jd[2] = jp.JArray(jp.JDouble)([1,3])
jd[3] = jp.JArray(jp.JDouble)([1,4])

print(np.array(jd[:]))

This first set gives an array with 4 JDouble[] elements, the second gives a rectangular array of doubles. So perhaps scanning for a square is the appropriate solution.
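
The same ragged-versus-square distinction can be reproduced in pure numpy (illustrative, no JVM needed; the values are arbitrary):

```python
import numpy as np

# Ragged rows can only become a 1-D array of objects...
ragged = np.empty(2, dtype=object)
ragged[0] = np.array([1.0])
ragged[1] = np.array([1.0, 2.0])
assert ragged.shape == (2,)
assert ragged.dtype == object

# ...while equal-length rows collapse into a true rectangular double array.
rect = np.array([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
assert rect.shape == (2, 2)
assert rect.dtype == np.float64
```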

Thrameos commented 4 years ago

Okay, so here is my best guess at the required effort.

Likely at least a week of effort given the number of routines required. Not sure yet when I can book time for that level of effort.

Thrameos commented 4 years ago

I think I will need to handle this one in stages. If I only implement the __array_interface__, then it will only benefit numpy and nothing else. But if I implement the buffer protocol, then it should work everywhere, including numpy. Unfortunately, the buffer protocol can only be implemented on a CPython internal class rather than a Python class. Thus implementing it in JPype 0.7 is pretty much out, as we can't extend our Python JArray class to support that due to the object tree. Thus the only option is the other protocol, which means at minimum twice as much work, as I have to make different implementations in 0.7 and 0.8 (likely much more, as the point of the 0.8 rewrite was to streamline development, so doing 0.7 work is usually much worse).

JPype 0.8 is another story: for speed purposes, all of the wrapper classes have been moved back to CPython. Thus adding a buffer protocol is just a matter of adding Py_tp_as_buffer to the implementation list and implementing the two required functions. I can even make it write through to Java in many cases, so we can support better integration, assuming the reference locks on the buffers can be held open. The tricky part is that this interface can only appear on 1D arrays of primitives, so there will need to be some hacking on the type tree to make PyJPArray and PyJPArrayPrimitive two different types in the internal module. Thus I would get a great deal more leverage if I implement this in 0.8, and the development time would be faster, as it is just adding an object class and 2 hooks.
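
For reference, this is what the buffer protocol gives consumers, shown here with a numpy array as the producer (pure numpy; not the proposed JArray code): any consumer such as memoryview, np.frombuffer, or bytes can read the typed data without per-element calls.

```python
import numpy as np

arr = np.arange(8, dtype=np.int16)
mv = memoryview(arr)            # works because ndarray exposes the buffer protocol
assert mv.format == 'h'         # typed view: 16-bit signed integers
assert mv.nbytes == 16

roundtrip = np.frombuffer(mv, dtype=np.int16)  # bulk read, no iteration
assert int(roundtrip[5]) == 5
```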

This will improve the speed for your 3d example a bit, as we lose one dimension of iterators off the bat (the other two dimensions would still be walked, but there are other ways to speed that up). It may not be as fast as the __array_interface__ for multidimensional arrays, but I can add that API as well on a second pass if needed. And in terms of workload it would not interfere with the current schedule.

My roadmap (mostly as seen in the project page) is

I am currently on schedule for an alpha release of JPype 0.8 series around early April. Unfortunately that means I likely won't have a fix for this thread released until around June.

vwxyzjn commented 4 years ago

That makes sense. Thank you for the great work.

Thrameos commented 4 years ago

Just wanted to update you on the progress thus far. I had some issues with numpy that required more urgent upgrades, and I needed to backport the speed patch for the 0.7 series. With that work complete, I can now resume work on the array transfer capabilities.

Thrameos commented 4 years ago

Completed the first pass for 2d arrays. It was a decent speed up but still not matching the direct method, so I will have to implement another dimension before I can see how much this is helping.

vwxyzjn commented 4 years ago

Hi @Thrameos, thanks for keeping me posted. Looking forward to the resolution.

Thrameos commented 4 years ago

Okay, I have the benchmarking complete on my sad, sad laptop.

JPype 0.7.1

# Reference copy convert3DJarrayToNumPy(jarray)
1.14 ms ± 53.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# np.array(jarray[:])
9.81 ms ± 1.19 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

JPype 0.7.2 (using speed patch from JPype 0.8)

# Reference copy convert3DJarrayToNumPy(jarray)
881 µs ± 134 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# np.array(jarray)
2.09 ms ± 245 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

JPype 0.7.2 + multidim array acceleration

# Reference copy convert3DJarrayToNumPy(jarray)
839 µs ± 91.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# np.array(jarray)
29.7 µs ± 699 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Okay, so 29.7 is bigger than 9.81, so I have clearly failed. Oh wait, I guess the units matter. It is about 326 times faster now.

I also fixed the unreported bug that was requiring "tolist()" to be called to perform the conversion. There was a reference count issue that was accessing a sequence item after the resource was destroyed.

vwxyzjn commented 4 years ago

@Thrameos That's amazing. Thank you so much. I doubt the array conversion will be my bottleneck now.

Thrameos commented 3 years ago

Just a follow-up for future readers: frombuffer is the equivalent of a reinterpret cast and disregards the buffer's reported format type. Though there are cases in which reinterpreting a Java array may be useful, it shouldn't be used when an elementwise copy of an array is desired. Depending on the type and size of the reinterpretation cast, there have been reports of issues, though these have unfortunately not been possible to replicate.

Java arrays implement the buffer protocol and should convert simply by calling np.array, as opposed to np.frombuffer. Even multidimensional arrays can be transferred this way, and they use optimized paths for rectangular arrays up to a depth of 5.
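
The cast-versus-copy distinction can be illustrated in pure numpy (arbitrary example values, no JVM needed):

```python
import numpy as np

src = np.array([1, 2, 3, 4], dtype=np.int32)

# frombuffer reinterprets the raw bytes, ignoring the element type:
reinterpreted = np.frombuffer(src.tobytes(), dtype=np.int8)
assert reinterpreted.shape == (16,)  # 4 ints x 4 bytes each, viewed as bytes

# np.array performs an elementwise copy that honors the element type:
copied = np.array(src, dtype=np.int8)
assert copied.shape == (4,)          # one element per source element
assert int(copied[3]) == 4
```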