touilleMan / godot-python

Python support for Godot 🐍🐍🐍
Other
1.87k stars 142 forks source link

Conversion of Python (NumPy) array to Godot PoolByteArray is slow #305

Closed kb173 closed 3 years ago

kb173 commented 3 years ago

We'd like to convert Matplotlib renderings into Godot textures. We got this working (see https://github.com/boku-ilen/landscapelab/blob/master/Python/PythonTest.py), but the performance is pretty bad. Specifically, this line:

pool_array = PoolByteArray(array.flatten())

which turns a NumPy array of size 3686400 into a Godot PoolByteArray, takes 500ms on average. I realize that it's a large array, but it seems like too big of a bottleneck - it's the most time-consuming operation of the entire script by far.

I initially suspected that the problem lies somewhere around here: https://github.com/touilleMan/godot-python/blob/master/generation/builtins_templates/array.tmpl.pxi#L44 My first idea was to modify it to self.resize() once and then set the elements by index, rather than appending, but this caused no difference in performance (which I found surprising). I tried the same in https://github.com/touilleMan/godot-python/blob/master/generation/pool_arrays_templates/pool_x_array.tmpl.pyx#L32 with a similarly non-significant effect.

However, I later realized that the problem seems to be inherent in setting data in any Godot Array one by one: when I iterate over 3686400 elements and put them into a PoolByteArray in GDScript, I get the same timings of around 500ms. If I instead construct a PoolByteArray out of a Godot Array (again in GDScript) directly, it takes <100ms on average.

Thus, it seems like the only way to drastically improve performance here would be to do a more direct memory-copy of a NumPy Array into a Godot Array or PoolByteArray. I don't know much about under-the-hood memory management in Python and NumPy; does anyone here have an idea of if and how this could be accomplished? Alternatively, maybe there's some indirection here that could be improved by adding more specialized code into the Cython codebase of the addon?

TL;DR: Manually constructing a PoolByteArray by setting each element is slow, both in GDScript and in PythonScript. Is there a way to do a more direct memory-copy of a NumPy Array into a Godot Array or PoolByteArray?

Thank you for the great plugin by the way! Even if the performance issues persist, being able to use Python libraries like Matplotlib in Godot is really powerful.

touilleMan commented 3 years ago

Hi @kb173

Have you tried this ?

How can I efficiently access PoolArrays?

PoolIntArray, PoolFloatArray, PoolVector3Array and the other pool arrays can't be accessed directly because they must be locked in memory first. Use the arr.raw_access() context manager to lock it::

  arr = PoolIntArray() # create the array
  arr.resize(10000)

  with arr.raw_access() as ptr:
      for i in range(10000):
          ptr[i] = i # this is fast

  # read access:
  with arr.raw_access() as ptr:
      for i in range(10000):
          assert ptr[i] == i # so is this

Keep in mind great performances comes with great responsabilities: there is no boundary check so you may end up with memory corruption if you don't take care ;-)

See the godot-python issue.

kb173 commented 3 years ago

Hi! Yes I have, but the performance of that is similar to constructing the PoolByteArray with the Python array as a constructor argument. I assume that's because the same thing happens in the background when doing that. That bottleneck might be unavoidable when iterating over 3686400 individual elements.

Hence I think that the only more performant way would be to copy chunks of memory rather than single elements - ideally something like memcpying the entire array in C. But I don't know whether that's possible, since Python's array storage format might be too different from that of Godot's PoolArrays...

touilleMan commented 3 years ago

As you say, the key is to have your data in C arrays, so they should be compatible between Godot and numpy.

I don't know much about numpy, but I think you should try numpy.ctypeslib.as_array which allows you to create a numpy array from a pointer (without owning the memory, so no risk of double free error)

Godot's pool array is basically a way to have dynamically resizable C array with copy-on-write (see https://github.com/godotengine/godot/blob/e16f264a7e0bec6acd07284a1fc9c256fc483368/core/pool_vector.h#L41-L87).

So I guess you should be able to do something like (didn't try this code, so most likely full of bugs ^^):

arr = PoolByteArray() # create the array
arr.resize(10000)
with arr.raw_access() as ptr:
    np_array = numpy.ctypeslib.as_array(ptr.get_address())
    np_array[1] = 42
# Using np_array here is undefined behavior !!!!
kb173 commented 3 years ago

As far as I can tell your code snippet goes the other direction (PoolByteArray to NumPy Array), but that did point me into the right direction. I think I'm getting close; this is what I have:

pool_array = PoolByteArray()
pool_array.resize(width * height * 3)
with pool_array.raw_access() as ptr:
    ptr = np.ascontiguousarray(array.flatten(), dtype = np.uint8).__array_interface__['data'][0]

Some of the things in that last line might be unnecessary, I was just trying things out. However, this still results in a black image :/ I think it's because I'm replacing the PoolByteArrayWriteAccess object with the plain address. I see that it has get_address(), but no set_address(). I tried implementing something like that and came up with this monstrosity:

def set_address(self, new_address):
    self._gd_ptr = <{{ t.gd_value }} *><uintptr_t>new_address

This does seem to set the _gd_ptr to what it should be! But the result is still a black image. So I guess something might still be wrong with the np.ascontiguousarray(array.flatten(), dtype = np.uint8).__array_interface__['data'][0] line. Maybe something is just going out of scope and being deleted prematurely, so the pointer becomes invalid?

I feel like we're quite close, so if you have any ideas I'd really appreciate some more hints!

touilleMan commented 3 years ago

I don't think there is a way to tell PoolByteArray to change it internal memory buffer (the whole point of pool array being to handle memory buffer for us by providing copy on write and reallocation when needed)

As a matter of fact, the ptr = np.ascontiguousarray(...) doesn't change the pool array at all (because it is modifying the ptr variable within your python function, not the actual pointer stored in the pool array). That why your image is still black: nothing have changed :smile:

I really think you should go the other way around (using memory buffer provided by the pool array) given numpy is used to work with C library where accessing an arbitrary memory buffer is a very common task ;-)

kb173 commented 3 years ago

Ahh, I misunderstood your code example - you're right, going at it from that direction does make sense.

After some playing around with it, I got it to work! \o/ here's my full code for converting a Matplotlib canvas to a Godot image:

def canvas_to_godot_image(self, canvas):
    image = Image()

    width = canvas.get_width_height()[1]
    height = canvas.get_width_height()[0]

    size = width * height * 3

    pool_array = PoolByteArray()
    pool_array.resize(size)

    with pool_array.raw_access() as ptr:
        numpy_array = np.ctypeslib.as_array(ctypes.cast(ptr.get_address(), ctypes.POINTER(ctypes.c_uint8)), (size,))
        numpy_array[:] = np.frombuffer(canvas.tostring_rgb(), dtype=np.uint8).reshape(canvas.get_width_height()[::-1] + (3,)).flatten()

    image.create_from_data(height, width, False, Image.FORMAT_RGB8, pool_array)
    return image

This is indeed 4x faster than the naive approach of constructing the PoolByteArray directly like pool_array = PoolByteArray(np.frombuffer(canvas.tostring_rgb(), dtype=np.uint8).reshape(canvas.get_width_height()[::-1] + (3,)).flatten())! (Maybe it could be sped up even more by using something more low-level than numpy_array[:] =, but from what I gathered, that method is quite well optimized.)

I wonder if something like this code snippet could actually be used in godot-python's PoolByteArray constructor rather than the current implementation, at least if the input is a Numpy array - that would spare people from doing this optimization by hand in the future. I guess you could check for the type of the passed array and, if it is a Numpy array, do the whole np.ctypeslib.as_array and np.frombuffer stuff rather than the default implementation, right? If that makes sense to you, I can try to open a pull request.

Thank you for your help - this does make it fairly suitable for real-time use :)

touilleMan commented 3 years ago

I wonder if something like this code snippet could actually be used in godot-python's PoolByteArray constructor rather than the current implementation, at least if the input is a Numpy array - that would spare people from doing this optimization by hand in the future. I guess you could check for the type of the passed array and, if it is a Numpy array, do the whole np.ctypeslib.as_array and np.frombuffer stuff rather than the default implementation, right? If that makes sense to you, I can try to open a pull request.

I'd rather not add numpy stuff onty godot-python as it would make things more complex and slower (especially if numpy is not installed by user). This is the kind of thing that is really specific to one's workflow so it's better to provide low-level tools and documentation instead ;-)

TheBricktop commented 2 years ago

Considering that lots of people here are using the python extension exaclty for AI ML and CV some performant numpy support might be very usefull, maybe this should be put into separate tools addon like godot-python-contrib ?

touilleMan commented 2 years ago

@TheBricktop since last year my position has changed on this topic: as you say Python for Godot seems to be mostly used by people doing science-related stuff, so I guess having a good numpy integration makes sense.

Regarding what can be done on this topic, I won't merge anything for Godot 3 given I don't have the bandwidth for it (I dedicate all my time on the Godot 4 support that will make Godot-Python better than anything before :smile: )

So the best thing is to wait for a first release of Godot-Python for Godot 4 and provides PRs for it.

In the meantime it would be great to open an issue about numpy integration to have people list they usecases and needs (I'm myself not a numpy user so I'd definitely need this to understand what features are useful)

TheBricktop commented 2 years ago

I might open an issue that would collect suggestions and ideas considering usage of numpy.