google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Android/Java PoseLandmarker unnecessary []byte garbage #5624

Open calbot opened 1 week ago

calbot commented 1 week ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

Android

MediaPipe Tasks SDK version

No response

Task name (e.g. Image classification, Gesture recognition etc.)

PoseLandmarker

Programming Language and version (e.g. C++, Python, Java)

Kotlin/Java

Describe the actual behavior

AndroidPacketGetter.copyRgbToBitmap calls ByteBuffer.allocateDirect on every invocation, which creates a massive amount of byte array garbage for the GC to collect.

Describe the expected behaviour

Reuse byte[] on the same thread in AndroidPacketGetter.copyRgbToBitmap

Standalone code/steps you may have used to try to get what you need

AndroidPacketGetter is misusing ByteBuffer.allocateDirect, which is meant for long-lived ByteBuffers. It seems to be a major source of memory allocation and GC activity for PoseLandmarker.

Maybe these buffers should be cached in a ThreadLocal, keyed by size (width * height), or reused in some other way to avoid so many allocations. Alternatively, the byte buffer could be passed in from PoseLandmarker.createFromOptions, where AndroidPacketGetter.getBitmapFromRgb is called.
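To make the "thread-local, cached by size" idea concrete, here is a minimal plain-Java sketch. DirectBufferCache and rgbaBuffer are hypothetical names, not part of MediaPipe, and the sketch assumes a runtime where ThreadLocal.withInitial is available. It keys on the exact byte size, so a caller always gets a buffer whose capacity matches the frame.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of MediaPipe): per-thread direct buffers keyed by byte size. */
final class DirectBufferCache {
  // Each thread owns its own map, so no synchronization is needed.
  private static final ThreadLocal<Map<Integer, ByteBuffer>> BUFFERS =
      ThreadLocal.withInitial(HashMap::new);

  private DirectBufferCache() {}

  /** Returns the calling thread's direct buffer of exactly width * height * 4 bytes. */
  static ByteBuffer rgbaBuffer(int width, int height) {
    // multiplyExact throws on int overflow instead of silently wrapping.
    int size = Math.multiplyExact(Math.multiplyExact(width, height), 4);
    // Allocate only the first time this thread asks for this size.
    ByteBuffer buffer = BUFFERS.get().computeIfAbsent(size, ByteBuffer::allocateDirect);
    buffer.clear(); // resets position and limit; it does not zero the contents
    return buffer;
  }
}
```

The map in this sketch is unbounded. If frame sizes vary a lot, a real version would probably want to cap it or keep only the most recent size.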

Other info / Complete Logs

No response

calbot commented 1 week ago

Here's an idea for improved code that reduces the number of ByteBuffer.allocateDirect calls by reusing the buffer. Untested.

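  // One direct buffer per thread, reused across calls instead of reallocated each time.
  private static final ThreadLocal<ByteBuffer> threadLocalBuffer = new ThreadLocal<>();

  private static void copyRgbToBitmap(Packet packet, Bitmap mutableBitmap, int width, int height) {
    int bufferSize = width * height * 4; // 4 bytes per RGBA pixel
    ByteBuffer buffer = threadLocalBuffer.get();
    // Reallocate on any size change rather than only when too small. Assumption:
    // the native copy may reject a buffer whose capacity is not exactly width * height * 4.
    if (buffer == null || buffer.capacity() != bufferSize) {
      buffer = ByteBuffer.allocateDirect(bufferSize);
      threadLocalBuffer.set(buffer);
    } else {
      buffer.clear(); // reset position and limit; contents are overwritten below
    }

    PacketGetter.getRgbaFromRgb(packet, buffer);
    buffer.position(0);
    mutableBitmap.copyPixelsFromBuffer(buffer);
  }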

calbot commented 2 days ago

Also, this reuses the Bitmap per thread to reduce time spent in GC:

  private static final ThreadLocal<Bitmap> threadLocalBitmap = new ThreadLocal<>();

  /**
   * Gets an {@code ARGB_8888} bitmap from an RGB mediapipe image frame packet.
   *
   * <p>The returned bitmap is cached per thread and overwritten by the next call on the same
   * thread, so callers that need to keep the pixels must copy them first.
   *
   * @param packet mediapipe packet
   * @return {@link Bitmap} with pixels copied from the packet
   */
  public static Bitmap getBitmapFromRgb(Packet packet) {
    int width = PacketGetter.getImageWidth(packet);
    int height = PacketGetter.getImageHeight(packet);

    // Get the thread-local bitmap
    Bitmap bitmap = threadLocalBitmap.get();

    // If the bitmap is null or the size doesn't match, create a new one
    if (bitmap == null || bitmap.getWidth() != width || bitmap.getHeight() != height) {
      bitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888);
      threadLocalBitmap.set(bitmap);
    }

    // Clear the bitmap content for reuse
    bitmap.eraseColor(android.graphics.Color.TRANSPARENT);

    // Fill the bitmap with the data
    copyRgbToBitmap(packet, bitmap, width, height);
    return bitmap;
  }
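
One consequence of returning a thread-local object is worth spelling out: every call on the same thread hands back the same instance, so an earlier result is overwritten by the next call. The sketch below is a plain-Java analogy rather than MediaPipe code. FrameHolder and render are hypothetical, and an int[] stands in for the Bitmap.

```java
import java.util.Arrays;

/** Hypothetical plain-Java stand-in for the thread-local Bitmap reuse above. */
final class FrameHolder {
  private static final ThreadLocal<int[]> PIXELS = new ThreadLocal<>();

  private FrameHolder() {}

  /** Returns the calling thread's shared pixel array, refilled with {@code value}. */
  static int[] render(int pixelCount, int value) {
    int[] pixels = PIXELS.get();
    // Allocate only on first use or when the requested size changes.
    if (pixels == null || pixels.length != pixelCount) {
      pixels = new int[pixelCount];
      PIXELS.set(pixels);
    }
    Arrays.fill(pixels, value); // overwrite every pixel, like the copy into the bitmap
    return pixels;
  }
}
```

Calling render(4, 1) and then render(4, 2) on one thread returns the same array both times, so the first result now reads as all 2s. A caller that needs to keep a frame would have to take a copy (clone() here, and presumably Bitmap.copy on Android) before the next call.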