google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0
27.49k stars 5.15k forks

Convert android.media.Image (YUV_420_888) into mediapipe.framework.TextureFrame or bitmap for ImageSolutionBase.send() #3916

Closed MasterHansCoding closed 1 year ago

MasterHansCoding commented 1 year ago

Do you know how to do this?

ayushgdev commented 1 year ago

Hello @MasterHansCoding Please refer to the following pseudocode to convert android.media.Image to Bitmap:

1. From the image, create buffers for each plane (Y,U,V)
2. Get row and pixel strides for Y plane
3. Get row and pixel strides for U/V plane (The U/V planes are guaranteed to have the same row stride and pixel stride.)
4. Iterate over the image pixels. For each pixel:
    4.1. Get the Y-plane value using the Y strides calculated
    4.2. Get the U/V-plane value using the U/V strides calculated
    4.3. Mask the Y byte into the [0, 255] range
    4.4. Mask the U/V bytes into [0, 255], then subtract 128 to center them around zero
    4.5. Calculate the RGB values from the adjusted YUV values
    4.6. Store the converted RGB values into an ARGB int array
5. Convert the ARGB int array to a Bitmap

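The per-pixel part of the steps above (4.3-4.6) can be sketched in plain Java with no Android dependency. The class name `YuvToArgb` and the BT.601-style float coefficients are illustrative, not part of the MediaPipe API:

```java
public final class YuvToArgb {
    // Small clamp helper (Java 21 has Math.clamp; this works everywhere).
    static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    // yByte/uByte/vByte are the raw plane bytes for one pixel.
    public static int toArgb(byte yByte, byte uByte, byte vByte) {
        int y = yByte & 0xff;         // step 4.3: mask into [0, 255]
        int u = (uByte & 0xff) - 128; // step 4.4: center chroma around zero
        int v = (vByte & 0xff) - 128;

        // step 4.5: YUV -> RGB with BT.601-style float coefficients
        int r = clamp((int) (y + 1.370705f * v), 0, 255);
        int g = clamp((int) (y - 0.698001f * v - 0.337633f * u), 0, 255);
        int b = clamp((int) (y + 1.732446f * u), 0, 255);

        // step 4.6: pack into one ARGB_8888 int, alpha = 255 (opaque)
        return (255 << 24) | (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        // Neutral chroma (0x80) should give a pure gray at the Y value.
        int argb = toArgb((byte) 0x80, (byte) 0x80, (byte) 0x80);
        System.out.println(Integer.toHexString(argb)); // prints ff808080
    }
}
```

For step 5, `Bitmap.createBitmap(argbArray, width, height, Bitmap.Config.ARGB_8888)` turns the resulting int array into a Bitmap.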
google-ml-butler[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 1 year ago

Closing as stale. Please reopen if you'd like to work on this further.

MasterHansCoding commented 1 year ago

It's working 😭

Here's the relevant part of my code; it is inspired by this article: https://blog.minhazav.dev/how-to-convert-yuv-420-sp-android.media.Image-to-Bitmap-or-jpeg/


        // get buffers
        ByteBuffer bufferY = cameraImage.getPlanes()[0].getBuffer();
        ByteBuffer bufferU = cameraImage.getPlanes()[1].getBuffer();
        ByteBuffer bufferV = cameraImage.getPlanes()[2].getBuffer();

        // ARGB array needed by Bitmap static factory method I use below.
        int width = cameraImage.getWidth();
        int height = cameraImage.getHeight();
        int[] argbArray = new int[width * height];

        int pixelStrideY = cameraImage.getPlanes()[0].getPixelStride();
        int pixelStrideU = cameraImage.getPlanes()[1].getPixelStride();
        int pixelStrideV = cameraImage.getPlanes()[2].getPixelStride();
        int rowStrideY = cameraImage.getPlanes()[0].getRowStride();
        int rowStrideU = cameraImage.getPlanes()[1].getRowStride();
        int rowStrideV = cameraImage.getPlanes()[2].getRowStride();

        int r, g, b;
        int yValue, uValue, vValue;

        for (int y = 0; y < height; ++y) {
          for (int x = 0; x < width; ++x) {
            int yIndex = (y * rowStrideY) + (x * pixelStrideY);
            // Y plane should have positive values belonging to [0...255]
            yValue = (bufferY.get(yIndex) & 0xff);

            int uvx = x / 2;
            int uvy = y / 2;
            // U/V values are subsampled, i.e. each pixel in the U/V channel of a
            // YUV_420 image acts as the chroma value for 4 neighbouring pixels
            int uvIndex = (uvy * rowStrideU) +  (uvx * pixelStrideU);

            // U/V values ideally fall under [-0.5, 0.5] range. To fit them into
            // [0, 255] range they are scaled up and centered to 128.
            // Operation below brings U/V values to [-128, 127].
            uValue = (bufferU.get(uvIndex) & 0xff) - 128;
            vValue = (bufferV.get(uvIndex) & 0xff) - 128;

            // Compute RGB values per the standard YUV-to-RGB conversion formula.
            r = (int) (yValue + 1.370705f * vValue);
            g = (int) (yValue - (0.698001f * vValue) - (0.337633f * uValue));
            b = (int) (yValue + 1.732446f * uValue);
            // clamp is a small helper: Math.max(lo, Math.min(hi, v))
            r = clamp(r, 0, 255);
            g = clamp(g, 0, 255);
            b = clamp(b, 0, 255);

            // Use 255 for alpha value, no transparency. ARGB values are
            // positioned in each byte of a single 4 byte integer
            // [AAAAAAAARRRRRRRRGGGGGGGGBBBBBBBB]
            int argbIndex = y * width + x;
            argbArray[argbIndex]
                    = (255 << 24) | (r & 255) << 16 | (g & 255) << 8 | (b & 255);
          }
        }
        Bitmap bitmap = Bitmap.createBitmap(argbArray, width, height, Bitmap.Config.ARGB_8888);

        try {
          hands.send(bitmap);
          //textView.setText("it is working");
        } catch (RuntimeException e) {
          // catch RuntimeException rather than Error: MediaPipe failures
          // surface as runtime exceptions
          textView.setText("ici " + e.toString());
        }

Thanks for the recommendations, ayushgdev.

Here's my part for the mediapipe project 😃

PS: This process is quite slow. To make it faster, the article proposes using RenderScript, but since RenderScript is deprecated on Android, Vulkan should be used instead.
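Short of a GPU path, one cheap pure-Java speedup is to bulk-copy each plane into a `byte[]` once (via `ByteBuffer.get(byte[])`) instead of calling `ByteBuffer.get(int)` per pixel, and to replace the float multiplies with fixed-point integer math. A minimal sketch; the class name and the 1024-scaled coefficients are illustrative, and the plane layout is assumed to match the code above:

```java
public final class FastYuvToArgb {
    public static int[] toArgb(byte[] yPlane, byte[] uPlane, byte[] vPlane,
                               int width, int height,
                               int rowStrideY, int pixelStrideY,
                               int rowStrideUV, int pixelStrideUV) {
        int[] argb = new int[width * height];
        for (int y = 0; y < height; ++y) {
            int yRow = y * rowStrideY;
            int uvRow = (y / 2) * rowStrideUV;
            for (int x = 0; x < width; ++x) {
                int Y = yPlane[yRow + x * pixelStrideY] & 0xff;
                int uvIndex = uvRow + (x / 2) * pixelStrideUV;
                int U = (uPlane[uvIndex] & 0xff) - 128;
                int V = (vPlane[uvIndex] & 0xff) - 128;

                // Fixed-point YUV -> RGB: the float coefficients above,
                // scaled by 1024 and shifted back down (no float math).
                int r = Y + ((1404 * V) >> 10);
                int g = Y - ((715 * V + 346 * U) >> 10);
                int b = Y + ((1774 * U) >> 10);
                r = r < 0 ? 0 : (r > 255 ? 255 : r);
                g = g < 0 ? 0 : (g > 255 ? 255 : g);
                b = b < 0 ? 0 : (b > 255 ? 255 : b);

                argb[y * width + x] = (255 << 24) | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }

    public static void main(String[] args) {
        // 2x2 gray test frame: Y = 0x80 everywhere, neutral chroma.
        byte[] yP = {(byte) 0x80, (byte) 0x80, (byte) 0x80, (byte) 0x80};
        byte[] uP = {(byte) 0x80};
        byte[] vP = {(byte) 0x80};
        int[] out = toArgb(yP, uP, vP, 2, 2, 2, 1, 1, 1);
        System.out.println(Integer.toHexString(out[0])); // prints ff808080
    }
}
```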

This issue can be closed.

arianaa30 commented 6 months ago

I want to use this function, but as you said it is very slow. How can we rewrite it to be just as fast without using RenderScript?