etiennedub / pyk4a

Python 3 wrapper for Azure-Kinect-Sensor-SDK
MIT License

capture_set_color_image with K4A_IMAGE_FORMAT_COLOR_MJPG #212

Open Westerby opened 1 year ago

Westerby commented 1 year ago

Hello,

I am working on a case that requires modifying the color frames of a Kinect recording. What I want to do is read each capture from a recording, modify its color image, and write the capture to a new recording.

So far I have added a setter method in capture.py:

@color.setter
def color(self, color_image: np.ndarray):
    if self._color is not None:
        del self._color

    k4a_module.capture_set_color_image(self._capture_handle,
                                       self.thread_safe,
                                       color_image,
                                       self._color_timestamp_usec,
                                       self._color_system_timestamp_nsec)
    self._color = color_image

Implementation of capture_set_color_image in pyk4a.cpp:

static PyObject *capture_set_color_image(PyObject *self, PyObject *args)
{
    k4a_capture_t *capture_handle;
    PyObject *capsule;
    int thread_safe;
    unsigned long long device_timestamp_usec = 0;
    unsigned long long system_timestamp_nsec = 0;
    PyThreadState *thread_state;
    PyArrayObject *in_array;
    k4a_image_t img_dst = NULL;
    k4a_result_t res = K4A_RESULT_FAILED;

    // "K" parses an unsigned 64-bit value; "i" would only fill an int-sized part of each timestamp
    if (!PyArg_ParseTuple(args, "OpO!KK", &capsule, &thread_safe, &PyArray_Type, &in_array,
                          &device_timestamp_usec, &system_timestamp_nsec))
        return NULL;
    capture_handle = (k4a_capture_t *)PyCapsule_GetPointer(capsule, CAPSULE_CAPTURE_NAME);

    res = numpy_to_k4a_image(in_array, &img_dst, K4A_IMAGE_FORMAT_COLOR_BGRA32);
    thread_state = _gil_release(thread_safe);

    // Only touch the image handle if it was actually created
    if (K4A_RESULT_SUCCEEDED == res) {
        k4a_image_set_device_timestamp_usec(img_dst, device_timestamp_usec);
        k4a_image_set_system_timestamp_nsec(img_dst, system_timestamp_nsec);
        // Set the color image on the capture object
        k4a_capture_set_color_image(*capture_handle, img_dst);
        // Release our own reference to the image handle
        k4a_image_release(img_dst);
    }
    // Re-acquire the GIL before returning to Python
    _gil_restore(thread_state);

    return Py_BuildValue("");
}

The working example can be something like this:

import os

import cv2
from pyk4a import PyK4APlayback, PyK4ARecord, PyK4A
from pyk4a.config import Config

filepath_rgba = "<rgba_mkv_file_path>"
assert os.path.isfile(filepath_rgba), "input recording not found"

playback = PyK4APlayback(path=filepath_rgba)
playback.open()

device = PyK4A(config=playback.configuration, device_id=1)
cfg = Config(color_resolution=playback.configuration["color_resolution"],
       color_format=playback.configuration["color_format"],
       depth_mode=playback.configuration["depth_mode"],
       camera_fps=playback.configuration["camera_fps"],
       synchronized_images_only=True,
       wired_sync_mode=playback.configuration["wired_sync_mode"],
       subordinate_delay_off_master_usec=playback.configuration["subordinate_delay_off_master_usec"],
       )

rec = PyK4ARecord(path="<otp_modified_mkv_path>", device=device, config=cfg)
rec.create()

while True:
    try:
        capture = playback.get_next_capture()
    except EOFError:
        # Reached the end of the recording
        break
    if capture.color is None:
        continue
    array_color = capture.color
    # Draw a filled red square (BGR order) in the top-left corner
    cv2.rectangle(array_color, (0, 0), (200, 200), (0, 0, 255), -1)
    capture.color = array_color
    rec.write_capture(capture)

rec.flush()
rec.close()
playback.close()

The code works with K4A_IMAGE_FORMAT_COLOR_BGRA32 images, minus some minor issues (it doesn't write Device information to the recording).

I'd like to extend the functionality to modify K4A_IMAGE_FORMAT_COLOR_MJPG format as well, but I can't get numpy_to_k4a_image to work with MJPG data. I set

pixel_size = (int)sizeof(uint8_t);
stride = 0;

and pass those to k4a_image_create_from_buffer, but I get the following errors:

[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (381): Invalid argument to image_inc_ref(). image_handle (0000006621FEF310) is not a valid handle of type k4a_image_t
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (51): k4a_image_t_get_context(). Invalid k4a_image_t 0000006621FEF310
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (363): Invalid argument to image_dec_ref(). image_handle (0000006621FEF310) is not a valid handle of type k4a_image_t
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (51): k4a_image_t_get_context(). Invalid k4a_image_t 0000006621FEF310
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (381): Invalid argument to image_inc_ref(). image_handle (0000006621FEF310) is not a valid handle of type k4a_image_t
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (51): k4a_image_t_get_context(). Invalid k4a_image_t 0000006621FEF310
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (396): Invalid argument to image_get_size(). image_handle (0000006621FEF310) is not a valid handle of type k4a_image_t
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (51): k4a_image_t_get_context(). Invalid k4a_image_t 0000006621FEF310
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (389): Invalid argument to image_get_buffer(). image_handle (0000006621FEF310) is not a valid handle of type k4a_image_t
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (51): k4a_image_t_get_context(). Invalid k4a_image_t 0000006621FEF310
[2023-05-24 08:23:12.889] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (363): Invalid argument to image_dec_ref(). image_handle (0000006621FEF310) is not a valid handle of type k4a_image_t
[2023-05-24 08:23:12.890] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (51): k4a_image_t_get_context(). Invalid k4a_image_t 0000006621FEF310
[2023-05-24 08:23:12.890] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (363): Invalid argument to image_dec_ref(). image_handle (0000006621FEF310) is not a valid handle of type k4a_image_t
[2023-05-24 08:23:12.890] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (66): Invalid argument to image_create_from_buffer(). height_pixels <= 0 || height_pixels > 20000
[2023-05-24 08:23:12.890] [error] [t=14424] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\image\image.c (66): image_create_from_buffer() returned failure.

I can't find anything more in the documentation, and there are no examples of this usage. Would you be able to help me with this one?
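
One possible reading of the last two log lines, assuming pyk4a hands MJPG frames to Python as a one-dimensional byte array (which the cv2.imdecode suggestion in this thread also implies): the first dimension of such an array is the compressed size in bytes, so using it as height_pixels would exceed the limit of 20000 quoted in the log. The sketch below mirrors only that height check; height_is_valid is a hypothetical helper, not an SDK function, and the MJPG buffer size is made up.

```python
# Limit taken from the log line:
# "height_pixels <= 0 || height_pixels > 20000"
MAX_HEIGHT_PIXELS = 20000


def height_is_valid(height_pixels: int) -> bool:
    """Mirror the height check reported by image_create_from_buffer()."""
    return 0 < height_pixels <= MAX_HEIGHT_PIXELS


# A BGRA32 frame has shape (height, width, 4), so shape[0] is a real height.
bgra_shape = (1080, 1920, 4)
# Assumption: an MJPG frame is a flat byte buffer; 250_000 is a made-up size.
mjpg_shape = (250_000,)

print(height_is_valid(bgra_shape[0]))  # True
print(height_is_valid(mjpg_shape[0]))  # False
```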

rajkundu commented 1 year ago

I'm not sure if this helps, but perhaps you need to decode the compressed MJPG data into BGRA32 format first, e.g., using array_color = cv2.cvtColor(cv2.imdecode(capture.color, cv2.IMREAD_COLOR), cv2.COLOR_BGR2BGRA)? I ran into similar issues with MJPG vs. BGRA32, so perhaps this comment/issue could help: https://github.com/etiennedub/pyk4a/issues/164#issuecomment-1234687511

Westerby commented 1 year ago

I got it to work by modifying numpy_to_k4a_image to support the K4A_IMAGE_FORMAT_COLOR_MJPG format. For my use case, width_pixels and height_pixels are simply hardcoded for MJPG data in the snippet below (this needs refactoring).

k4a_result_t numpy_to_k4a_image(PyArrayObject *img_src, k4a_image_t *img_dst, k4a_image_format_t format) {

  int width_pixels = img_src->dimensions[1];
  int height_pixels = img_src->dimensions[0];
  int pixel_size;
  int stride_bytes; 
  int buffer_size; 

  switch (format) {
  case K4A_IMAGE_FORMAT_DEPTH16:
  case K4A_IMAGE_FORMAT_CUSTOM16:
  case K4A_IMAGE_FORMAT_IR16:
    pixel_size = (int)sizeof(uint16_t);
    stride_bytes = width_pixels * pixel_size;
    buffer_size = width_pixels * height_pixels * pixel_size;
    break;
  case K4A_IMAGE_FORMAT_COLOR_BGRA32:
    pixel_size = (int)sizeof(uint32_t);
    stride_bytes = width_pixels * pixel_size;
    buffer_size = width_pixels * height_pixels * pixel_size;
    break;
  case K4A_IMAGE_FORMAT_CUSTOM8:
    pixel_size = (int)sizeof(uint8_t);
    stride_bytes = width_pixels * pixel_size;
    buffer_size = width_pixels * height_pixels * pixel_size;
    break;
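  case K4A_IMAGE_FORMAT_COLOR_MJPG:
    // Compressed data: no fixed row stride, and the buffer size is the array's byte count
    pixel_size = (int)sizeof(uint8_t);
    stride_bytes = 0;
    buffer_size = (int)PyArray_NBYTES(img_src);
    // Hardcoded for 1080p data; the array shape does not carry the resolution
    width_pixels = 1920;
    height_pixels = 1080;
    break;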
  default:
    // Not supported
    return K4A_RESULT_FAILED;
  }

  return k4a_image_create_from_buffer(format, width_pixels, height_pixels, stride_bytes,
                                      (uint8_t *)img_src->data, buffer_size, NULL, NULL,
                                      img_dst);
}
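
As an alternative to hardcoding 1920x1080, the frame size can in principle be read from the JPEG stream itself, since a JPEG start-of-frame segment stores the height and width. The following is a rough stdlib-only sketch in Python, not part of pyk4a. jpeg_size is a hypothetical helper, and it assumes a well-formed stream in which only length-prefixed segments precede the start-of-frame segment.

```python
import struct

# Start-of-frame markers, excluding DHT (0xC4), JPG (0xC8) and DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(data: bytes) -> tuple:
    """Return (width, height) read from the first SOF segment of a JPEG."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing JPEG SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte before a marker: skip it.
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                raise ValueError("truncated SOF segment")
            # Segment payload: precision (1 byte), height (2), width (2).
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        # Skip the 2-byte marker plus the segment (length includes itself).
        pos += 2 + length
    raise ValueError("no SOF segment found")
```

With a NumPy array of JPEG bytes, something like jpeg_size(frame.tobytes()) would apply.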

I think the best way to pass them down would be to add a color_resolution parameter to the k4a_module.capture_set_color_image call in color.setter. That would require a further change to numpy_to_k4a_image, because for K4A_IMAGE_FORMAT_COLOR_MJPG the color resolution is not the same as img_src->dimensions.
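
The layout logic with an explicit resolution could look roughly like this, written in Python for brevity rather than in the C++ of pyk4a.cpp. image_layout and its resolution parameter are hypothetical names; the byte sizes mirror the switch in numpy_to_k4a_image, and the MJPG branch assumes a flat array of JPEG bytes.

```python
from typing import Optional, Tuple

# Bytes per pixel for the uncompressed formats handled in numpy_to_k4a_image.
PIXEL_SIZE = {
    "DEPTH16": 2,
    "CUSTOM16": 2,
    "IR16": 2,
    "COLOR_BGRA32": 4,
    "CUSTOM8": 1,
}


def image_layout(fmt: str, shape: Tuple[int, ...],
                 resolution: Optional[Tuple[int, int]] = None):
    """Return (width, height, stride_bytes, buffer_size) for an image array.

    shape is the NumPy shape. resolution is (width, height) and is required
    only for MJPG, whose array is assumed to be a flat buffer of JPEG bytes.
    """
    if fmt == "COLOR_MJPG":
        if resolution is None:
            raise ValueError("MJPG needs an explicit (width, height)")
        width, height = resolution
        # Compressed data has no fixed row stride; size is the byte count.
        return width, height, 0, shape[0]
    if fmt not in PIXEL_SIZE:
        raise ValueError("unsupported format: %s" % fmt)
    height, width = shape[0], shape[1]
    pixel_size = PIXEL_SIZE[fmt]
    return width, height, width * pixel_size, width * height * pixel_size
```

Keeping the resolution a separate argument means the uncompressed formats need no change, and the MJPG case fails loudly when the caller forgets to supply it.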

I tested the setter with BGRA and MJPG data.

Would any of the authors be interested in checking if that's the right approach?