microsoft / Azure-Kinect-Sensor-SDK

A cross platform (Linux and Windows) user mode SDK to read data from your Azure Kinect device.
https://Azure.com/Kinect
MIT License
1.49k stars 620 forks source link

Kinect Eventing API(s) #175

Open wes-b opened 5 years ago

wes-b commented 5 years ago

In 1.1.x we want to add event based API’s to notify the user when devices are connected or disconnected from the PC. The main events to cover are USB device attach and USB device detach. Other events that might be interesting in future iterations might be: the loss of just the color camera, loss of the depth camera, device reset, high water mark on capture queue, to name a few. It would be nice have a single event API that can grow in the future if we want it to.

There is also a desire for the API to work without requiring a handle be opened to any device. Meaning k4a_device_t should not be needed to call this API.

Option 1

This options follows the SDK’s existing style of not having any callback functions. Just like we have k4a_device_get_capture/imu_sample we will add:

K4A_EXPORT k4a_wait_result_t k4a_get_next_event([out][optional] uint32_t *device_index,
                                                [out][optional] char *serial_number,
                                                [out][optional] size_t serial_number_size,
                                                [out] k4a_event_t *event,
                                                [in] int32_t timeout_in_ms);

Device_index, serial_number, and serial_number_size are optional parameters, if the user doesn’t care about them, then they can pass in NULL for those values. The user will have to call into this blocking API to be notified when a device is attached or detached from the PC. I see two calling patterns with this: 1) The user may call this API from a dedicated thread and specify the timeout as K4A_WAIT_INFINITE. 2) The user may poll for updates with a timeout of 0 and do it on the same thread that is waiting for captures. The word ‘next’ is added to the name to imply that we are not keeping history here. So when you call it the first time you can expect it only return events from that moment forward. There needs to be bit of state caching due to the fact an event can happen after k4a_get_next_event has returned and user is currently processing an event. The storing of this state information should be minimal. In the case of using K4A_WAIT_INFINITE a 2nd API is needed to cancel the blocking call when the thread needs to tear down.

K4A_EXPORT void k4a_cancel_get_event(void);

This then drives the k4a_event_t enum to have 3 values:

typedef enum
{
    k4a_event_device_attached = 0, /**< Azure Kinect device has been attached to the PC */
    k4a_event_device_detached,     /**< Azure Kinect device has been detached from the PC */
    k4a_event_wait_canceled,       /**< k4a_get_event has been cancelled by the user */
} k4a_event_t;

When k4a_cancel_get_event() is called, it will unblock k4a_get_next_event() and return k4a_event_wait_canceled. If the call was blocking then its returned immediately. If the k4a_get_next_event was not actively blocking, then the next call to the API will return k4a_event_wait_canceled. Drawbacks:

Option 2

This option is to use a traditional callback based API. We will need 3 API’s: 1) The k4a API that registers the callback function. 2) The k4a API that unregisters the callback function. 3) The users callback function the events will be delivered to. With this implementation events are delivered asynchronously to the user. Multithreaded protection here is the most difficult aspect to get right. And this is usually where bug come from and are the hardest to fix. We also have to decide when the unregistration API can be called.

Summary

The data the user needs is the same in both scenario’s; SN, index, and attach/detach state. The serial number is a string and a copy of it has to be shared with the user in the callback case, so owner ship of that can get tricky. In option 1 the serial number is just copied to users buffer, so ownership of that buffer is never in question. Option 1 is also consistent with the existing API’s. And the option 1 behavior quirk can be addressed with documentation, such as saying “k4a_get_next_event() will return success and an event of xxx_wait_canceled exactly 1 time during or after a call from k4a_cancel_get_next_event().”

schultetwin1 commented 5 years ago

I like option 1 in terms of blocking because its consistent with our other APIs. A couple thoughts though:

First, in the description we talk about a generic eventing API.

It would be nice have a single event API that can grow in the future if we want it to.

However, the parameters for k4a_get_next_event() in option 1 are focused on device enumeration

[out][optional] char *serial_number,
[out][optional] size_t serial_number_size,

If we wanted different events to have different metadata, we'd have to change these parameters. Maybe we should just say that events are just k4a_event_t and to get serial numbers you have to call query devices for our API? That also doesn't sound ideal. Maybe we can make k4a_event_t and enum and a union of all the different data? We'd just have to be careful about backwards compat if the alignment of that union changes.

A second thought about option 1. Do we open ourselves up to a race condition?

  1. User calls k4a_get_device_count()
  2. device count is 0
  3. A new device is attached
  4. User calls k4a_get_next_event() expecting to be notified when a new device arrives.

Since k4a_get_next_event() only works from here on out, we could miss the arrival and hang forever?

wes-b commented 5 years ago

@schultetwin1 I think no matter what the event is, we will need to communicate what device the event is for. Serial number / index is how we do that. Even if we had a high water mark event, we would need to communicate with it which device is triggering the event. If we wanted to distinguish between synchronized captures and IMU then we may define different events that map to a single device.

As for your concern about the race conditions, the order would be more like: 1) User calls k4a_get_next_event(timeout = 0) 2) User calls k4a_get_device_count() 3) Device count is 0 4) A new device with SN 1 is attached 5) User calls k4a_get_next_event() and a device with SN 1 is reported as attached.

We will need to provide a sample of how to ensure all attached devices are enumerated over because that is not exactly obvious and the index may change based on attached devices.

schultetwin1 commented 5 years ago

Ah, good point about which device it is. That makes sense.

In terms of the changing of the order of the race condition I think there is still a race due to this statement in the original proposal:

So when you call it [k4a_get_next_event()] the first time you can expect it only return events from that moment forward

In step 5, the "event" of a device being attached has already passed so it wouldn't be returned.

xthexder commented 5 years ago

I think if we're already opening ourselves up to some callback api's (https://github.com/Microsoft/Azure-Kinect-Sensor-SDK/issues/169) then I think it makes the most sense for an Event API to be callback-based.

That said, a polling API like you proposed is also fairly common, and matches the code flow of get_capture(), get_imu_sample(), etc...

In either option, I think the biggest thing we need to solve is how to pass different data in for different events.

In option 1, the flow around serial_number_size in particular is also somewhat confusing. If you specify NULL size, but still pass in a serial_number buffer, or if your size is too small, we can't remove the event from the queue. This makes the behavior of get_next_event a lot more complex both to implement and document.

In option 2, the serial_number buffer would need to be owned by us, but it would be fairly easy to document that the buffer is valid until the callback returns. The user can make a copy of it if they need it for a longer duration.

Brent-A commented 5 years ago

I'll throw in a hybrid Option 3 for the threading model. Still use callbacks, but let the caller own the thread.

Example API:

typedef enum
{
    k4a_event_device_attached = 0, /**< Azure Kinect device has been attached to the PC */
    k4a_event_device_detached,     /**< Azure Kinect device has been detached from the PC */
} k4a_event_device_t;

typedef void (K4A_DEVICE_WATCHER_CALLBACK)(const char* serial_number, k4a_event_device_t event);

k4a_device_watcher_t k4a_device_watcher_create(K4A_DEVICE_WATCHER_CALLBACK* callback);
void k4a_device_watcher_thread_proc(k4a_device_watcher_t watcher_handle);
void k4a_device_watcher_destroy(k4a_device_watcher_t watcher_handle);

Example usage:


k4a_device_watcher_t g_Watcher = NULL;;

void DeviceEvent(const char* serial_number, event)
{
    printf("Device %s %s\n", serial_number, event == k4a_event_device_attached ? "attached" : "detached");

    if (event == k4a_event_device_attached)
    {
          // Open k4a_device_t
          // Read stream in a loop and process data
          // Close k4a_device_t
    }
}

int MyThread()
{
   // This function returns when g_Watcher is destroyed
   k4a_device_watcher_thread_proc(g_Watcher);
   return 0;
}

void Main()
{
   g_Watcher = k4a_device_watcher_create(K4A_DEVICE_WATCHER_CALLBACK* callback)
   Thread watcherthread = Thread(MyThread);

   // Do stuff 

   // Calling this will cause watcherthread  to terminate
   k4a_device_watcher_destroy(g_Watcher);
}
DerekMa-WP commented 5 years ago

This would be useful to use in the Firmware Tool to be able to remove the sleep and polling.

wes-b commented 5 years ago

Windows version of LIBUSB doesn't support hotplug events. https://github.com/libusb/libusb/issues/86

Linux version requires a thread poll libusb_handle_events_timeout_completed (or equivalent) to trigger the events. I am more seriously considering a callback based design based on this. (Both Win32 device listener and Linux would both have callback functions to deliver arrival and removal events on.

wes-b commented 5 years ago

Testing of this feature on all platforms makes this issue particularly large.

neilDGD commented 4 years ago

+1 for this to be implemented; quite awkward to have to reset our app to enumerate new devices, for our users and for our testers :) Which ever impl works best.