ksze opened this issue 8 years ago
You are right: there is no synchronization guarantee for `SyncMultiFrameListener`, as we don't (didn't) know how to synchronize the frames.
The "sync" in the class name refers to the fact that you can receive frames synchronously in the calling thread by calling `waitForNewFrame()`, not to any synchronization between color and depth. If you just register a frame listener, it will be executed asynchronously on the packet processor's thread. Admittedly, this fact might have gotten lost in the documentation process and is not obvious to a library user (@xlz improve documentation!?).
AFAIK there is no synchronization between the triggers of the color and depth cameras, so the color and depth frames will always be taken at different times (on which of the 9 raw IR images would you synchronize?). Maybe the MultiSourceFrameReader just outputs the color/depth frames with the least difference in timestamp and applies a threshold on the maximum difference? Maybe there is also another sequence number hidden in the packet data?
However, it is probably a good idea to add a frame listener which does some kind of temporal synchronization.
A bit more investigation reveals something interesting.
I tested another PC of the exact same spec, but running Windows 8.1 Professional 64-bit with the official SDK's MultiSourceFrameReader.
The `RelativeTime` of the `ColorFrame` seems to always lag 6.25 or 6.375 ms behind the `RelativeTime` of the `DepthFrame`, `InfraredFrame`, `BodyFrame`, and `BodyIndexFrame`. Meanwhile, the `RelativeTime` always matches among `DepthFrame`, `InfraredFrame`, `BodyFrame`, and `BodyIndexFrame`.

I wonder where the official SDK gets that `RelativeTime`. For one thing, the `RelativeTime` has 0.1 microsecond resolution, unlike the 0.1 millisecond resolution that libfreenect2 gets from the USB packets. I suspect that the `RelativeTime` is actually calculated on the PC, not on the sensor.
If we want to implement a listener that is really synchronized, I think we need to understand the change pattern of the packets' timestamps. As a `uint32_t`, the timestamp necessarily overflows every ~5 days (2^32 ticks / 10000 ticks-per-second / 60 seconds-per-minute / 60 minutes-per-hour / 24 hours-per-day ≈ 4.97 days). However, the value may actually get reset much earlier. There is also a non-zero chance that the behaviour changes from one firmware version to the next, although I think that would be quite stupid and I don't think Microsoft would do that.
Has anybody run tests to check how often the timestamp gets reset?
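For what it's worth, the overflow arithmetic above can be checked with a small sketch. This is illustrative only, not libfreenect2 code; it assumes the 0.1 ms packet-timestamp resolution (10,000 ticks per second) mentioned earlier, and shows that plain unsigned subtraction gives the correct delta across a single wraparound:

```cpp
#include <cassert>
#include <cstdint>

// Device timestamps arrive as uint32_t ticks at 10,000 ticks per second
// (the 0.1 ms resolution discussed above), so the counter wraps roughly
// every 4.97 days.
constexpr uint32_t kTicksPerSecond = 10000;

constexpr double overflowPeriodDays() {
    return 4294967296.0 / kTicksPerSecond / 60.0 / 60.0 / 24.0;
}

// Wrap-safe delta between two consecutive device timestamps: arithmetic on
// uint32_t is modulo 2^32, so one wrap between the samples is handled
// automatically.
inline uint32_t timestampDelta(uint32_t earlier, uint32_t later) {
    return later - earlier;
}
```

Even if the device resets the counter earlier than the natural overflow, a modulo-2^32 delta between consecutive frames stays correct as long as consecutive frames are less than one full period apart.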
Hello everyone :),
Sorry to dig out this topic, but I have a question regarding synchronization too, this time between IR and depth frames. I know they derive from the same sensor, so they are aligned in terms of position (as they say in the MSDN documentation, "Note that the infrared frame is derived from the same sensor as depth, so the images are perfectly aligned. For example, the infrared pixel at row 5 col 9 goes with the depth pixel at row 5 col 9").
As it is said above too, "The RelativeTime of the ColorFrame seems to always lags 6.25 or 6.375 ms behind the RelativeTime of the DepthFrame, InfraredFrame, BodyFrame, BodyIndexFrame. Meanwhile, the RelativeTime always matches among DepthFrame, InfraredFrame, BodyFrame, and BodyIndexFrame."
Do we have the same result with libfreenect2? I'm probably going to use this library instead of the Microsoft SDK because of multiplatform/OS compatibility issues. I'm looking all over the Internet but can't seem to find an answer, maybe because it is too obvious. But I'm still asking, just to make sure.
I'm definitely a noob to Kinect development, so sorry if my question is stupid... but I thought it could be possible that for some reason the computation of the depth map takes longer than 1/30 second; if the sensor still fires images at 30 FPS, we could find ourselves with one data stream lagging behind the other. (Come to think of it, is the depth frame calculated somehow, or is it just raw data from the time-of-flight sensor?)
Thank you in advance for your answer!
@LiquidStorm
> Note that the infrared frame is derived from the same sensor as depth, so the images are perfectly aligned
We do have this guarantee. If one frame takes longer than 1/30s to compute there would be data loss instead of time delay between depth/IR. Imagine this: depth and IR are two sides of one coin, they are computed simultaneously.
> RelativeTime of the ColorFrame seems to always lags 6.25 or 6.375 ms

I don't think it's even correct for the official docs to say this. The color camera has auto exposure, so its timing is always variable. There is no guarantee at the hardware level, so no for the MS SDK and no for ours. There is also no guarantee at the USB level, because the color frames are transmitted without real-time guarantees.
If you just want a comparison of the MS SDK with this library, I can say ours provides the same performance with better compatibility.
Action items
While investigating the behaviour of device timestamps and sequence numbers, we need to bear in mind that under low light conditions the colour camera may lengthen its exposure and effectively halve its frame rate.
In that case, it may be necessary to also look at other metadata on the RGB frame (e.g. exposure time and gamma) to decide how to associate it with depth frames.
In particular, I'm worried about the following sequence of events: colour frame 1 arrives, then depth frames 1 and 2 both arrive before colour frame 2, because the colour stream is running at half rate.
If this happens, is colour frame 1 supposed to be associated with depth frame 1 or with depth frame 2?
I feel like it's not our job to decide whether to drop data just because of intricate details of the timing mechanism. This is policy, and policy should be controlled by the user. If the policy is implemented in this library, it necessitates a large buffer to examine temporal alignment, which increases the delay before the user can get the data.
If the user cares enough about time correctness, they should create two listeners for depth and RGB separately and decide whether the frames are aligned or not.
The things that are needed in this library are better timestamping and info about latency.
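A user-side version of the two-listener policy suggested above might look like the following sketch. The `TimestampMatcher` type, its window size, and the ~16.7 ms threshold are all illustrative assumptions, not libfreenect2 API; timestamps here are device ticks at 0.1 ms, and wraparound is ignored for brevity:

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical user-side policy: buffer recent depth timestamps and pair an
// incoming colour timestamp with the nearest one, rejecting pairs whose gap
// exceeds a threshold.
struct TimestampMatcher {
    std::deque<uint32_t> depth;  // recent depth timestamps (0.1 ms ticks)
    uint32_t maxGapTicks;        // e.g. 167 ticks ~= 16.7 ms, half a frame

    void onDepth(uint32_t t) {
        depth.push_back(t);
        if (depth.size() > 8) depth.pop_front();  // keep a small window
    }

    // Returns the best-matching depth timestamp for a colour frame, if the
    // nearest one is within the threshold; otherwise no match.
    std::optional<uint32_t> onColor(uint32_t t) const {
        std::optional<uint32_t> best;
        uint32_t bestGap = 0;
        for (uint32_t d : depth) {
            uint32_t gap = t > d ? t - d : d - t;
            if (!best || gap < bestGap) { best = d; bestGap = gap; }
        }
        if (best && bestGap <= maxGapTicks) return best;
        return std::nullopt;
    }
};
```

The point of keeping this on the user side, as argued above, is that the user chooses the threshold (and the latency cost of buffering) instead of the library.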
Is there any more news about the timestamps in the Kinect? I am wondering whether the RelativeTime in the official SDK is measured on the device or on the computer (or both).
Here is what I'm seeing from the Windows SDK. I record the RelativeTime from each frame and calculate a delta time by subtracting the previous frame's value from the current one.
I keep an array of the last 150 of these delta times and compute their mean and stddev, then watch this 150-sample window over time.
Recording just IR, and using the RelativeTime from the IR frame:
The mean is stable around 33333x. The stddev is around 4000. These units are Kinect RelativeTime units, so 10,000 ticks per millisecond and 1,000 milliseconds per second.
So I believe I am seeing that the Kinect RelativeTime stamp of the IR sensor varies from an exact 1/30 of a second by about 400 microseconds.
This variance is independent of any possible processing going on. (The Kinect, with the Microsoft SDK, slows down and speeds up IR frame delivery depending on the nearness of objects in the scene.)
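The rolling-window statistics described above can be sketched like this. It is a minimal illustration; the `DeltaWindow` name and structure are mine, not the SDK's, and the deltas are in RelativeTime ticks (10,000 per millisecond):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <deque>

// Keep the last N frame-to-frame deltas and report the window's mean and
// (population) standard deviation, mirroring the measurement described above.
class DeltaWindow {
public:
    explicit DeltaWindow(std::size_t n) : n_(n) {}

    void add(double delta) {
        deltas_.push_back(delta);
        if (deltas_.size() > n_) deltas_.pop_front();  // drop oldest sample
    }

    double mean() const {
        if (deltas_.empty()) return 0.0;
        double s = 0.0;
        for (double d : deltas_) s += d;
        return s / deltas_.size();
    }

    double stddev() const {
        if (deltas_.empty()) return 0.0;
        double m = mean(), s = 0.0;
        for (double d : deltas_) s += (d - m) * (d - m);
        return std::sqrt(s / deltas_.size());
    }

private:
    std::size_t n_;
    std::deque<double> deltas_;
};
```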
--
Recording just Color, and using the RelativeTime from the Color frame:
The color camera is less stable. The mean ranges from 333200 to 333333, and the stddev ranges from 5000 to 6200.
The color camera does switch to 666666 (i.e. 15 FPS) in low light.
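For reference, converting these RelativeTime deltas to frame rates is simple arithmetic, given the 0.1 µs tick size discussed earlier (a small illustrative helper, not SDK code):

```cpp
#include <cassert>
#include <cmath>

// RelativeTime ticks are 0.1 microseconds, i.e. 10,000 ticks per millisecond.
// A frame-to-frame delta in ticks therefore maps to frames per second:
inline double ticksToFps(double deltaTicks) {
    const double ticksPerSecond = 10000.0 * 1000.0;  // 10,000/ms * 1,000 ms/s
    return ticksPerSecond / deltaTicks;
}
```

A delta of 333333 ticks comes out to roughly 30 FPS, and the low-light value of 666666 ticks to roughly 15 FPS, matching the measurements above.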
Do you have any information about the timestamps in the packets? I wonder if they follow a similar pattern?
In the comments above, it says there are 9 raw IR images. Does this mean that for each 1/30 of a second, 9 IR frames are delivered to the client?
Does libfreenect2 use all 9 to create a depth frame?
Do you get a timestamp with each sub frame?
Do you think it might be possible to use 4 sub ir frames from one packet and 5 from another? I am looking to do very precise synchronization between multiple kinects.
Thanks in advance for any answers to my never ending question list.
I have no idea what your measurements mean at first glance.
> Do you have any information about the timestamps in the packets
https://github.com/OpenKinect/libfreenect2/issues/849
> Does this mean for each 1/30 of a second, there are 9 IR frames delivered to the client?
https://github.com/OpenKinect/libfreenect2/wiki/Linux-USB-Notes#design-of-alternative-transfer-pool
> Does libfreenect use all 9 to create a depth frame?
True.
> Do you get a timestamp with each sub frame?
Unnecessary. Isochronous transfers are isochronous.
> Do you think it might be possible to use 4 sub ir frames from one packet and 5 from another?
No.
I added code for timestamps, and I am seeing interesting, good data.
The depth "received timestamp" is the host timestamp of the first whole USB packet received; the k-timestamp is the timestamp that comes with the packet.
At first glance, the delta of the received timestamp aligns nicely with the delta of the Kinect timestamp, which is great.
```
depth received timestamp 1497012622805607 dt 64004 k-timestamp 56060 k-dt 533
depth received timestamp 1497012622853943 dt 48336 k-timestamp 56327 k-dt 267
depth received timestamp 1497012622885779 dt 31836 k-timestamp 56594 k-dt 267
depth received timestamp 1497012622949639 dt 63860 k-timestamp 57127 k-dt 533
depth received timestamp 1497012622981634 dt 31995 k-timestamp 57394 k-dt 267
depth received timestamp 1497012623013618 dt 31984 k-timestamp 57660 k-dt 266
depth received timestamp 1497012623045644 dt 32026 k-timestamp 57927 k-dt 267
depth received timestamp 1497012623077612 dt 31968 k-timestamp 58194 k-dt 267
depth received timestamp 1497012623109594 dt 31982 k-timestamp 58460 k-dt 266
```
I will attach the diff used to create these timestamps. One thing I will probably change for myself is switching the clock to a std::chrono high-resolution clock.
If you look over the diff and notice any obvious problems with how I am doing this, please let me know. However, at this point, I believe it is working. packet_timestamps.txt
Overview Description:

Is there any kind of guarantee about the synchronization among the depth/IR/colour frames when using `SyncMultiFrameListener`? I looked at the implementation of `bool SyncMultiFrameListener::onNewFrame(Frame::Type type, Frame *frame)` and there seems to be no guarantee. The only thing that the listener seems to guarantee is that all subscribed frame types are present. It doesn't seem to care if one frame type is lagging behind another.

For instance, I modified Protonect so it prints out `rgb->timestamp - depth->timestamp` on every frame, and I find that the difference is always 179 or 180 so far, which means 17.9 or 18.0 ms. This is a bit worrying because at 30 fps, the time interval of each frame is only 33.3 ms; 17.9 ms is more than half of that. Do we know how the Kinect device sequences/muxes depth and colour frames? Could we be collecting the depth and colour frames in the wrong order?

If I remember correctly, the official Kinect for Windows v2 SDK seems to have some synchronization guarantee when using its MultiSourceFrameReader. If a depth or colour frame is dropped, the MultiSourceFrameReference will leave it out instead of giving you the last available frame, although I don't know how the official SDK associates the colour frame with the correct depth frame.

I have also looked at the sequence numbers of the frames. There is always a difference between the depth packet's sequence number and the colour packet's sequence number, but the difference varies between initializations of the Kinect device, so I don't think we can reliably use it to associate depth frames with colour frames.
You can also imagine the following sequence of packets, where:

- `C` = received colour packet
- `c` = skipped colour packet
- `D` = received depth/IR packet
- `d` = skipped depth/IR packet

`DCdCDCDcDCDcdcdcdcdcdC`

With the current libfreenect2 implementation, it seems like the `CDCD` after the initial `DCd` would be treated as synchronized frames even though the depth frame is actually lagging behind the colour frame. The mis-ordering could go on for a very long time until a colour frame happens to be skipped (a `c`). The sequence of `DcdcdcdcdcdC` at the end is also bad, because the `D` and the final `C` are in fact separated by many dropped `c` and `d` frames, but they would still be treated as synchronized frames.

Version, Platform, and Hardware Bug Found:
```
git log -1 --oneline
uname -a
lsusb -t
lspci -nn
```
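To make the failure mode from the overview concrete, the `DCdCDCDcDCDcdcdcdcdcdC` sequence above can be replayed with a small sketch (illustrative only, not libfreenect2 code) that pairs frames the way a listener would if it merely waited for one frame of each type. Capture indices count every capture slot, dropped or not:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Replay an event string where upper case means a frame was received and
// lower case means it was dropped. Whenever one received depth frame and one
// received colour frame are both pending, emit them as a "synchronized" pair.
std::vector<std::pair<int, int>> naivePairs(const std::string& events) {
    std::vector<std::pair<int, int>> pairs;
    int depthCaptures = 0, colourCaptures = 0;
    int pendingDepth = 0, pendingColour = 0;  // 0 = no frame waiting
    for (char e : events) {
        if (e == 'D' || e == 'd') ++depthCaptures;
        if (e == 'C' || e == 'c') ++colourCaptures;
        if (e == 'D') pendingDepth = depthCaptures;
        if (e == 'C') pendingColour = colourCaptures;
        if (pendingDepth && pendingColour) {
            pairs.emplace_back(pendingDepth, pendingColour);
            pendingDepth = pendingColour = 0;
        }
    }
    return pairs;
}
```

Tracing the sequence by hand, this pairing matches depth capture 3 with colour capture 2 early on, and its final pair is depth 6 with colour 11, even though those two frames are separated by many dropped `c` and `d` slots, which is exactly the concern raised in the overview.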