microsoft / Azure-Kinect-Sensor-SDK

A cross platform (Linux and Windows) user mode SDK to read data from your Azure Kinect device.
https://Azure.com/Kinect
MIT License

Queue Size for Capture #1639

Closed jcova1996 closed 3 years ago

jcova1996 commented 3 years ago

Hi,

I am using the latest version of the SDK and coding in C#. I was wondering if there's a way to find out what the size of the queue is for the depth and color captures.

I have SynchronizedImagesOnly set to true, and I use a for loop to capture X frames with the GetCapture function. When I run the devices (I have 9 plugged in and they each call this for loop within a Parallel.For) at 5 FPS and try to get 6 frames, the for loop takes less than 1 ms. If I try to get 12 frames, the for loop takes > 850 ms. I'm assuming this is because the queue size is less than 12 but greater than 6, but it would be useful to know what that number is, or how it is calculated from other parameters. If the dramatic difference between a 6-frame capture and a 12-frame capture isn't due to the queue size, what could cause such a difference?

Thanks!

diablodale commented 3 years ago

I recommend you search/review the code in this git repo. It is open source, and the color capture pathway is all open source. The depth pathway has a small component that is closed source.

For example, type "queue size" into the search field at the top and find your answers like https://github.com/microsoft/Azure-Kinect-Sensor-SDK/blob/2feb3425259bf803749065bb6d628c6c180f8e77/include/k4ainternal/queue.h#L32

jcova1996 commented 3 years ago

@diablodale Yes, I have reviewed that code already. Unless my math was wrong, with that code 5 FPS should end up with a queue size of 2. A queue size of 2 makes me even more confused as to why I would get 6 frames so fast when 6 is already greater than 2...
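
For reference, the arithmetic behind that 2 (a sketch only, assuming QUEUE_CALC_DEPTH is evaluated with the configured frame rate rather than a fixed one):

// Sketch of the queue-size math at the configured 5 FPS (an assumption about how
// the macro is applied, not necessarily what the SDK actually does):
// QUEUE_CALC_DEPTH(fps, depth_usec) == depth_usec / (1000000 / fps)
int fps = 5;
int depthUsec = 500_000;                        // QUEUE_DEFAULT_DEPTH_USEC
int queueSize = depthUsec / (1_000_000 / fps);  // 500000 / 200000 == 2 (integer division)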

diablodale commented 3 years ago

I code in C/C++ but I can likely follow your C# code. If you share the code in your test case, it can help. For example, it is unclear how you are calling the 9 sensors (suspicious in itself), whether you are doing so in serial or parallel, and whether you are doing that correctly.

In general, the queues should be independent for each sensor. So you should be creating nine individual sensor object chains, then calling for captures in each of the 9, and then calling for the color frame of those captures, all of it with some type of timeout which you set on your calls. As you can see, there is a lot of variability, and without your code I (personally) can't help much more.
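
A rough sketch of that structure (an illustration only, using the same C# SDK calls as your code; the configuration values here are placeholders, not recommendations):

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Kinect.Sensor;

// One independent device object (and therefore one independent set of queues) per sensor.
int deviceCount = Device.GetInstalledCount();
var devices = new Device[deviceCount];
for (int i = 0; i < deviceCount; i++)
{
    devices[i] = Device.Open(i);
    devices[i].StartCameras(new DeviceConfiguration
    {
        CameraFPS = FPS.FPS5,
        DepthMode = DepthMode.NFOV_Unbinned,
        ColorResolution = ColorResolution.R2160p,
        ColorFormat = ImageFormat.ColorBGRA32,
        SynchronizedImagesOnly = true
    });
}

// Each sensor is then pulled independently, with an explicit timeout on every call,
// and the color frame is read from each capture.
Parallel.For(0, deviceCount, i =>
{
    using (Capture capture = devices[i].GetCapture(TimeSpan.FromSeconds(1)))
    {
        Image color = capture.Color;  // the color frame of that capture
        // ... process the frame here ...
    }
});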

Last, you never say there is a problem. Are you just curious, or do you have a specific test case with clear repro steps that indicates a problem?

jcova1996 commented 3 years ago

I don't think this is a problem. I just don't understand why there is such a big difference in retrieving the captures. The fact that 6 frames can be captured on each device in less than 1 ms tells me the queue size is not 2 for 5 FPS. I would suspect it to be somewhere between 6 and 12, since capturing 12 frames takes ~850 ms.

It'd be nice to know what the queue size is just to have a better understanding of how getting captures works.

As for the code...

I initialize each Kinect with:

var kinectCount = Device.GetInstalledCount();
var devices = new IDevice[kinectCount];
for (var i = 0; i < kinectCount; i++)
{
    devices[i] = new Kinect(configuration, i);  // one wrapper per installed device
}
internal Kinect(BodyScannerConfiguration configuration, int index = 0)
        {
            try
            {
                m_device = Device.Open(index);
            }
            catch
            {
                throw new AzureKinectOpenDeviceException($"Failed to open device {index}");
            }

            DeviceIndex = index;
            CurrentConfiguration = configuration;
            CameraPosition = BodyScanUtil.CameraPositions[DeviceIndex];
            m_deviceConfiguration = GeneralDeviceConfiguration;
            if (!(CurrentConfiguration is BodyScannerConfiguration.CalibrationScan))
            {
                if (_CameraPositionToDaisyChainIndex[CameraPosition] is 0)
                {
                    m_deviceConfiguration.WiredSyncMode = WiredSyncMode.Master;
                }
                else
                {
                    m_deviceConfiguration.WiredSyncMode = WiredSyncMode.Subordinate;
                    m_deviceConfiguration.SuboridinateDelayOffMaster =
                        new TimeSpan(1600 * _CameraPositionToDaisyChainIndex[CameraPosition]);
                }
            }

            m_transformation = m_device
                               .GetCalibration(m_deviceConfiguration.DepthMode, m_deviceConfiguration.ColorResolution)
                               .CreateTransformation();
        }
internal static DeviceConfiguration GeneralDeviceConfiguration =>
            new DeviceConfiguration
            {
                DepthMode = DepthMode.NFOV_Unbinned,
                CameraFPS = FPS.FPS5,
                ColorResolution = ColorResolution.R2160p,
                ColorFormat = ImageFormat.ColorBGRA32,
                SynchronizedImagesOnly = true,
                DepthDelayOffColor = TimeSpan.Zero,
                WiredSyncMode = WiredSyncMode.Standalone
            };

Then I start the streams with:

public void StartStreaming()
        {
            //Need to start front last since it is the master camera
            for (int i = 0, deviceCount = Devices.Length; i < deviceCount; i++)
            {
                if (i == CamPosToDevInd(CameraPosition.Front))
                {
                    continue;
                }

                Devices[i].StartDevice();
                DevicesRunning[i] = true;
            }

            Devices[CamPosToDevInd(CameraPosition.Front)].StartDevice();
            DevicesRunning[CamPosToDevInd(CameraPosition.Front)] = true;
        }
public void StartDevice()
        {
            m_device.SetColorControl(ColorControlCommand.PowerlineFrequency, ColorControlMode.Manual, 2);
            m_device.SetColorControl(ColorControlCommand.ExposureTimeAbsolute, ColorControlMode.Manual, 8330);
            m_device.SetColorControl(ColorControlCommand.Whitebalance, ColorControlMode.Auto, 6050);
            m_device.StartCameras(m_deviceConfiguration);
            if (CurrentConfiguration is BodyScannerConfiguration.CalibrationScan)
            {
                m_device.StartImu();
            }
        }

Then I capture the scans with:

for (var i = 0; i < deviceCount; i++)
{
    Devices[i].Scan();
}
public void Scan()
        {
            var captures = new Capture[FRAMES_AVERAGED];
            var timer = Stopwatch.StartNew();
            for (var i = 0; i < FRAMES_AVERAGED; i++)
            {
                captures[i] = m_device.GetCapture(TimeSpan.FromSeconds(1));
                if (!(captures[i] is null))
                {
                    continue;
                }

                Console.Beep(400, 500);
                Console.WriteLine("Failed to get Capture");
                return;
            }

            timer.Stop();

            Console.WriteLine(
                $"Captured {FRAMES_AVERAGED} frames in {timer.Elapsed.TotalMilliseconds} ms! Processing data for camera{DeviceIndex}");
            m_captures.Add(captures);
        }
diablodale commented 3 years ago

You can fully understand how getting captures works by reading the whole corpus of the open source code in this repo.

Since there isn't a problem you are trying to fix or isolate, I don't want to expend much time on your inquiry. It's my own time-management thing. 🙂 Perhaps msft or someone else has more time available.

The # of queue slots already populated before your first getcapture() will depend on when you make that first getcapture() relative to startup on that specific device. This is an unknown in your scenario and will differ based on the relative startup of the 9 sensors in the chain. However... we can imagine that all nine sensors have been running for many seconds, are never in conflict with exposures and IR strobes, and never have bad captures....

Doing a quick walk of the SDK's code from that code link I put above, I think the SDK's codepath for sync'd images gets into capturesync_create(). In there, three queues are created and none of them depend on the chosen FPS. Captures have a queue size of QUEUE_DEFAULT_SIZE / 2; color and depth each have a queue size of QUEUE_DEFAULT_SIZE: https://github.com/microsoft/Azure-Kinect-Sensor-SDK/blob/04168adc949c44180c51d90a8b8d34807891ec85/src/capturesync/capturesync.c#L417-L430

Separately, the depth engine (which at a low level populates the depth and IR data) has a fixed queue depth of 2. I will ignore that for now.

QUEUE_DEFAULT_SIZE is a fixed value of...

QUEUE_CALC_DEPTH(k4a_convert_fps_to_uint(K4A_FRAMES_PER_SECOND_30), QUEUE_DEFAULT_DEPTH_USEC)
= QUEUE_CALC_DEPTH(30, QUEUE_DEFAULT_DEPTH_USEC)
= QUEUE_CALC_DEPTH(30, 500000)
= 500,000 / (1,000,000 / 30)
= 15

So that suggests to me that the sync'd captures have a queue of 15 / 2 == 7, and color & depth each have a queue of 15.
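
As a rough sanity check against your timings, here is a back-of-the-envelope estimate (my assumptions, not measurements: the sync'd-capture queue holds 7 captures and is already full when your Scan() starts, and new captures arrive every 200 ms at 5 FPS):

// Rough estimate under the assumptions above; queueSlots and framePeriodMs are
// assumptions for illustration, not values read from the SDK at runtime.
const int queueSlots = 7;                 // QUEUE_DEFAULT_SIZE / 2
const double framePeriodMs = 1000.0 / 5;  // 200 ms between frames at 5 FPS

double EstimatedScanMs(int framesRequested) =>
    framesRequested <= queueSlots
        ? 0                                                // served entirely from the backlog
        : (framesRequested - queueSlots) * framePeriodMs;  // the rest arrive in real time

// EstimatedScanMs(6)  ->    0 ms  (consistent with your "< 1 ms" for 6 frames)
// EstimatedScanMs(12) -> 1000 ms  (same order of magnitude as your ~850 ms for 12 frames)

If that model is roughly right, the jump you see between 6 and 12 frames is simply the point where you exhaust the backlog and start waiting on the cameras in real time.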