microsoft / Azure-Kinect-Sensor-SDK

A cross platform (Linux and Windows) user mode SDK to read data from your Azure Kinect device.
https://Azure.com/Kinect
MIT License

Body tracking is too slow and inaccurate #514

Closed rfilkov closed 4 years ago

rfilkov commented 4 years ago

Describe the bug

By all means Azure Kinect is the best Kinect so far, and will probably be the best depth sensor on the market. The sensor SDK is pretty stable and good, providing almost everything an average user would want. But the body tracking subsystem is ruining this positive user experience. In terms of API this SDK is great too, but the DNN model's performance is much worse than the body tracking of Kinect-v2. The joint positions are inaccurate during fast movements. The body index map is not very accurate either; it does not fully match the user's silhouette on the depth frame. On my GTX 1060 it takes 2-3 depth frame cycles to process a body frame. Hence, it works at about 10 fps.

To Reproduce

  1. Run Azure Kinect Body Tracking Viewer.
  2. Stand in front of the sensor.
  3. Make fast arm movements.
  4. Look at the arm joint positions with regard to the real arms.
  5. Look at the colorized body-index map with regard to the real body.

Expected behavior

  1. I expect the body tracking to work at 30 fps or more, i.e. at least as fast as the depth frames arrive.
  2. I expect the body joint positions to match as precisely as possible the user's joints on the depth frame.
  3. I expect the body index map to match as precisely as possible the user's silhouette on the depth frame.
  4. I expect the body tracking to be less demanding in terms of hardware and 3rd-party software requirements. GTX 1070 + CUDA + cuDNN + manually setting paths is too much for the average user.

Please consider at least providing some option for users who don't have high-end graphics cards and would like to get Kinect body tracking out of the box, without (or with minimal) extra installations. As far as I remember, Kinect-v2 used random forest model(s) for its body tracking. The performance was great and no extra installations were needed, back then in 2013/14.

Additional context

I believe most Kinect users would expect a better, more accurate and more performant body tracking experience, not a worse one. And now, with Apple adding people segmentation and body tracking to ARKit 3.0, I would expect Kinect (with all these years of experience) to provide a better user experience in all aspects than anybody else.

cavisions commented 4 years ago

@qm13 Regarding your request - I really hope this message finds you well.

The goal is also to be more accurate. Please provide examples where we are not as accurate.

Please find several examples of very poor accuracy below. The videos were taken with a 6-year-old boy, 122 cm tall and of rather slim build.

Problem #1 "Separated hands forward movement"

image VIDEO separated hands

Problem #2 "Joined hands movement"

image VIDEO joined hands

Problem #3 "Legs to the sides movement"

image VIDEO legs

The problems are less pronounced when it comes to adults, but the accuracy when working with children of this age is unfortunately not good.

fractalfantasy commented 4 years ago

@qm13 thanks for addressing these issues - I've answered your questions and notes in the list below - a lot of the developer expectations on these have already been expressed so I've referenced a few:


1. FPS - 30FPS is fine given the sensor limitations, but it's expected to run at 30FPS on an Intel CPU like its predecessor.

2. LATENCY - Current latency is about 0.25 seconds on a gaming rig - it's expected that the Azure Kinect match or beat the latency of its predecessor (0.12s) and contemporaries (0.006s).

3. CROSS-PLATFORM - current compatibility is with Nvidia 1070 and up - the expectation is for Azure Kinect body tracking to run on any ~$1000 laptop made in the last 5 years, like its predecessor.

4. RESOURCE HOGGING GPU - To answer your question "What is too much GPU?" - Many Kinect developers use the sensor to generate visuals, so give us as much GPU as humanly possible. I'm pretty sure the Kinect v2 gave us 100% of the GPU, that would be perfect.

5. ACCURACY - I have to disagree with you here, it isn't widely agreed upon that the Azure Kinect DNN Body Tracking is more accurate than Kinect 2. So far the only case I've heard of it being more accurate is on static frames where a user is facing away from the camera, which is a pretty niche scenario. Accuracy has been reportedly much worse for dynamic movement than Kinect v2 and this accounts for most use cases of the Kinect.

6. POWER EFFICIENCY - To answer your statement about VSPs - industrial clients shouldn't be expected to have specialized/experimental AI processors on hand, they will likely be using a high volume of cheap, small and efficient computer systems.


TL;DR: We expect Azure Kinect Body Tracking to beat his dad at everything, but right now his dad (Kinect 2 Body Tracking) is able to kick his ass on pretty much all fronts. Azure Kinect Body Tracking is still young and going through puberty and figuring out how to talk to girls, and one day he'll grow to be stronger than his dad, but right now we need his dad to help us build these houses.

All jokes aside, in order to make great applications that will sell the Kinect, we need the strongest body tracking option available now - not what will grow to be strong in 4 years. All your improvements have been very small and incremental so far, and will likely continue at this pace (especially in the midst of Covid). So we really need a random forest option to start building apps, and to finally leave you alone to make the DNN model exactly how you envision it.

dotslinker commented 4 years ago

@qm13 Thanks a lot for your attention. I completely agree with @fractalfantasy.

I beg your pardon if I'm a bit long, but I'd like to tell you a story.

Since 2015 I have been developing a C# app we are currently using in a pediatric rehabilitation lab with children. So far we have been satisfied with Kinect v2 (and its SDK), including its accuracy and fluidity (a fairly constant 30 fps on almost all the PCs I tested it on). Last January I went to Boston for the MIT xR Hackathon, where I tried to combine Kinect for Azure with HoloLens 2 (ARehab team - https://vimeo.com/385983447). I was very positive, since I had found the Kv2 SDK very easy and self-explanatory, and I thought the new version would be the same or even better.

Unfortunately, it was not the case.

The "nightmare" started when I looked for the documentation, trying to figure out what to download in order to have all the proper tools up and running. It was definitely not the same as the v2 SDK, with its simple interface where you could easily select the feature you wanted to implement and be coding within a few minutes. Then I realized that the source example was also completely different, referring to some (undocumented) OpenGL functionality that (in theory) required me to dig further into the OpenGL world to fully handle the code. I only needed to grab the skeleton's joint positions and send them to the HoloLens 2. I was using a DELL notebook equipped with an i9, 32 GB RAM and a GTX 1650 GPU, and I thought it was more than enough for the task.

Unfortunately, it was not the case.

It turned out that the demo C# source code, once compiled, ran on my laptop at less than 25 fps (sometimes even much lower than 20 fps). The "footprint" of the whole binary folder was more than 600 MB, versus the humble 16 MB I was used to with my previous application (which included a lot of processing and charting algorithms too), with the two files for the DNN model and processing engine alone summing up to the planetary size of around 500 MB...

We do not need a spaceship to travel 100 yards: we need something to use now and in the near future, since most of the children that were coming to the lab for rehab activities cannot travel to our center anymore. Hence we are currently looking at the home environment to continue delivering our treatments. The same also happens for other types of patients, especially the elderly, who will very likely be forced to stay at home much longer than the others. At the moment we cannot ask all these people to upgrade their PC to a newer one (paying more than 1000 dollars) just to use the K4A. Especially if we consider that the previous one (Kv2) was running smoothly on a very broad range of computers.

Kv2 latency was OK, and its accuracy was enough for "markerless frontal body tracking", not only for gaming purposes but also for rehab activities. The Kv2 source examples were great, as was the documentation. From my point of view, the only two drawbacks of Kv2 are the cable connection (it was not a USB3-compliant connection) and… its current unavailability for purchase.

K4A appears to be a great piece of hardware, well designed and engineered. It is fully USB3 compliant (and this is great), and has many interesting features (like the tons of microphones). However, so far it seems too bulky to be widely used, mostly because of the SDK, which still seems to be something for… hackers :-). We'd need just an extended version of the previous SDK to use the new Kinect. That would definitely pave the way to universal adoption of the new device, which otherwise will be confined to research labs or technology enthusiasts.

Please consider the opportunity you have (as the Kinect for Azure development team) to greatly help, in these very tough times, many people who are now forced (and will likely be for a while) to stay at home. And, more generally speaking, to boldly contribute to the development of a novel way to treat people at home instead of making them travel for hours just to get to the center where they may be treated.

P.S. Sorry for this post being a bit OT and long, but I just wanted to present a real case in which the current performance of K4A is way too poor on most of the computers which instead could/should use it.

PierrePlantard commented 4 years ago

I totally agree with the previous comment. Accuracy during self-occlusion, latency and hardware requirements limit the use of the Kinect for Azure. I would nevertheless salute one notable improvement of the Kinect for Azure body tracking: the ability to estimate a person's posture from the back. This is a very important feature for us, and it will have to be accessible in the light version that is planned.

vpenades commented 4 years ago

@qm13, keep in mind many of us already have products built on the Kinect2, and the new K4A body tracker drifts too far from the Kinect2 in terms of features and performance to simply replace one with the other. In many cases it means a complete product redesign.

The mistake here has been to deliver the new DNN Body Tracking sdk BEFORE having a Kinect2 compatible SDK.

Over time, we could have gradually adapted our software, and let clients that can afford it use the new DNN body tracking.... without compromising existing clients.

So, regardless of what other trained models and technologies are provided, Kinect4A needs a trained model equivalent in features and performance to what the Kinect2 already delivered. This is what a lot of people in this forum have been asking for all along.

Summing up:

  1. Resource hogging GPU - this is also a compute issue. We agree we need a lighter algorithm but (as pointed out by some maybe not light enough). The Direct ML path will also open the door to VSPs i.e. Myriad X and its next generation. What is too much GPU?

Too much GPU is anything more than what the Kinect2 required.

Consider the scenario Dotslinker has explained: Home users.

Home users are not going to buy a new computer just for the Kinect4A. Most of them will expect Kinect2-equivalent hardware. So forget about Myriad X, VSPs, and gaming GPUs; those correspond to labs and hackers, not to the real world.

Chris45215 commented 4 years ago

Latency - this is a compute issue. What sort of latency do you think is acceptable from enqueuing the frame to popping the result?

I disagree that it is a compute issue; based on my tests it appears to be a bug. My data from processing a saved video showed a computation time of ~22ms per frame (RTX 2070 Super), so the latency between enqueuing a frame and getting the result should be about 22ms. But my tests with live video showed that, even when running at a mere 5FPS, the system was returning results from 3 frames prior (with Microsoft's suggested code) or 2 frames prior (with my improvement on it). It sits on even more frames when running at 30FPS.
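As an aside, the enqueue-to-pop delay is easy to instrument. Below is a minimal harness (plain Python; the `enqueue`/`pop` pair here is an illustrative stand-in, not the SDK's real API, which would be the tracker's enqueue-capture and pop-result calls): timestamp each frame at enqueue, keyed by frame id, and look the timestamp up again when that frame's result is popped.

```python
import time
from collections import deque


class EchoTracker:
    """Trivial stand-in tracker whose results are ready immediately;
    with the real SDK, the same bookkeeping would wrap the
    enqueue-capture / pop-result calls instead."""

    def __init__(self):
        self._q = deque()

    def enqueue(self, frame_id):
        self._q.append(frame_id)

    def pop(self):
        # non-blocking: oldest ready result, or None
        return self._q.popleft() if self._q else None


def enqueue_pop_latencies(tracker, n_frames):
    """Record wall-clock time at enqueue and measure the delay when
    that frame's result is popped. Returns {frame_id: seconds}."""
    enqueued_at, latencies = {}, {}
    for i in range(n_frames):
        enqueued_at[i] = time.perf_counter()
        tracker.enqueue(i)
        result = tracker.pop()
        if result is not None:
            latencies[result] = time.perf_counter() - enqueued_at[result]
    return latencies
```

With the real tracker, a popped frame id that consistently trails the just-enqueued id is exactly the multi-frame backlog described above, and the per-frame timestamps separate queueing delay from pure compute time.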

Anyway, the entire thread is getting pretty bulky and unwieldy. That's inevitable, as it's not clear whether the body tracking latency is caused by delays in the camera, the depth SDK, or the body tracking SDK; but we should make new threads for separate issues. I would definitely say that the GPU requirement is worth its own thread - Microsoft made a major design error by requiring the user to have a powerful GPU rather than incorporating the CUDA cores and about 1GB of RAM into the camera, especially because any GPU with the requisite CUDA cores has at least 8GB RAM and other components that body tracking doesn't use, and thus the discrete GPU costs far more than it would cost to integrate the needed components into the camera. But that's an issue for CameraV2 to (hopefully) solve, if Microsoft decides to continue the product.

billpottle commented 4 years ago

As far as accuracy, our main issue is when legs are higher than the waist or moving quickly, i.e. with kicks. This has come up in a separate thread before (#738). You can see an example here: https://www.my-ai-coach.com/t-c5-1586983138 - go to the wireframe below and adjust it; it looks very little like the video. In contrast, you can see much higher accuracy with slow-moving hand techniques: https://www.my-ai-coach.com/t-c13-1586982826

Chris45215 commented 4 years ago

@dotslinker in your case, if you want to use something different from the Kinect v2 (which is a perfectly good sensor), I would suggest looking at the Orbbec Astra, or Intel's RealSense (which will require Nuitrack's software for body tracking). These are both low-footprint options in terms of code size, physical size, and computer requirements. The only caveat is that Orbbec's software can be a bit janky if you try to compile the source itself; I believe the last time I tried it, I discovered that Visual Studio was referencing file locations that existed only on their developers' computers (there is probably some easy solution for this, I've just not needed to go through it). I have been very impressed by Orbbec's offerings; and while their code isn't perfectly compatible with the Kinect v2 or K4A SDK libraries, it returns results that are very similar, and thus it is relatively easy to adapt your code to accommodate Orbbec's data format. These sensors also have the benefit of being cheaper, and powered entirely over USB. Also, Orbbec has a stereo camera version of its sensor which - while it does not track quite as well as the ToF version - allows multiple stereo sensors to be used without mutual interference. Though I've not programmed multiple Orbbec sensors to work simultaneously with one program, so I can't promise that there won't be some kind of resource conflict.

K4A's advantages are that it has the most reliable and accurate body tracking, even if it does have high latency, and that it most easily allows multiple sensors to work together without interference. The latter is somewhat negated by the GPU requirement, as the most powerful GPUs on the market can only handle body tracking from 2, or at most 3, sensors simultaneously. But it's a feature.

NickAtMixxus commented 4 years ago

Agree with many of the recent points that have been made here by @fractalfantasy, @vpenades, @dotslinker and others. @qm13: thanks for the detailed update. Regarding "What sort of latency do you think is acceptable from enqueuing the frame to popping the result?" - I can only speak for my needs, but for MixxusStudio interactive exercise and rehab apps/games I'd say the Kinect V2 latency is quite OK, though it would not hurt if it were slightly better. In my experience the precision is also fine if the whole body is completely within its frustum, or "visual field".

For interactive apps I would compare the need for short latency to playing virtual instruments. The real world is not instant; there is a slight inherent "latency" in an object's weight or inertia. As a comparison: for the experience of naturally playing an instrument, the round-trip latency needs to be around 20 ms or less - for some players up to around 30 ms is acceptable, but not much more; above that it starts to feel unnatural, and most won't want to go above 10-20 ms. That is measured from pressing the key to hearing the sound. To get a natural visual feeling of moving a virtual hand or interacting with virtual objects, the latency should ideally be somewhere in that range. (Our brains are somewhat slower to react to visual stimuli than to audio, so that may make us more sensitive to delayed visual stimuli, but I'm speculating now.)

What the Kinect for Azure is capable of regarding latency I don't know, but if you could make it run like the Kinect V2, then please provide such an option. I thought the Random Forest idea sounded interesting: give us Kinect V2 without the bulky cables. A kind of Easy Mode option, so it can be available to a lot of people (without heavy systems) who really can make use of it. And if that would be fairly easy to implement, perhaps we could get it soon? If I may add, I really wish we could get a fast Kinect soon, because it's hard to continue developing when K2 is not really an option anymore and you don't know what will become of K4.

qm13 commented 4 years ago

Thanks for all the responses. @cavisions yes, body tracking does not do well with kids under 8 years. This is primarily to do with body proportions, which do not become adult-like until around 8 years old (this according to a doctor we consult with). We are data limited: it is very hard to get captures of 2-7 year old kids.

@fractalfantasy whilst I agree regarding the K4W latency (being about 1/2 of AK), the other two you list are not body tracking use cases. They are SLAM use cases, where you are predominantly reading an IMU and using vision to correct for IMU drift. K4W required the GPU too: the Xbox One reserved >10% of the GPU to run Kinect even if the game did not use Kinect. This was the heart of the PS4 vs One debate that raged after the game consoles launched. That said, I get that you want 80+% of the GPU for graphics.

@dotslinker thanks for the excellent story behind your use case. We are aware of a number of companies developing in-home rehab systems that have seen an uptick in interest in the last couple of months. To my VSP point: these companies are developing take-home hardware that plugs into the TV. The expectation is that patients will be loaned hardware for the duration of rehab.

@PierrePlantard yes, you should be able to track a person from the back with the lite model.

@vpenades sorry but there are no plans to provide backwards compatibility for K4W APIs. Assuming we do add support for random forest it will be through the current AK APIs.

@billpottle we have added high kicks (and a few other extreme positions) to the training and testing data sets. Thank you for your contribution here. The high-speed motion is more problematic and is not a body tracking issue per se. The Azure Kinect camera's exposure time is longer than the K4W's, resulting in more motion blur. This was a HW design trade-off that we hope to fix in the next generation of sensors.

@Chris45215 there is a separate GitHub issue tracking device and Sensor SDK latency here: #816. We realize that there are alternatives to the Azure Kinect and its SDKs. If the Azure Kinect currently does not work for you, then you should absolutely look at alternative technologies. The Azure Kinect team is only at the start of its journey to democratize depth cameras.

qm13 commented 4 years ago

@NickAtMixxus sorry, I missed you in the reply. Thanks for the feedback. Regarding the sort of latency you mention for playing a simulated instrument: these sorts of latencies are achieved with scenario-specific hardware and tightly coupled software. Two examples are the Xbox Controller and HoloLens, both of which have E2E latencies of <10ms, made possible by magic. That said, the body tracking team definitely wants to beat dad.

fractalfantasy commented 4 years ago

@qm13 thanks for the corrections - I've updated the expectation list below.

K4A Body Tracking Expectations:

1. FPS - expected at 30FPS on a modern Intel CPU like Kinect v2
2. LATENCY - expected to match or beat the latency of Kinect v2 (0.12s)
3. CROSS-PLATFORM - expected to run on any ~$1000 laptop from the last 5 years like Kinect v2
4. RESOURCE HOGGING GPU - expected to only take up 10-20% of GPU like Kinect v2
5. ACCURACY - expected to maintain accuracy during movement like Kinect v2
6. POWER EFFICIENCY - expected to run passively w/ reasonable power consumption like Kinect v2


TL;DR Beat dad!! & provide random forest option in the meantime.

vpenades commented 4 years ago

@qm13

sorry but there are no plans to provide backwards compatibility for K4W APIs. Assuming we do add support for random forest it will be through the current AK APIs.

Ah, I didn't mean the APIs had to be the same; I was talking more in terms of what to expect from the SDKs. We don't mind the APIs changing between versions. In fact, I very much appreciate that the new Sensor and Body Tracker SDKs are separate APIs.

My point is that the Kinect4A should have shipped with a body tracker equivalent to what the Kinect2 delivered, in terms of precision and performance. Let's call it _K4ALegacyK2BodyTracker. That way, for a while, and until hardware catches up, the Kinect4A could be used as a replacement for the Kinect2 without requiring an upgrade of the rest of the hardware.

This would have allowed more people to adopt the K4A camera early, while giving you more time to mature the new DNN model, without the pressure to deliver something in a hurry.

Consider that currently many of our clients are running the Kinect2 with a Core i5 2.9GHz, in many cases with built-in Intel HD graphics. I understand the new "Lite" model will not work with this hardware setup.

dotslinker commented 4 years ago

@qm13 I absolutely subscribe to the last posts of @vpenades and @fractalfantasy. I think we all need a lighter version of the current Body Tracking SDK, usable with a "standard" PC not equipped with a GTX 1070 or above.

One question: while we wait for a less GPU-intensive option in the next K4A SDKs, would it be an option to use something like an Nvidia Jetson Nano as a "decoder" of the body joint positions (i.e. doing most of the DNN-related processing)? There used to be a thread (#871) regarding the ARM compatibility of the source code, but it's closed and I don't know what the development state is...

qm13 commented 4 years ago

@fractalfantasy thanks for the nice summary.

@dotslinker there are two parts to 3D body tracking - the DNN (GPU) and human model fitting (CPU) - and neither is performant from the team and community perspective. Splitting them does not help. That said, we are working on an ARM implementation of the body tracking SDK.

fractalfantasy commented 4 years ago

@fractalfantasy thanks for the nice summary.

No problem, do you have an update on the random forest option?

Would be great to know if this request has been taken seriously, or if we'll have to switch sensors.

shevart75 commented 4 years ago

2 weeks later, no movement at all... I guess Microsoft is already planning to abandon the Azure Kinect, just like they did with Kinect v2. The story repeats... Microsoft makes great hardware but devalues it with bad software, lack of support and stupid management decisions.

gradientLord commented 4 years ago

@qm13 @wes-b can you please give us an update on the random forest option?

And if not... please let us know how you plan to beat Kinect 2 performance, efficiency, and dynamic accuracy soon.

shevart75 commented 4 years ago

The issue has persisted for almost one year. Sooner or later a better solution will be (or already is?) provided by competitors...

Chris45215 commented 4 years ago

As far as I can see, the Azure Kinect team has yet to say whether they have been able to reproduce the latency that I demonstrated in https://youtu.be/7Jc7KhoPWdc. I hope that the latency lies entirely within the body tracking SDK, as the depth sensor latency thread was just closed. If they modify their example code so that it outputs a text line after every enqueue and also after every dequeue, I think they will find that it enqueues several frames before it dequeues any results.

The lack of any explanation of why the body tracker returns stale results rather than the freshest ones is not encouraging. The video demonstrates it very well. I found a workaround that partially improves it, by dequeuing twice at the start, though that workaround still has not been added to the documentation at https://docs.microsoft.com/en-us/azure/kinect-dk/get-body-tracking-results (so the "real time" example is unnecessarily far behind real time, and it does not drop a frame that it should drop). I hope the issue can be investigated and resolved at some point.

Also, I suggest that anyone wishing to use the sensor for real-time body tracking set it to 15FPS rather than 30FPS, as our tests suggest there is less latency at 15FPS. I could attempt to measure and quantify the latency at those framerates, as we may be incorrect, but we have already committed a lot of time to the issue.

Perhaps I should add that the solution to all of this, from the start, was to integrate the CUDA cores and about 1.5GB of RAM into the sensor. Maybe that would exceed the thermal budget, but the more direct route to the neural network would also simplify debugging. The sensor might cost more with that integration, but as-is the system requires a discrete GPU just to use its CUDA cores. I would happily lose the microphones in exchange for integrated processing.

fractalfantasy commented 4 years ago

The issue has persisted for almost one year. Sooner or later a better solution will be (or already is?) provided by competitors...

@shevart75 The new Intel L515 sensor may be comparable to the Azure Kinect, but I'm not sure Intel/Cubemos body tracking is that much of an improvement. I've already checked, and getting third-party body tracking to work with our current architecture is also pretty much impossible.

Bottom line: we've sunk thousands of dollars into Azure Kinect sensors for our dev team, expecting the SDK would be equivalent to or better than its predecessor, and we'll have to return them if we end up switching to a competitor. I'd really hate to abandon this sensor because of such an easy-to-fix problem.

Right now our application is half-built, awaiting a hero from the body tracking team to spend an afternoon implementing the old model. I'm still optimistic that the Azure Kinect DK project leaders value the democratic spirit of open source (this is the most asked-for body tracking feature, after all), so I haven't entirely given up hope yet.

shevart75 commented 4 years ago

"... awaiting a hero from the body tracking team to spend an afternoon implementing the old model." 👍 But they already clearly answered: we considered it, but we will stick to the slow-working DNN solution, just because we spent too much time developing it. So they would rather spend one more year on the DNN instead of "an afternoon" on the old model, just like that... Oh, dear...

shevart75 commented 4 years ago

Look what we see... qm13 wrote: "whilst I agree regarding the K4W latency (being about 1/2 of AK)..." and "The Azure Kinect camera's exposure time is longer than the K4W's, resulting in more motion blur. This was a HW design trade-off that we hope to fix in the next generation of sensors"... So, in other words: K4W performed better, but we had to drop it for Azure, which is worse but costs more, and we already have to think of the next gen... draw your own conclusions...

fractalfantasy commented 4 years ago

@shevart75 I see what you’re saying, but 2 somewhat recent quotes from @qm13 lead me to believe they’re considering the random forest option:

“Regarding providing a random forest option we will investigate what it will take to port the model to Azure Kinect.”

“Assuming we do add support for random forest it will be through the current AK APIs.”

dotslinker commented 4 years ago

I have been working for a while with the RealSense D435 and Cubemos. Honestly, not the easiest library to work with, but the Cubemos guys are kind and prompt in answering all the technical questions one may encounter (from setting up the programming environment to playing with the source examples). The library's performance can be adjusted to the customer's needs by changing the number of nodes used for processing. The higher the number of nodes, the more accurate the measures (and the more stable the reconstruction of the coordinates), and (of course) the lower the fps. The small tests I have been doing seem encouraging, although a bit noisier and with fewer joints (18) than the Azure Kinect (32).

Pros: usable now and working with reasonable hardware (the Cubemos library seems to exploit the GPU integrated in the Intel CPU rather than relying on an external GPU, unlike the current Azure Kinect DK).

Cons: fewer joints, and measures a bit "noisy" (especially with the 128-node NN).

I do hope there will be some update of the current Azure Kinect DK...

@qm13: checking the load of the current body tracking DK executable example on my laptop (with a GTX 1650 GPU - GB VRAM), I was surprised to see that it mostly uses the Intel integrated GPU rather than the discrete GPU... is that normal? Is there a way to force the processing to be performed on the discrete GPU?

qm13 commented 4 years ago

@dotslinker sorry for the tardy reply. That is not expected: ONNX Runtime v1.2 only uses the CPU or an NVIDIA GPU. Are you sure that the Intel GPU utilization is coming from body tracking?

qm13 commented 4 years ago

All, the team has just completed planning for the next semester. We revisited supporting random forest and concluded that it does not fit with the direction the team is taking body tracking. Instead we are focusing on a more performant DNN.

Chris45215 commented 4 years ago

@qm13, I may be the only person here satisfied with that answer. Has anyone at Microsoft tried to reproduce the test I posted at https://youtu.be/7Jc7KhoPWdc yet? Considering that I found that Microsoft's example code for real-time, lowest-latency-possible output has an error that causes it to buffer an extra frame, and thus always be one frame further behind than it could be - and the example code has yet to be updated with a correction - I question the testing (if any) of body tracking latency.

I am still of the opinion that either the DNN is designed to withhold the result until it gets later frames (thus returning a 'stale' result), or some part of the firmware or SDK is delaying frames when it shouldn't. I could partially test this by creating 'artificial' frames (a saved video+depth feed, edited and re-arranged by hand) to feed into the DNN, which would let me more easily catch it delaying frames if the delay is on the DNN side. This would take far more than a reasonable amount of my time, and I don't think the editing toolset for it even exists, so I would need to create it. If my suspicions are correct, it points to a much easier route to greatly improving the latency.

Beyond that, I think we all know what the real, long-term solution must be: integrate the CUDA cores and about 1GB RAM into the sensor.

gradientLord commented 4 years ago

this is fine

Chris45215 commented 4 years ago

I created issue #1253 to suggest integrating the body tracker into a v2 of the sensor. That would solve all of these problems, even if it solves them by raising the price of the sensor; I'd pay $4,000 per sensor if it had integrated body tracking and 50 ms body tracking latency.

https://github.com/microsoft/Azure-Kinect-Sensor-SDK/issues/1253

Beyond that, I stand by my statement that the current Nvidia-GPU-dependent approach is fine for me IF the latency issue can be fixed. My GPU processing time is 22 ms per frame, but that does not square with the real-world measured (and demonstrated) body tracking latency.

PierrePlantard commented 4 years ago

@qm13, thank you for the update. For us, IF the new DNN version lets us run on a consumer-grade laptop (without an Nvidia GTX GPU) while maintaining the accuracy of the current DNN, that's fine! Do you have an expected release date for this new version?

knat commented 4 years ago

I'm using an RTX 2060 Super with a single Kinect; the frame rate is almost 30 FPS and the latency is acceptable.

bastiankayser commented 4 years ago

@qm13 Thank you for the update and the transparency. Nevertheless, I must say that I am deeply disappointed by this decision. You have a dedicated community begging on every possible channel for functionality that was already implemented years ago, because without it a lot of interesting and serious applications, especially in the health sector, are not feasible. And your (the development team's) answer is: "Sorry, we want to play with the current hot stuff", namely DNNs. I get it, it is a fascinating technology, but you are ignoring all the people who actually want to use your sensor for something useful that could have a positive impact on many people's lives. Please rise to your responsibility as the technology leader in this space that you still are, and give people what they want and what they had before: fast, reliable body tracking on low- to mid-level hardware. Thank you.

fractalfantasy commented 4 years ago

@qm13 @wes-b Please mark the feature request as ‘unplanned’, to show the dev community how much you value their input.

This is the most-discussed issue and the second most requested feature; porting the option over would be a breeze for any decent developer.

rfilkov commented 4 years ago

If there is nothing else to add here, I'd like to close this very long and fruitless thread. A year later, it is also a bit too late to start implementing the most wanted (or any other) feature requests.

Chang-Che-Kuei commented 3 years ago

My GPU is Nvidia RTX 2060. It runs at about 30 FPS.

vpenades commented 3 years ago

@Chang-Che-Kuei Developers switching from Kinect2 to Kinect4A expect similar performance on similar hardware, which is not the case with Kinect4A.

fractalfantasy commented 3 years ago

They promised a lighter, more performant DNN model that works across different GPU manufacturers, and then completely abandoned development.

RoseFlunder commented 3 years ago

It is also a bit worrying that there is still no release of the body tracking SDK compatible with the new NVidia 3000 series, and it's been a known issue since October: https://github.com/microsoft/Azure-Kinect-Sensor-SDK/issues/1125

Nor have they ever released Ubuntu binaries for a body tracking SDK compatible with sensor SDK 1.4.

vpenades commented 3 years ago

@fractalfantasy @RoseFlunder As far as I know, they were still working on it with hopes of a late-2020 release, but it seems it's been delayed. Given Microsoft's track record of announcing project cancellations, though, it may very well have been silently canceled.

Anyway, on our side, we moved on to look for alternate solutions that require neither this body tracking nor the K4A camera. We simply could not afford to wait a whole year for a solution.

fractalfantasy commented 3 years ago

@vpenades may I ask if you found any alternate solutions? I was hoping that maybe Nuitrack would support the Azure Kinect, but that doesn't seem to be the case.

I looked into other sensors but couldn't find any competitor with good resolution... Intel's sensors aren't even comparable to Kinect-v2.

vpenades commented 3 years ago

@fractalfantasy I can't comment, sorry. But our solution is tailored to our use case.

It's still not as good as what the Kinect2 delivered, though, so if the Kinect team finally ships something that really improves (a lot) on what's currently available, then we might switch back to it.

rfilkov commented 3 years ago

@fractalfantasy There are not many options: Intel RealSense, Orbbec (their new sensor looks promising), or Apple's iPad-Pro/iPhone-Pro (with some limitations). Kinect-v2 is still the best one out there. Unfortunately, MS is famous for ruining or canceling its best products as a result of not listening. Déjà vu.

L4Z3RC47 commented 3 years ago

I would be surprised if the Azure Kinect project was cancelled. MS has been active on responding to many issues as recently as a day or two ago. Though the silence on issue #1125 is frustrating...

rfilkov commented 3 years ago

@L4Z3RC47 Yes, they provide some minimal customer support. But look at the development progress here: https://feedback.azure.com/forums/920053-azure-kinect-dk. It hasn't moved an inch in a year.

vpenades commented 3 years ago

@rfilkov It's not only the feedback page that hasn't moved: there hasn't been a single commit to the main repository since 1 July 2020. And I'd be surprised if they consider the driver side "completed".

gradientLord commented 3 years ago

@L4Z3RC47 They answer these forum posts so that senior management thinks they are being active, and then go off and play video games.

vpenades commented 3 years ago

I don't think that's the case... Nevertheless, if there have been delays in delivering, they could at least be mitigated with better communication.

fractalfantasy commented 3 years ago

@rfilkov Wow, this new Orbbec sensor looks like an Azure Kinect; I wonder how it will compare resolution-wise.


andrey-tsb commented 2 years ago

> @vpenades may I ask if you found any alternate solutions? I was hoping that maybe Nuitrack would support the Azure Kinect, but doesn't seem to be the case.

If you are interested, Nuitrack currently supports Kinect Azure and provides fast CPU-only skeletal tracking with it.