microsoft / Azure-Kinect-Sensor-SDK

A cross-platform (Linux and Windows) user-mode SDK to read data from your Azure Kinect device.
https://Azure.com/Kinect
MIT License

Please don't make executables for the SDK that are nothing more than drivers. #1422

Closed ChemicalNRG closed 3 years ago

ChemicalNRG commented 3 years ago

The Azure Kinect is the worst "developer/tinkerer/new on the market" product experience I've ever had. First I thought it was nice that there was an installer for Windows. But then I found out, after days of trial and error (with different programs/compilers), that it is just an incomplete package, so no SDK at all.

Then, after downloading the complete/real SDK and figuring out the best way to compile it, at first I couldn't locate the output files because of the strange folder structure after compiling. It's an example in a bundle inside another bundle, if I have to describe the structure. In which folder the compiled exe ends up, or which things I have to compile before I can compile the exe, was completely unclear to me. Even now that I know where it is in the folders, I can't find it on the first try.

I did know that it is not a "normal/finished" product, but since I love to experiment with electronics and had seen multiple examples (some exactly what I was looking for), I wanted to go for it.

Also, after recording two MKVs for mapping in a factory, I found out that:

  1. MKV 1: after the recording ended abruptly (battery empty), the MKV is useless except for extracting depth and color images with ffmpeg. MKVToolNix can't do anything with it. The only program that can read any info from it is ffmpeg/ffprobe. But copying the tracks to another container doesn't restore the info (duration, framerate, metadata) that other programs need to read the file properly. For all other programs it's just an empty file.
  2. MKV 2: after recording just six and a half minutes, only k4aviewer can handle it (and I can watch the color video with VLC and Media Player Classic). One of the examples (transformation) could only handle the first xxxx frames. The others couldn't do anything with it.

I know, "they are examples, you cant expect anything from it", but i just wanted to share my experience with it, AND: thinking that the executable sdk was the safest bet, was the worst call.

RoseFlunder commented 3 years ago

I don't really understand what you mean by the installer being only a driver. The installers you can download contain everything you need to develop an application that uses the Azure Kinect: https://github.com/microsoft/Azure-Kinect-Sensor-SDK/blob/develop/docs/usage.md#msis

It even states where the files are located:

The installer will put all the needed headers, binaries, and tools in the location you choose (by default this is C:\Program Files\Azure Kinect SDK version\sdk)

The same installer is linked on the Microsoft website: https://docs.microsoft.com/en-us/azure/kinect-dk/set-up-azure-kinect-dk#download-the-sdk

So if you download "Azure Kinect SDK 1.4.1.exe", for example, and install it, you don't just get tools like k4aviewer and k4arecorder, but also the headers and library files you need to compile your own applications.

Alternatively, Microsoft also describes how to develop with Visual Studio and a NuGet package: https://docs.microsoft.com/en-us/azure/kinect-dk/build-first-app https://docs.microsoft.com/en-us/azure/kinect-dk/add-library-to-project

There is no need to clone this repository and compile the SDK yourself if you don't need specific changes to the SDK/API itself.
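For example, a minimal sketch like this (just an illustration, not an official sample; error handling trimmed) should build against nothing but the headers and libs from the installer:

```cpp
// Minimal sketch: open the device, start the cameras, grab one capture.
// Link against k4a.lib from the installed SDK; at runtime k4a.dll and
// depthengine_2_0.dll must be next to the executable or on the PATH.
#include <k4a/k4a.h>
#include <stdio.h>

int main()
{
    k4a_device_t device = NULL;
    if (k4a_device_open(K4A_DEVICE_DEFAULT, &device) != K4A_RESULT_SUCCEEDED)
    {
        printf("Failed to open device\n");
        return 1;
    }

    k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
    config.color_format = K4A_IMAGE_FORMAT_COLOR_BGRA32;
    config.color_resolution = K4A_COLOR_RESOLUTION_720P;
    config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED;

    if (k4a_device_start_cameras(device, &config) == K4A_RESULT_SUCCEEDED)
    {
        k4a_capture_t capture = NULL;
        if (k4a_device_get_capture(device, &capture, 1000) == K4A_WAIT_RESULT_SUCCEEDED)
        {
            k4a_image_t depth = k4a_capture_get_depth_image(capture);
            if (depth != NULL)
            {
                printf("Depth image: %d x %d\n",
                       k4a_image_get_width_pixels(depth),
                       k4a_image_get_height_pixels(depth));
                k4a_image_release(depth);
            }
            k4a_capture_release(capture);
        }
        k4a_device_stop_cameras(device);
    }

    k4a_device_close(device);
    return 0;
}
```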

ChemicalNRG commented 3 years ago

Try to compile the transformation or OpenCV example after that install and you'll know what I mean. Maybe you have done a lot of compiling; I haven't, and I don't think I'll be the only one. I have compiled a few times before, but then it was either clear and everything was complete, or it was clear that I had to get extra files to make things work. With this SDK, not at all, and by the time I learned that, it was hours/days later. Looking at it from the bright side, I have learned a lot about compiling because I have tried it so much over the past days; I now even know some of the errors that can occur.

RoseFlunder commented 3 years ago

Well, examples that use OpenCV obviously only work when you add the OpenCV dependency.

The SDK itself doesn't include OpenCV in its installer, and it shouldn't, because OpenCV is a general computer vision library that isn't from Microsoft. The SDK contains an API and implementation for configuring, starting, and retrieving data from the k4a sensors, as the name "Sensor SDK" already hints. In that sense the SDK itself is complete: you can develop your own applications that configure, start, and retrieve data from the k4a. What you do with that data is your own responsibility. When you want to display it on screen, you need additional code and/or dependencies, of course. If you want to use it in combination with OpenCV, the examples show that it's possible, but of course you have to add OpenCV for that as well.

But you can also develop applications that:

  1. retrieve the capture
  2. process it with the body tracker
  3. send the body tracking results somewhere or store them in a file

An application like that doesn't need anything to display images on screen and doesn't need anything from OpenCV, just the Sensor SDK + Body Tracking SDK; a rough sketch follows below.
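Something like this (a simplified sketch assuming the separately installed Body Tracking SDK, k4abt; most error handling omitted):

```cpp
// Minimal sketch of the capture -> body tracker -> output pipeline.
// Requires both the Sensor SDK and the Body Tracking SDK (k4abt).
#include <k4a/k4a.h>
#include <k4abt.h>
#include <stdio.h>

int main()
{
    k4a_device_t device = NULL;
    k4a_device_open(K4A_DEVICE_DEFAULT, &device);

    k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
    config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED; // body tracking needs depth
    k4a_device_start_cameras(device, &config);

    k4a_calibration_t calibration;
    k4a_device_get_calibration(device, config.depth_mode, config.color_resolution, &calibration);

    k4abt_tracker_t tracker = NULL;
    k4abt_tracker_configuration_t tracker_config = K4ABT_TRACKER_CONFIG_DEFAULT;
    k4abt_tracker_create(&calibration, tracker_config, &tracker);

    // 1. retrieve a capture
    k4a_capture_t capture = NULL;
    if (k4a_device_get_capture(device, &capture, 1000) == K4A_WAIT_RESULT_SUCCEEDED)
    {
        // 2. process it with the body tracker
        k4abt_tracker_enqueue_capture(tracker, capture, K4A_WAIT_INFINITE);
        k4a_capture_release(capture);

        k4abt_frame_t body_frame = NULL;
        if (k4abt_tracker_pop_result(tracker, &body_frame, K4A_WAIT_INFINITE) == K4A_WAIT_RESULT_SUCCEEDED)
        {
            // 3. send the results somewhere or store them in a file
            printf("Detected %u bodies\n", k4abt_frame_get_num_bodies(body_frame));
            k4abt_frame_release(body_frame);
        }
    }

    k4abt_tracker_shutdown(tracker);
    k4abt_tracker_destroy(tracker);
    k4a_device_stop_cameras(device);
    k4a_device_close(device);
    return 0;
}
```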

I think you just misunderstand what an SDK should contain. It is not meant to provide you with APIs for third-party libraries. You are just a bit confused because some examples make use of third-party libraries and show one way to interact with them after retrieving data from the Sensor SDK.

The transformation example uses libjpeg-turbo, a third-party library, to decompress JPEG images when loading an MKV file instead of using live captures from a device. But as I said, libjpeg-turbo is a third-party library that's not from Microsoft and is not necessary for every k4a application. The example just shows how to interact with libjpeg-turbo in that scenario.
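Roughly, that decompression step looks like this (a simplified sketch, not the example's exact code; the helper name is made up):

```cpp
// Minimal sketch: decompress an MJPG color image from a recording into BGRA
// with libjpeg-turbo, roughly what the transformation example does.
// Error handling is trimmed; buffer ownership is simplified.
#include <k4a/k4a.h>
#include <turbojpeg.h>
#include <vector>

// `color_image` is assumed to be a K4A_IMAGE_FORMAT_COLOR_MJPG image
// obtained from playback; `decompress_to_bgra` is a made-up helper name.
std::vector<uint8_t> decompress_to_bgra(k4a_image_t color_image)
{
    int width  = k4a_image_get_width_pixels(color_image);
    int height = k4a_image_get_height_pixels(color_image);

    std::vector<uint8_t> bgra(static_cast<size_t>(width) * height * 4);

    tjhandle decompressor = tjInitDecompress();
    tjDecompress2(decompressor,
                  k4a_image_get_buffer(color_image),
                  static_cast<unsigned long>(k4a_image_get_size(color_image)),
                  bgra.data(),
                  width,
                  0 /* pitch 0: use width * bytes-per-pixel */,
                  height,
                  TJPF_BGRA,
                  TJFLAG_FASTDCT);
    tjDestroy(decompressor);
    return bgra;
}
```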

Edit: Instead of decompressing the image with libjpeg-turbo, you could also rewrite that part to use the color conversion feature of the SDK itself: https://microsoft.github.io/Azure-Kinect-Sensor-SDK/master/classk4a_1_1playback_a8566e35942f2d62be8b1e62214373b38.html#a8566e35942f2d62be8b1e62214373b38

That way you get BGRA color images directly and don't have to decompress the JPEG images yourself.
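In the C API that rewrite would look roughly like this (sketch only; the link above documents the corresponding C++ playback method, and the file name here is illustrative):

```cpp
// Minimal sketch: let the playback API convert color frames to BGRA32,
// so no manual JPEG decompression is needed.
#include <k4a/k4a.h>
#include <k4arecord/playback.h>

int main()
{
    k4a_playback_t playback = NULL;
    if (k4a_playback_open("recording.mkv", &playback) != K4A_RESULT_SUCCEEDED)
        return 1;

    // Ask the SDK to hand out color images as BGRA32 instead of MJPG.
    k4a_playback_set_color_conversion(playback, K4A_IMAGE_FORMAT_COLOR_BGRA32);

    k4a_capture_t capture = NULL;
    while (k4a_playback_get_next_capture(playback, &capture) == K4A_STREAM_RESULT_SUCCEEDED)
    {
        k4a_image_t color = k4a_capture_get_color_image(capture); // already BGRA32
        if (color != NULL)
            k4a_image_release(color);
        k4a_capture_release(capture);
    }

    k4a_playback_close(playback);
    return 0;
}
```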

Edit: When you want to develop your own applications that use the k4a, you definitely shouldn't clone this repository and develop them inside its structure. That's not what this repository is for. You should always set up your own bare project and then add the required dependencies: the SDK + whatever third-party libraries you want to use.

ChemicalNRG commented 3 years ago

I know that; I have it all working now by git cloning the SDK. It even copied all the necessary OpenCV DLLs to the example folder. Did you at least try the installer and compile the transformation and other examples?

I did not know that transformation used a different method. I thought it used Microsoft's recorder, so I also thought that the recorder used libjpeg-turbo. Thanks for the info. Doesn't Microsoft use any of this third-party software itself, like libsoundio and libusb?

Edit: I made a mistake; it was the git repo Azure Kinect Samples that didn't work (and is a mess) because of missing CMake files or additional dependencies, after following the steps from the "opencv kinfu samples". It did work when I git cloned the Kinect Sensor SDK. OpenCV and VTK were already installed.

Someone who is new to all this would assume that the exe/installer gives you the same set of files as git cloning the Sensor SDK. That is not the case, which is why I didn't like the installer: it is incomplete compared with a git clone of the SDK.

So why not include all the files in the installer? Or why include the depth engine in the installer but not in the git repo? Why is the depth engine in the exe while Linux users have to copy it from a folder? Why not provide good, standalone build files for the examples when you make a separate repo like Azure Kinect Samples?

Edit 2: A lot of the errors in Azure Kinect Samples didn't even have to do with the example I wanted to compile, I think.

RoseFlunder commented 3 years ago

The workflow is not to clone this repo. The examples are not part of the SDK itself; they just show how to use the SDK, and they also use other third-party libraries, which you should add to your own projects if you want to use them as well. There is nothing wrong with using third-party libraries to show some example use cases.

I already explained what the SDK contains and what it is for: an API and implementation to configure & start the k4a, get its sensor data (camera captures and IMU data), and record/play back that data.
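The recording part alone, for example, looks roughly like this (a simplified sketch; file name illustrative, error handling trimmed). Note that the final MKV metadata (duration etc.) is only written when the recording is flushed and closed cleanly, which is likely why a recording that dies mid-write (battery empty) ends up without usable duration/metadata:

```cpp
// Minimal sketch: record captures to an MKV with the SDK's k4arecord API
// (the same capability the k4arecorder tool builds on).
#include <k4a/k4a.h>
#include <k4arecord/record.h>

int main()
{
    k4a_device_t device = NULL;
    k4a_device_open(K4A_DEVICE_DEFAULT, &device);

    k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
    config.color_format = K4A_IMAGE_FORMAT_COLOR_MJPG;
    config.color_resolution = K4A_COLOR_RESOLUTION_720P;
    config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED;
    k4a_device_start_cameras(device, &config);

    k4a_record_t recording = NULL;
    k4a_record_create("output.mkv", device, config, &recording);
    k4a_record_write_header(recording);

    for (int i = 0; i < 100; i++) // record ~100 captures
    {
        k4a_capture_t capture = NULL;
        if (k4a_device_get_capture(device, &capture, 1000) == K4A_WAIT_RESULT_SUCCEEDED)
        {
            k4a_record_write_capture(recording, capture);
            k4a_capture_release(capture);
        }
    }

    // Flushing and closing cleanly is what finalizes the MKV metadata.
    k4a_record_flush(recording);
    k4a_record_close(recording);
    k4a_device_stop_cameras(device);
    k4a_device_close(device);
    return 0;
}
```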

Although Microsoft uses third-party libraries internally in the SDK, like libusb for example, it's completely uncommon to deliver headers and lib files for them to SDK users. The internally used libraries are statically linked, which means that all the routines used from those third-party libraries are copied into the resulting SDK binaries. But it's common practice to make only your own API publicly visible, not the APIs of any third-party libraries your code uses internally.

When I want to get the transformation example running, which uses libjpeg-turbo as a third-party lib, I do the following steps on Windows with Visual Studio when not using NuGet packages (and yes, I tried it myself):

  1. Download and install the Sensor SDK using the regular installer, e.g. "Azure Kinect SDK 1.4.1.exe"
  2. Download the libjpeg-turbo binary installer ("libjpeg-turbo-2.0.6-vc64.exe") from the official libjpeg-turbo website: https://libjpeg-turbo.org/Documentation/OfficialBinaries
  3. Create a new empty Visual C++ project with Visual Studio
  4. Copy just main.cpp, transformation_helper.cpp, and transformation_helper.h from the transformation example into your new project
  5. Add the "include" folders from the installation directories of the SDK and libjpeg-turbo (steps 1 and 2) to your new project (Project Properties -> C/C++ -> General -> Additional Include Directories)
  6. Add the "lib" folders from the installation directories of the SDK and libjpeg-turbo (steps 1 and 2) to your new project (Project Properties -> Linker -> General -> Additional Library Directories)
  7. Specify the libraries to link (Project Properties -> Linker -> Input -> Additional Dependencies): k4a.lib, k4arecord.lib, turbojpeg.lib
  8. Build your project
  9. Copy the DLLs needed at runtime to your build output folder: depthengine_2_0.dll, k4a.dll, k4arecord.dll, turbojpeg.dll (all provided by the SDK installer and the libjpeg-turbo installer from steps 1 and 2)
  10. Now you can run the application from the command line

No need to clone and build the whole SDK yourself just to get an example running. You just need the SDK installer; then add the installed header and library files to your project, link them, and copy the DLL files to your output folder. NuGet packages can simplify this process, and that workflow is also explained on the MS homepage. Additional third-party libraries like libjpeg-turbo need to be added separately, because libjpeg-turbo is just a general-purpose library for compressing/decompressing JPEG images and therefore doesn't belong to the Sensor SDK's API. A quick way to check that the setup above is wired up correctly is sketched below.
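Something like this tiny program (a hypothetical smoke test, not part of the SDK) exercises all three libraries from steps 5-9, so if it compiles, links, and runs, the include/lib directories and runtime DLLs are in place:

```cpp
// Minimal smoke test for the project setup from the steps above: it touches
// k4a.lib, k4arecord.lib, and turbojpeg.lib.
#include <k4a/k4a.h>
#include <k4arecord/playback.h>
#include <turbojpeg.h>
#include <stdio.h>

int main()
{
    // k4a.dll must be found at runtime.
    printf("Azure Kinect devices attached: %u\n", k4a_device_get_installed_count());

    // turbojpeg.dll must be found at runtime.
    tjhandle handle = tjInitDecompress();
    printf("libjpeg-turbo decompressor: %s\n", handle != NULL ? "ok" : "failed");
    if (handle != NULL)
        tjDestroy(handle);

    // k4arecord.dll must be found at runtime; opening a nonexistent file is
    // expected to fail, we only verify that the call links and runs.
    k4a_playback_t playback = NULL;
    if (k4a_playback_open("does_not_exist.mkv", &playback) == K4A_RESULT_SUCCEEDED)
        k4a_playback_close(playback); // won't happen for a missing file
    printf("k4arecord call completed\n");
    return 0;
}
```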

RoseFlunder commented 3 years ago

I would agree though that the Azure Kinect Samples repository could be maintained better and that the examples could be moved to that repository.

ChemicalNRG commented 3 years ago

I would agree though that the Azure Kinect Samples repository could be maintained better and that the examples could be moved to that repository.

I think that would help a lot of us. Don't get me wrong, I know I have to change any code I use to make it do what I want for my use case. But because I don't have enough experience to write all the code myself, I was searching for an example that at least comes close to what I want. I am usually really patient with trial and error until something works, but this is one of the first times I lost the fun in it completely before even getting any results out of the Kinect, even though I think it is capable of what I want. I did get ROS with RTAB-Map working, but the results from that don't come close to what I want either (so far, at least).

Thanks for the best explanation so far of how to approach compiling the examples. It would have saved me days if all the steps/walkthroughs for running the examples were that clear.

Maybe someone here can point me in the direction that comes closest to what I want to do with the Kinect? I want to use the Kinect for scanning rooms/buildings, to get as dense a point cloud as possible.

What I want from that is:

  1. 2D plans (with angles between walls) and heights.
  2. Measuring some random things at home, like windows (frames).
  3. And if that works all right, using software (with recognition) to make a 3D file/project from the point clouds (for VR and/or designing and building cabinets).

So far I have looked at ROS with RTAB-Map, the OpenCV KinFu example, Open3D (not working yet), Brekel PointCloud, TouchDesigner, and RecFusion, and I wanted to have a look at Unity. I am a bit lost as to which approach will give me the best results without spending too much time on it. What I do know is that my best bet (I think) is Linux/Ubuntu for the mapping part, because all the Windows approaches I saw are closed source, and the Linux examples/programs ran better on a LattePanda Alpha 864 (for portability) than the Windows ones, except for just recording the MKVs, which gives me no feedback on whether everything has been scanned properly. I have also ordered a Jetson AGX running Ubuntu to replace the LattePanda.

RoseFlunder commented 3 years ago

I have no experience with your use cases, but since you are already using ROS, I would guess you have found the Azure Kinect ROS Driver GitHub repository? https://github.com/microsoft/Azure_Kinect_ROS_Driver

Maybe you can find some experienced mapping people over there because, as you already stated, mapping is often done in Linux/ROS/robotics applications.