NVlabs / FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
https://nvlabs.github.io/FoundationPose/

I'm getting errors for "build_all.sh" and "run_demo.py" with Ubuntu 22 in WSL 2 on Windows 10. Please point me in the right direction. #208

Closed ahiyantra closed 2 weeks ago

ahiyantra commented 1 month ago

I was unable to get the official Docker image to run on my device, so I followed the recommended steps to build my own Docker image, but I still get errors. The output of "nvidia-smi" is included near the end of the copy-pasted terminal messages, which are linked as a GitHub gist. Please help me figure out how I can solve these problems.

github gist with terminal messages

ahiyantra commented 4 weeks ago

I was able to get rid of the error for "build_all.sh", which was caused by failures while building "kaolin". I commented out the corresponding lines because I don't plan to use the model-free setup. The solution came from a resolved issue, linked below.

the context

ahiyantra commented 3 weeks ago

It seems that the Docker config file creates an image with Python 3.8, which causes quite a few errors because many libraries in the requirements text file need Python 3.9. Changing the Docker config file accordingly before building the image drastically reduces the number of errors for "run_demo.py".
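A minimal sketch of a version check that could be run inside the container before installing the requirements. The 3.9 minimum is taken from the comment above, and the helper name `python_is_new_enough` is hypothetical, not something the repository provides.

```python
import sys

# Assumption: Python 3.9 is the minimum the requirements need,
# as reported in the comment above.
MIN_PYTHON = (3, 9)


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling `python_is_new_enough()` with no arguments checks the running interpreter, so a `False` result inside the image would suggest that the Python version in the Docker config file still needs to be raised.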

the context

ahiyantra commented 3 weeks ago

I faced an error for "nvdiffrast" that was resolved by changing into the directory where this library is located and rebuilding it separately, but a different error involving "nonetype"/"mycpp" followed soon after.

ahiyantra commented 3 weeks ago

I was able to make the first Python demo script run without errors by executing the same bash commands as in my previous attempt, in a slightly different order and with some repetition: first the solution for the nonetype/mycpp issue with "build_all_conda.sh", then the solution for the nvdiffrast issue, then the solution for the nonetype/mycpp issue with "build_all.sh". Before running them, I set the environment variables NINJA_JOBS and MAKEFLAGS to "1" and "-j1" respectively. The second Python demo script still seems to have an error: it stops suddenly, showing a tensor/array of decimal numbers.
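The environment setup described above can be sketched as follows, assuming the build scripts are launched from Python rather than typed into the shell. The variable names and values come from the comment above. The helper names `serial_build_env` and `run_build` are hypothetical.

```python
import os
import subprocess


def serial_build_env(base_env=None):
    """Return a copy of the environment that forces single-job builds."""
    env = dict(os.environ if base_env is None else base_env)
    env["NINJA_JOBS"] = "1"   # value reported in the comment above
    env["MAKEFLAGS"] = "-j1"  # make runs one job at a time
    return env


def run_build(script="build_all.sh"):
    """Run a build script with the serial environment; raises on failure."""
    subprocess.run(["bash", script], env=serial_build_env(), check=True)
```

Building with a single job is slower, but it keeps peak memory use low, which may be why it helped here.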

ahiyantra commented 3 weeks ago

I found a solution after analyzing the code. The "cv2.waitKey()" function waits for a key press for a specified number of milliseconds. The second Python demo script passes a delay of 1 millisecond. If no key is pressed within that time, execution continues, and because 1 millisecond is very short, the window closes almost immediately after showing the single image. To keep the window open until a key is pressed, we should replace "1" with "0". A delay of 1 millisecond is typically used for video stream processing, where it lets the display refresh smoothly while staying responsive to key presses. A delay of 0 is a special value that makes the function wait indefinitely, so a single image can be shown without the window closing abruptly.
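A minimal sketch of the difference, assuming OpenCV is installed and a display is available. The helper names `wait_delay_ms` and `show_image` are hypothetical and are not taken from the demo script.

```python
def wait_delay_ms(single_image: bool) -> int:
    """Pick the cv2.waitKey() delay: 0 blocks until a key press, 1 ms suits video."""
    return 0 if single_image else 1


def show_image(image, window_name="demo", single_image=True):
    """Show an image and wait according to the chosen delay."""
    import cv2  # imported here so wait_delay_ms() works without OpenCV

    cv2.imshow(window_name, image)
    key = cv2.waitKey(wait_delay_ms(single_image))
    if single_image:
        # The user pressed a key, so the window can be closed now.
        cv2.destroyAllWindows()
    return key
```

With `single_image=True` the window stays up until a key is pressed. With `single_image=False` the call returns after about 1 millisecond, which is what a frame-by-frame video loop wants.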

ahiyantra commented 3 weeks ago

There was an error that said "ninja: build stopped: subcommand failed". It was resolved by installing a version of the cuDNN package that is compatible with the CUDA driver. Still, although there are no errors, the second demo doesn't work as intended: the bounding box and the pose axes aren't being visualized properly. Is it perhaps because there are too many bottles in the image provided to the model? Would only one bottle be detected properly?

ahiyantra commented 2 weeks ago

It seems that this implementation of the model is meant to handle only one object at a time in a given image. To handle multiple objects, another method is needed to detect them first, and the list of detected objects is then passed to this implementation so that they are processed one at a time. Finally, if a demo script crashes on its first run, it can sometimes be made to work by simply running it again. Beyond that, there are no more errors now.
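The one-object-at-a-time idea and the run-it-again workaround can be sketched like this. Both `estimate_poses` and `call_with_one_retry` are hypothetical helpers, and `estimate_one` stands in for whatever single-object pose call is actually used. None of these names come from the repository.

```python
from typing import Any, Callable, Iterable, List


def estimate_poses(detections: Iterable[Any],
                   estimate_one: Callable[[Any], Any]) -> List[Any]:
    """Run a single-object estimator once per detected object, in order."""
    poses = []
    for detection in detections:
        poses.append(estimate_one(detection))
    return poses


def call_with_one_retry(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call fn; if the first attempt raises, try exactly one more time."""
    try:
        return fn(*args, **kwargs)
    except Exception:
        # A second failure propagates to the caller unchanged.
        return fn(*args, **kwargs)
```

The detections would come from a separate detector or segmentation step, with one entry (for example a mask) per object.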