stereolabs / zed-python-api

Python API for the ZED SDK
https://www.stereolabs.com/docs/app-development/python/install/
MIT License

Issues building AI model Jetson Xavier NX #161

Closed sophierahn closed 2 years ago

sophierahn commented 4 years ago

I've been using the ZED 2 camera on a Jetson Nano for a few months now. I recently switched over to a Jetson Xavier for the added RAM and processing power, but my attempts to use the ZED 2 on it haven't been working. I downloaded the ZED SDK for JetPack 4.4 (the standard installer), then followed the instructions to download the Python API. I went through the installed tutorials and everything works until the Object Detection module, which is also the one I really need.

The program begins downloading and setting up the model for the graphics card and ultimately fails with:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted

I compiled and ran the C++ version of the same module and everything worked well. I tried reinstalling all the software and then got lots of segmentation faults.

I've verified that I have Cython, NumPy, and cv2 installed.

Do you have any ideas on what could be wrong? I know bad_alloc refers to a failed memory allocation, so I rebooted the machine and ran the tutorial module from a terminal without opening any other programs. There really SHOULD be enough memory for it.
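One way to put a number on "there should be enough memory" is to read MemAvailable from /proc/meminfo just before launching the sample. The following is a minimal sketch, assuming a Linux system such as the Jetson; the helper names parse_meminfo and available_mib are hypothetical and not part of the ZED SDK. Keep in mind that std::bad_alloc is raised whenever an allocation request fails, which can also happen if the requested size itself is bogus, so plenty of free memory does not rule the error out.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few counters (for example
    HugePages_Total) carry no unit and are returned as plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_mib(path="/proc/meminfo"):
    """Return the kernel's MemAvailable estimate in MiB (0 if missing)."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    return info.get("MemAvailable", 0) / 1024
```

Calling available_mib() right before starting object_detection.py gives a baseline to compare against on the Nano and the Xavier.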

Thanks so much

adujardin commented 4 years ago

In the current state, are the C++ samples working?

Could you give the commands you're using to set up the Python API, along with their output? Could you also give the log of the sample that is crashing?

If a model has been set up when using a C++ sample, it should also be available when using the Python API. You should see the files with ls /usr/local/zed/resources/.*
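The same check can be done from Python, which also makes file sizes easy to see. This is a minimal sketch: the default directory is the one mentioned above, the function name list_model_files is hypothetical, and treating a zero-byte file as a sign of an interrupted download is an assumption, not documented SDK behavior.

```python
import os


def list_model_files(resources_dir="/usr/local/zed/resources"):
    """Return (name, size_in_bytes) for each regular file, sorted by name."""
    entries = []
    for name in sorted(os.listdir(resources_dir)):
        path = os.path.join(resources_dir, name)
        if os.path.isfile(path):  # skip subdirectories
            entries.append((name, os.path.getsize(path)))
    return entries
```

Printing the result of list_model_files() after running the C++ sample and again after the Python sample shows whether both are looking at the same files.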

sophierahn commented 4 years ago

The C++ code is working right now.

To set up the API, I first installed Cython with:

sudo python3 -m pip install cython

I had to fiddle a bit to get NumPy 1.19 instead of 1.13, but it was also installed with:

sudo python3 -m pip install numpy

I ran both commands again to confirm and got the following output:

Requirement already satisfied: cython in /usr/local/lib/python3.6/dist-packages (0.29.21)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (1.19.4)

So those dependencies are in place.
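pip reporting a package as installed does not guarantee that the interpreter running the sample can import it, so a quick import check can complement the pip output. This is a minimal sketch; module_versions is a hypothetical helper, and it relies on the common (but not universal) convention that a module exposes __version__.

```python
import importlib


def module_versions(names):
    """Map each module name to its __version__ string.

    The value is "unknown" if the module imports but has no __version__,
    and None if the import itself fails.
    """
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found
```

For this setup, module_versions(["Cython", "numpy", "cv2", "pyzed.sl"]) would be the natural call; note that Cython's importable module name is capitalized.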

Next I went into /usr/local/zed and ran:

python3 get_python_api.py

with an output of:

Defaulting to user installation because normal site-packages is not writeable
Processing ./pyzed-3.3-cp36-cp36m-linux_aarch64.whl
Installing collected packages: pyzed
Successfully installed pyzed-3.3
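The wheel name in the pip output follows the standard wheel filename convention (name, version, optional build tag, Python tag, ABI tag, platform tag), so it can be used to confirm that the prebuilt binary matches the interpreter in use. A minimal sketch, with hypothetical helper names, and with running_python_tag assuming a CPython interpreter:

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its tagged components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError("unexpected wheel filename layout: " + filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


def running_python_tag():
    """Python tag of the current interpreter, assuming CPython."""
    return "cp{}{}".format(sys.version_info[0], sys.version_info[1])
```

For the filename in the output above, the tags come out as cp36, cp36m, and linux_aarch64, which is consistent with the python3.6 dist-packages path reported by pip.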

Next I ran hello_zed.py:

[ZED][Init] No calibration file found for SN24659869. Downloading...
[ZED][Init] Calibration file downloaded.
Hello! This is my serial number: 24659869

Then I ran object_detection.py and got:

[ZED][Init] Depth mode: PERFORMANCE
[ZED][Init] Video mode: HD720@60
Object Detection: Loading Module...
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

A model is present in /usr/local/zed/resources. I'm not sure how to tell whether it has been set up correctly other than by running object_detection.py as a test.
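Because the sample dies in different ways from run to run, it can help to launch it from a small wrapper that records how the process ended. On POSIX systems, subprocess reports death by signal as a negative return code, and an uncaught C++ exception such as std::bad_alloc ends in an abort (SIGABRT), while a segmentation fault shows up as SIGSEGV. This is a minimal sketch with a hypothetical helper name:

```python
import signal
import subprocess
import sys


def run_and_classify(argv):
    """Run a command and describe how it ended.

    Returns (returncode, description). A negative return code means the
    process was killed by the signal with that number (POSIX only).
    """
    code = subprocess.run(argv).returncode
    if code < 0:
        try:
            description = signal.Signals(-code).name
        except ValueError:
            description = "signal {}".format(-code)
    elif code == 0:
        description = "exited normally"
    else:
        description = "exit status {}".format(code)
    return code, description
```

For example, run_and_classify([sys.executable, "object_detection.py"]) in a loop gives a tally of aborts versus segmentation faults.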

sophierahn commented 4 years ago

I've dug into it a bit more. Each time I run the module as a Python script, the CPU usage skyrockets, and sometimes the program exits with a segmentation fault. From what I understand, this can point to a problem at the C level. It's very possible something is wrong with the Xavier, but is it possible that a recent update to the code base is creating issues?

I found a Python module (faulthandler) that shows more than the bare segmentation fault message. The script name is different, but it's a copy-paste of the sample, and the number printouts are from my debugging. I ran:

python3 -Xfaulthandler zed.py

and the output was:

[ZED][Init] Depth mode: PERFORMANCE
[ZED][Init] Video mode: HD720@60
Object Detection: Loading Module...
one two three four
Fatal Python error: Segmentation fault

Current thread 0x0000007f995f3010 (most recent call first):
  File "/home/user/Documents/Code/sw-droid/Archive/zed.py", line 72 in main
  File "/home/user/Documents/Code/sw-droid/Archive/zed.py", line 110 in <module>
Segmentation fault (core dumped)
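The -X faulthandler option can also be turned on from inside the script with the standard library faulthandler module, which is handy when the script is started by something other than a terminal. A minimal sketch; enable_crash_traces is a hypothetical wrapper around the real faulthandler.enable call:

```python
import faulthandler
import sys


def enable_crash_traces(stream=None):
    """Dump Python tracebacks on fatal signals such as SIGSEGV and SIGABRT.

    The stream must be a real file with a file descriptor and must stay
    open for as long as the handler is enabled. Defaults to sys.stderr.
    """
    if not faulthandler.is_enabled():
        target = stream if stream is not None else sys.stderr
        faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

Calling enable_crash_traces() at the top of the script, before any SDK call, has the same effect as the command-line flag. As in the output above, the trace only identifies the Python line that was executing; frames inside the native library are not shown.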

paolodesa commented 3 years ago

Hello! I'm experiencing the same problem on the NVIDIA Jetson Nano (4GB). When I try to run the object detection module, the program crashes with the error "std::bad_alloc". I've narrowed the issue down to this specific line: "for obj in bodies.object_list:". I'm looping over the bodies found in the image to extract their coordinates. The property "object_list" of the class "Objects" is defined in the Python API as:

    @property
    def object_list(self):
        objectlist = []
        for i in range(self.objects.object_list.size()):
            py_objectData = ObjectData()
            py_objectData.object_data = self.objects.object_list[i]
            objectlist.append(py_objectData)
        return objectlist

Maybe the issue lies in how memory is allocated to build the list of "ObjectData" that is then returned. The strange thing is that I only have this problem on the Jetson, while it works fine on my desktop PC. Have you found any additional information on the bug?
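Since the property quoted above builds a fresh Python list on every access, one way to narrow the crash down further is to separate the property access from the loop body, so that a failure can be attributed to one or the other. The sketch below uses a hypothetical stand-in class, FakeObjects, rather than the real pyzed wrapper, purely to illustrate the pattern; it does not reproduce or fix the bug.

```python
class FakeObjects:
    """Hypothetical stand-in that mimics the rebuild-on-access property."""

    def __init__(self, raw_items):
        self._raw = list(raw_items)
        self.builds = 0  # how many times the list has been rebuilt

    @property
    def object_list(self):
        self.builds += 1
        return [item for item in self._raw]  # new list on every access


def collect_positions(objects):
    """Read the property once, then iterate over the stored list."""
    items = objects.object_list  # step 1: wrapper list is built here
    return [item["position"] for item in items]  # step 2: per-object work
```

With the real API, adding a print between the two steps would show whether the crash happens while the list is being built or while an individual object is being read.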

sophierahn commented 3 years ago

While this doesn't address the cause, I found that reverting from SDK 3.3 to SDK 3.2 solved the issue. There seemed to be significant changes to the object detection examples between 3.2 and 3.3. It's not a fix, but it is a solution.

paolodesa commented 3 years ago

Thanks a lot for the suggestion. As you said, it works on the previous version of the SDK.

paolodesa commented 3 years ago

I've found that building the Python API from source solves my issue with ZED SDK 3.3.

qt-truong commented 3 years ago

Hi,

The ZED SDK 3.4 RC should solve this issue.
