Closed IsidoraR closed 2 years ago
I am not an expert in Docker. Can you take a look at this Dockerfile and see if it helps: https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/Dockerfile
Yes, when I build the Dockerfile in the edgeai-tidl-tools repo, the wheel packages are installed without any errors.
I'm not using Docker because my environment is already inside a VirtualBox VM. So, I don't want to add yet another layer.
However, I found many issues with dependencies and reproducibility in the benchmark repository. To get it working, I had to do this:
Clone the repository:

git clone https://github.com/TexasInstruments/edgeai-benchmark.git
cd edgeai-benchmark
Create and activate the Conda environment:

conda create --name ti-edge-ai-benchmark python=3.6 -c conda-forge
conda activate ti-edge-ai-benchmark
conda update -n base -c defaults conda
conda install pip
Proceed with the installation:

./setup.sh
Before running it, please edit the requirements_pc.txt file and pin graphviz to version 0.8.1; otherwise it won't work.
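For reference, the pin can be applied non-interactively with sed before running setup.sh. The exact contents of requirements_pc.txt are an assumption here, so the sketch below works on a scratch copy:

```shell
# Demo on a scratch file; in the repo you would run the same sed against
# requirements_pc.txt itself. The unpinned "graphviz" line is an assumed entry.
printf 'onnx\ngraphviz\n' > /tmp/requirements_pc_demo.txt
sed -i 's/^graphviz.*/graphviz==0.8.1/' /tmp/requirements_pc_demo.txt
grep '^graphviz' /tmp/requirements_pc_demo.txt
```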
Now, if I run the vanilla benchmark on PC as-is, I get this:
Entering: ./work_dirs/modelartifacts/8bits/cl-3410_tvmdlr_imagenet1k_gluoncv-mxnet_mobilenetv2_1.0-symbol_json.tar.gz.link/srtifacts Not a directory
Which makes total sense, because ./work_dirs/modelartifacts/8bits/cl-3410_tvmdlr_imagenet1k_gluoncv-mxnet_mobilenetv2_1.0-symbol_json.tar.gz.link is a text file containing the URL of the ZIP file to be downloaded.
I think the code is not really in sync with what is expected, or the documentation is not updated.
And then none of the models load properly, because they were never downloaded. :(
./work_dirs/modelartifacts/8bits/od-2010_tflitert_coco_mlperf_ssd_mobilenet_v2_300_float_tflite/model/ssd_mobilenet_v2_300_float.tflite': No such file or directory
Any ideas, @mathmanu ?
I just found it: the models have to be downloaded from edgeai-modelzoo using the provided script, which was fixed here -> https://github.com/TexasInstruments/edgeai-modelzoo/commit/1bc9e1ae1cb4822c41cd82dda19bb5d6efcae7a8
I will now proceed with my experiments.
You don't need to download the models manually. You just need to clone the repository https://github.com/TexasInstruments/edgeai-modelzoo in the same folder where you have cloned edgeai-benchmark. edgeai-benchmark understands the .link files and will automatically download the actual files they point to.
Hi @mathmanu ,
I got everything set up, built the wheel, and got a Docker image to be able to run inference on custom models, etc. From a setup perspective, all looks good. However, when I try the code below, it breaks:
onnx_session_options = rt.SessionOptions()
providers = ["TIDLExecutionProvider", "CPUExecutionProvider"]
onnx_session = rt.InferenceSession(str(onnx_model_path), providers=providers,
provider_options=[compile_options, {}], sess_options=onnx_session_options)
I have LD_LIBRARY_PATH set and pointing to the tidl_tools directory. All .so files are there, and I even ran ldconfig after setting the environment variable. But I still get this error:
Error - libtidl_onnxrt_EP.so: cannot open shared object file: No such file or directory
libtidl_onnxrt_EP loaded (nil)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
The libtidl_onnxrt_EP.so is present, the path is set, etc. But it doesn't work and I don't know why. :(
Any ideas / help?
There was an issue that was fixed yesterday. Your issue seems to be different, but just in case, pull the latest code, run setup.sh, and try again: https://github.com/TexasInstruments/edgeai-benchmark/issues/6
@kumardesappan Do you have any suggestion?
Maybe, from inside the Python code, you can try to print LD_LIBRARY_PATH and TIDL_TOOLS_PATH:

print(os.environ['LD_LIBRARY_PATH'])
print(os.environ['TIDL_TOOLS_PATH'])
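A slightly safer variant of the same check, which won't raise a KeyError when a variable happens to be unset:

```python
import os

# Print both paths, falling back to a marker if a variable is missing
for var in ("LD_LIBRARY_PATH", "TIDL_TOOLS_PATH"):
    print(f"{var}={os.environ.get(var, '<not set>')}")
```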
Thanks for the super quick response, @mathmanu .
Here is my output:
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 19:12:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> print(os.environ['TIDL_TOOLS_PATH'])
/home/helsing/trecs/edgeai-benchmark/tidl_tools
>>> print(os.environ['LD_LIBRARY_PATH'])
/home/helsing/trecs/edgeai-benchmark/tidl_tools
>>>
I will pull the latest code and run it again. Will keep you updated.
Some extra info that might help in understanding what's going on. The Docker base image is:
FROM balenalib/aarch64-ubuntu:bionic
Here are the extra .so files I have compiled:
(onnxrt) root@ee6e17a8a5e1:/code/onnxruntime# ls -larth /code/onnxruntime/build/Linux/MinSizeRel/*.so
-rwxr-xr-x 1 root root 8.3K Apr 26 16:01 /code/onnxruntime/build/Linux/MinSizeRel/libonnxruntime_providers_shared.so
-rwxr-xr-x 1 root root 400K Apr 26 17:03 /code/onnxruntime/build/Linux/MinSizeRel/libonnxruntime_providers_dnnl.so
lrwxrwxrwx 1 root root 23 Apr 26 21:18 /code/onnxruntime/build/Linux/MinSizeRel/libonnxruntime.so -> libonnxruntime.so.1.7.0
-rwxr-xr-x 1 root root 32K Apr 27 01:31 /code/onnxruntime/build/Linux/MinSizeRel/libcustom_op_library.so
-rwxr-xr-x 1 root root 9.9M Apr 27 01:37 /code/onnxruntime/build/Linux/MinSizeRel/onnxruntime_pybind11_state.so
Could that be the culprit, @mathmanu ?
I copied the libs under /usr/lib and ran ldconfig -v. Below is part of the output:
/usr/lib:
libann.so.0 -> libann.so.0.0.0
libtidl_tfl_delegate.so.1.0 -> libtidl_tfl_delegate.so (changed)
libtidl_onnxrt_EP.so.1.0 -> libtidl_onnxrt_EP.so (changed)
libvx_tidl_rt.so.1.0 -> libvx_tidl_rt.so.1.0
This should be fine. But still doesn't work. :(
Error - libtidl_onnxrt_EP.so: cannot open shared object file: No such file or directory
libtidl_onnxrt_EP loaded (nil)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
Sorry for all the comments, just trying to contribute somehow. :) I tried this from the Python REPL:
>>> from ctypes.util import find_library
>>> onnxrt = ctypes.cdll.LoadLibrary(find_library("libtidl_onnxrt_EP"))
>>> id(onnxrt)
365109887776
>>> onnxrt
<CDLL 'None', handle 55018431b0 at 0x55023ec320>
So weird! I also copied the .so files into the Conda env's lib directory.
I think this is the issue:
>>> from ctypes.util import find_library
>>> import ctypes
>>> onnxrt = ctypes.cdll.LoadLibrary(find_library("libtidl_onnxrt_EP.so.1.0"))
>>> id(onnxrt)
365109889400
>>> onnxrt
<CDLL 'None', handle 55018431b0 at 0x55023ec978>
>>> onnxrt = ctypes.cdll.LoadLibrary(find_library("libtidl_onnxrt_EP"))
>>> id(onnxrt)
365113550944
>>> onnxrt
<CDLL 'None', handle 55018431b0 at 0x550276a860>
>>> onnxrt = ctypes.cdll.LoadLibrary(find_library("libtidl_onnxrt_EP.so"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/conda/envs/trecs/lib/python3.6/ctypes/__init__.py", line 426, in LoadLibrary
return self._dlltype(name)
File "/opt/conda/envs/trecs/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /opt/conda/envs/trecs/lib/libtidl_onnxrt_EP.so: cannot open shared object file: No such file or directory
>>>
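The `<CDLL 'None', ...>` results above are a ctypes pitfall rather than a successful load: find_library() expects the bare name (no "lib" prefix, no ".so" suffix) and returns None when nothing matches, and ctypes.cdll.LoadLibrary(None) silently dlopens the main program instead of failing. A minimal demonstration with libm on a glibc Linux system:

```python
import ctypes
from ctypes.util import find_library

# find_library takes the bare name: "m", not "libm.so". Passing "libm.so"
# makes it search for "liblibm.so.so", so it finds nothing and returns None.
print(find_library("libm.so"))  # None

# CDLL(None) dlopens the running process itself, so the call "succeeds" and
# reprs as <CDLL 'None', ...> even though no new library was loaded at all.
main_program = ctypes.CDLL(None)
print(main_program)
```

So the two cases above that "worked" never actually loaded libtidl_onnxrt_EP; only the third call, which got a real filename to dlopen, reported the true error.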
Okay, last comment for today. I also copied the .so files under /usr/local/lib and ran ldconfig -v. Here is the partial output:

/usr/local/lib:
libtidl_tfl_delegate.so.1.0 -> libtidl_tfl_delegate.so
libvx_tidl_rt.so.1.0 -> libvx_tidl_rt.so.1.0
libtidl_onnxrt_EP.so.1.0 -> libtidl_onnxrt_EP.so
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 19:12:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes
>>> from ctypes.util import find_library
>>> onnxrt = ctypes.cdll.LoadLibrary(find_library("libtidl_onnxrt_EP.so.1.0"))
>>> onnxrt
<CDLL 'None', handle 55018431b0 at 0x55022e59b0>
>>> onnxrt = ctypes.cdll.LoadLibrary(find_library("libtidl_onnxrt_EP"))
>>> onnxrt
<CDLL 'None', handle 55018431b0 at 0x550266b828>
>>> onnxrt = ctypes.cdll.LoadLibrary(find_library("libtidl_onnxrt_EP.so"))
>>> onnxrt
<CDLL 'None', handle 55018431b0 at 0x55022e5780>
>>>
onnx_session_options = rt.SessionOptions()
providers = ["TIDLExecutionProvider", "CPUExecutionProvider"]
onnx_session = rt.InferenceSession(str(onnx_model_path), providers=providers,
provider_options=[compile_options, {}], sess_options=onnx_session_options)
Any clue?
I enabled debugging with export LD_DEBUG=libs and got some interesting output:
>>> from ctypes import _dlopen
>>> _dlopen("libtidl_onnxrt_EP.so")
465: find library=libtidl_onnxrt_EP.so [0]; searching
465: search path=/home/helsing/conda/envs/trecs/lib/python3.6/lib-dynload/../.. (RPATH from file /home/helsing/conda/envs/trecs/lib/python3.6/lib-dynload/readline.cpython-36m-aarch64-linux-gnu.so)
465: trying file=/home/helsing/conda/envs/trecs/lib/python3.6/lib-dynload/../../libtidl_onnxrt_EP.so
465: search path=/home/helsing/conda/envs/trecs/bin/../lib (RPATH from file python)
465: trying file=/home/helsing/conda/envs/trecs/bin/../lib/libtidl_onnxrt_EP.so
465: search path=/home/helsing/conda/envs/trecs/lib (LD_LIBRARY_PATH)
465: trying file=/home/helsing/conda/envs/trecs/lib/libtidl_onnxrt_EP.so
465: search cache=/etc/ld.so.cache
465: search path=/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu:/lib:/usr/lib (system search path)
465: trying file=/lib/aarch64-linux-gnu/libtidl_onnxrt_EP.so
465: trying file=/usr/lib/aarch64-linux-gnu/libtidl_onnxrt_EP.so
465: trying file=/lib/libtidl_onnxrt_EP.so
465: trying file=/usr/lib/libtidl_onnxrt_EP.so
465:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: libtidl_onnxrt_EP.so: cannot open shared object file: No such file or directory
>>>
The .so file is present, as you can see in the snippet below:
(trecs) helsing@ce04838f9587:~/trecs$ ls -larth /home/helsing/conda/envs/trecs/lib/libtidl_onnxrt_EP.so
-rwxr-xr-x 1 helsing root 65K Apr 28 20:31 /home/helsing/conda/envs/trecs/lib/libtidl_onnxrt_EP.so
Will keep digging.
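One detail visible in the LD_DEBUG trace above: the loader's LD_LIBRARY_PATH entry resolves to the conda env's lib directory, not to tidl_tools, so the Python process is not seeing the variable as set earlier. A quick way to take the search path out of the equation is to dlopen the library by absolute path; the path construction below is an assumption based on TIDL_TOOLS_PATH:

```python
import ctypes
import os

# Build an absolute-ish path to the EP library; any path containing a slash
# bypasses RPATH / LD_LIBRARY_PATH / ld.so.cache lookup entirely.
ep_path = os.path.join(os.environ.get("TIDL_TOOLS_PATH", "."), "libtidl_onnxrt_EP.so")
try:
    ctypes.CDLL(ep_path)
    status = "loaded"
except OSError as exc:
    # Wrong architecture, missing dependencies, and a bad path all land here,
    # but with the full dlerror() message attached, which narrows it down.
    status = f"failed: {exc}"
print(status)
```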
Hi @mathmanu,
I think I found the issue: the libtidl_onnxrt_EP.so provided with the tidl_tools released here -> https://github.com/TexasInstruments/edgeai-tidl-tools/releases/tag/08_02_00_01-rc1 is compiled for x86_64 only. I'm testing on the aarch64 platform.
Where can I find those files compiled for the right architecture? Do I have to do it myself?
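On a Linux box, `file libtidl_onnxrt_EP.so` reports the target architecture directly. The same check can be done portably in Python by reading the e_machine field of the ELF header (offsets per the ELF specification; the sketch assumes little-endian ELF, which holds for x86_64 and typical aarch64 binaries):

```python
import struct

# Common e_machine values from the ELF specification
ELF_MACHINES = {0x03: "x86", 0x28: "arm", 0x3E: "x86_64", 0xB7: "aarch64"}

def elf_machine(path):
    """Return the target architecture of an ELF binary such as a .so file."""
    with open(path, "rb") as f:
        header = f.read(20)
    if header[:4] != b"\x7fELF":
        raise ValueError(f"{path} is not an ELF file")
    # e_machine is a little-endian uint16 at byte offset 18
    machine = struct.unpack_from("<H", header, 18)[0]
    return ELF_MACHINES.get(machine, hex(machine))
```

Calling elf_machine("tidl_tools/libtidl_onnxrt_EP.so") and getting "x86_64" on an aarch64 host would confirm the mismatch, and also explains the earlier dlopen failures: the loader skips libraries of the wrong architecture as if they did not exist.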
The ARM libraries are in our SDK:
https://www.ti.com/tool/download/PROCESSOR-SDK-LINUX-SK-TDA4VM
https://www.ti.com/tool/SK-TDA4VM
We have tested the inference on EVMs/SoCs.
Does this answer your question?
Hi @mathmanu ,
Yeah, I found everything I needed yesterday and also successfully ran inference with two models on the device we have.
I will proceed to convert our own model and get inference running on it.
Thanks for the support!
This issue can be closed, @mathmanu .
Hello,
I'm making a Dockerfile for this repo (based on the conda env). When I build the image from the Dockerfile (attached to this post), I get the following error messages for the wheel packages:
ERROR: tvm-0.8.dev0-cp36-cp36m-linux_x86_64.whl is not a supported wheel on this platform.
ERROR: onnxruntime_tidl-1.7.0-cp36-cp36m-linux_x86_64.whl is not a supported wheel on this platform.
However, when I run the same installation commands from the setup.sh script for these wheel packages inside the Docker container, they are successfully installed without any errors.
How should I modify my Dockerfile so that these wheel packages will be installed when I build the Docker image?
Dockerfile.zip benchmark_env_v2.zip