google-coral / libedgetpu

Source code for the userspace level runtime driver for Coral.ai devices.
Apache License 2.0

version of libedgetpu.so.1.0 #2

Closed: jk78346 closed this issue 4 years ago

jk78346 commented 4 years ago

Hi, I tried building from source and replacing libedgetpu.so.1.0 at /usr/lib/x86_64-linux-gnu/; however, it gives the following error:

ERROR: Internal: Unsupported data type in custom op handler: 0

when I use the C++ library to build the interpreter. It seems related to this issue. However, I just want to know whether there is any way to check the runtime version. Maybe the libedgetpu.so.1.0 built from this repo is not compatible with the model I built.
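For reference, a minimal sketch of two ways to check which runtime is actually in use, assuming the python3-edgetpu package from the listing below is installed (edgetpu_utils.GetRuntimeVersion() is from the Edge TPU Python API) and ./my_app is a placeholder for the C++ binary:

```sh
# Ask the Edge TPU Python API which runtime version the installed libedgetpu reports.
python3 -c "from edgetpu.basic import edgetpu_utils; print(edgetpu_utils.GetRuntimeVersion())"

# See which libedgetpu.so.1 a given binary would resolve at load time.
ldd ./my_app | grep edgetpu
```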

I use edgetpu_compiler version: 2.1.302470888

And the output of "dpkg -l | grep edgetpu" is as follows:

ii  edgetpu                                     12.1-1                                          all          Edge TPU compiler and runtime
ii  edgetpu-compiler                            14.0                                            amd64        Edge TPU model compiler
rc  libedgetpu1-max:amd64                       13.0                                            amd64        Support library for Edge TPU
ii  libedgetpu1-std:amd64                       14.0                                            amd64        Support library for Edge TPU
ii  python3-edgetpu                             14.0                                            amd64        Edge TPU Python API

Any suggestions?

Namburger commented 4 years ago

@jk78346 FYI: AFAIK, the compiler version shouldn't make any difference in this situation.

Here is some guidance:

1) If you compile libedgetpu from source (this repo), please make a note of the TENSORFLOW_COMMIT in WORKSPACE when you build libtensorflow-lite.a.
2) If you are building libedgetpu.so from source, then that is what you want to load at runtime, not what's in your deb package. So I would remove libedgetpu1-std:amd64 and libedgetpu1-max:amd64 from your system, or set LD_LIBRARY_PATH to your new build, just to ensure that's what the program is loading at runtime. You can also check this with strace (see the sketch after this list).
3) Also important: make sure you don't forget to plug in your Accelerator :P (I spent way too much time debugging this issue yesterday.)
4) You can check out this guide that I made: https://github.com/Namburger/edgetpu-minimal-example/tree/custom-tensorflow-build
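A minimal sketch of point 2, assuming the repo was built under /path/to/libedgetpu (placeholder path) and ./my_app stands in for your program:

```sh
# Remove the packaged runtimes so they cannot shadow the locally built one
# (optional if you prefer to rely on LD_LIBRARY_PATH alone).
sudo apt-get remove libedgetpu1-std libedgetpu1-max

# Put the fresh build first on the dynamic loader's search path.
export LD_LIBRARY_PATH=/path/to/libedgetpu/out/direct/k8:$LD_LIBRARY_PATH

# Confirm which libedgetpu.so.1 is actually opened at runtime.
strace -e trace=openat ./my_app 2>&1 | grep libedgetpu
```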

jk78346 commented 4 years ago

Thanks for the quick reply. I was building the libedgetpu.so file, and I'm sure that /usr/lib/x86_64-linux-gnu/ is the location where libedgetpu.so was installed by edgetpu/scripts/runtime/install.sh. So instead I exported LD_LIBRARY_PATH with the path/libedgetpu/out/direct/k8/libedgetpu.so.1.0.

The situation is: the former case is successful (I do have the Accelerator plugged in =) ), while the latter one gives the error message: ERROR: Internal: Unsupported data type in custom op handler: 0

I guess my method pretty much rules out other possible reasons why it doesn't work, other than just the difference between the .so installed by (1) https://github.com/google-coral/edgetpu/blob/master/scripts/runtime/install.sh and (2) libedgetpu/out/direct/k8/libedgetpu.so.1.0.

So maybe the question could be: are those two .so files the same? I checked the file sizes (listed below) and they are not exactly the same: (1) is 917K, (2) is 941K.
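A quick way to compare the two libraries beyond file size, using the two paths from the cases above, would be something like:

```sh
# Compare the installed runtime with the locally built one.
ls -lh /usr/lib/x86_64-linux-gnu/libedgetpu.so.1.0 libedgetpu/out/direct/k8/libedgetpu.so.1.0
sha256sum /usr/lib/x86_64-linux-gnu/libedgetpu.so.1.0 libedgetpu/out/direct/k8/libedgetpu.so.1.0
```

Different checksums confirm they are distinct builds, which matches the size difference above.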

jk78346 commented 4 years ago

BTW, I tried the guideline you mentioned for the .a file and it works for the .so too! So I think the reason is just that TENSORFLOW_COMMIT and TENSORFLOW_SHA256 have to be the same as the ones specified in https://github.com/google-coral/edgetpu/blob/master/WORKSPACE

Namburger commented 4 years ago

@jk78346 Sorry, I'm a little confused now. So was the problem that libtensorflow-lite.a was not built from the same version as libedgetpu.so? To answer your question, the two files are not the same. The one from here contains runtime version 14, while the one from here tracks the unreleased "master" version. So when you build from here, take note of that commit, and when you build TensorFlow Lite from here, make sure to check out that commit from tensorflow. This is all handled in my CMakeLists.txt file.
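A sketch of that commit-matching workflow (the paths and placeholder hash are assumptions, not taken verbatim from either repo):

```sh
# From the root of the repo you build libedgetpu from, find the pinned TensorFlow commit.
grep TENSORFLOW_COMMIT WORKSPACE

# Check out that exact commit in your tensorflow clone before building libtensorflow-lite.a.
cd tensorflow
git checkout <TENSORFLOW_COMMIT-from-WORKSPACE>   # placeholder for the hash found above
```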

Cheers!

jk78346 commented 4 years ago

I guess so. I just modified TENSORFLOW_COMMIT and TENSORFLOW_SHA256 to:

TENSORFLOW_COMMIT = "d855adfc5a0195788bf5f92c3c7352e638aa1109"
TENSORFLOW_SHA256 = "b8a691dbea2bb028fa8f7ce407b70ad236dae0a8705c8010dc7bad8af7e93bac"