I have the exact same issue with the same system: Ubuntu 16.04.6 LTS 64-bit python 3.5.2
Hmm, I couldn't reproduce with python3.6 and tflite_runtime-1.14.0-cp36-cp36m-linux_x86_64.whl.
Looks like the problem occurs during load_delegate('libedgetpu.so.1.0'); could you confirm that it is installed?
For reference:
```
$ ls -l /usr/lib/x86_64-linux-gnu/libedgetpu*
lrwxrwxrwx 1 root root   43 Oct  9 14:37 /usr/lib/x86_64-linux-gnu/libedgetpu.so.1 -> /usr/lib/x86_64-linux-gnu/libedgetpu.so.1.0
-rwxr-xr-x 1 root root 930K Oct  9 14:37 /usr/lib/x86_64-linux-gnu/libedgetpu.so.1.0
```
Also, could you attach the output of strace?
I have verified that libedgetpu.so is installed; I followed the latest blog post on Coral. Here is the strace output: strace.txt
Thanks for your help!
@mrharicot are you sure you are getting the same error? From your strace.txt, I'm seeing this:
File "classify_image.py", line 118, in <module>
main()
File "classify_image.py", line 95, in main
experimental_delegates=[load_delegate('libedgetpu.so.1.0')])
File "/usr/local/lib/python3.5/dist-packages/tflite_runtime/interpreter.py", line 230, in __init__
self._interpreter.ModifyGraphWithDelegate(
File "/usr/local/lib/python3.5/dist-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 97, in <lambda>
__getattr__ = lambda self, name: _swig_getattr(self, InterpreterWrapper, name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 74, in _swig_getattr
return _swig_getattr_nondynamic(self, class_type, name, 0)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 69, in _swig_getattr_nondynamic
return object.__getattr__(self, name)
AttributeError: type object 'object' has no attribute '__getattr__'
Which would actually be a different issue; in either case though, I cannot reproduce it even with python3.5 :(
My bad, that trace was for the non-edge-tpu model. Here is the strace for the edge tpu model: strace.txt
@mrharicot @travisariggs
Could you guys try the edgetpu API instead of the tflite_runtime API?
https://github.com/google-coral/edgetpu/blob/master/examples/classify_image.py
At the moment I'd like to pinpoint the issue to see whether it's a tflite problem or a libedgetpu problem.
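For reference, a minimal sketch of that edgetpu-API path (hedged: the names follow the edgetpu 2.x Python API, and the model/image paths are placeholders):

```python
from PIL import Image
from edgetpu.classification.engine import ClassificationEngine

# ClassificationEngine loads the model and talks to the Edge TPU directly,
# bypassing tflite_runtime entirely.
engine = ClassificationEngine('mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite')
img = Image.open('parrot.jpg')

# classify_with_image returns (label_id, score) pairs for the top_k results.
for label_id, score in engine.classify_with_image(img, top_k=2):
    print(label_id, score)
```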
@Namburger Nice catch! The edgetpu API seems to be working fine for both the tflite and the edgetpu models.
```
python3 classify_image.py --model ../test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --label ~/code/coral/tflite/python/examples/models/inat_bird_labels.txt --image ~/code/coral/tflite/python/examples/images/parrot.jpg
---------------------------
Ara macao (Scarlet Macaw)
Score : 0.6484375
---------------------------
Platycercus elegans (Crimson Rosella)
Score : 0.13671875
```
@mrharicot nice! So the issue looks to be with tflite_runtime rather than libedgetpu. Weirdly, I cannot recreate the issue even with python3.5 or python3.6 (although I tested on 2 different host machines, which makes me wonder if Ubuntu 16.04 is the culprit).
I'll file an internal bug on this one and will give you update if we can find something. Thanks for submitting the issue!
P.S. The difference between the two models, *.tflite and *edgetpu.tflite, is that the latter has been compiled for the Edge TPU and will be faster, while the former will be slower since it runs on the CPU.
https://coral.withgoogle.com/docs/edgetpu/compiler/
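(For completeness, the compile step itself is a one-liner; a hedged sketch assuming the edgetpu_compiler from the link above is installed and the quantized model is in the current directory:)

```
$ edgetpu_compiler mobilenet_v2_1.0_224_inat_bird_quant.tflite
# expected to write mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite plus a compile log
```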
For reference:
edgetpu compiled:
```
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
11.6ms
3.1ms
3.0ms
2.6ms
2.5ms
```
non-edgetpu compiled:
```
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
146.0ms
144.2ms
146.1ms
147.2ms
145.1ms
-------RESULTS--------
```
According to your second trace file:
```
[pid 28150] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid 28150] open("/lib/x86_64-linux-gnu/libedgetpu.so.1.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
```
Can you please try the original example with the full path to libedgetpu.so (please double check it exists):
```
load_delegate('/usr/lib/x86_64-linux-gnu/libedgetpu.so.1')
```
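(In context, that is a minimal sketch like the following, assuming the Coral bird-classifier model from the examples:)

```python
from tflite_runtime.interpreter import Interpreter, load_delegate

# Passing the absolute path sidesteps the library search that failed in the trace above.
interpreter = Interpreter(
    model_path='mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite',
    experimental_delegates=[
        load_delegate('/usr/lib/x86_64-linux-gnu/libedgetpu.so.1')])
interpreter.allocate_tensors()
```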
@dmitriykovalev I get the same error
I got the same issue with Ubuntu 19.04, python 3.7 and the wheel package tflite_runtime-1.14.0-cp37-cp37m-linux_x86_64.whl.
The library was loaded correctly:
```
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/libedgetpu.so.1", O_RDONLY|O_CLOEXEC) = 3
```
The edgetpu API is working correctly, but tflite_runtime is not.
@mrazekv @mrharicot @travisariggs
I wonder if this would be a problem with the tflite API also. Could you try with this script (this is just a modified version of the original script, using tflite instead of tflite_runtime):
```python
import argparse
import time

import numpy as np
from PIL import Image
import tensorflow as tf

print("TF VERSION: ", tf.__version__)


def load_labels(filename):
  with open(filename, 'r') as f:
    return [line.strip() for line in f.readlines()]


def set_input_tensor(interpreter, image):
  tensor_index = interpreter.get_input_details()[0]['index']
  input_tensor = interpreter.tensor(tensor_index)()[0]
  input_tensor[:, :] = image


def classify_image(interpreter, image, top_k):
  set_input_tensor(interpreter, image)
  interpreter.invoke()
  output_details = interpreter.get_output_details()[0]
  output = np.squeeze(interpreter.get_tensor(output_details['index']))

  # If the model is quantized (uint8 data), then dequantize the results
  if output_details['dtype'] == np.uint8:
    scale, zero_point = output_details['quantization']
    output = scale * (output - zero_point)

  ordered_indices = output.argsort()[-top_k:][::-1]
  return [(i, output[i]) for i in ordered_indices]


def main():
  parser = argparse.ArgumentParser(
      formatter_class=argparse.ArgumentDefaultsHelpFormatter)
  parser.add_argument(
      '--model', help='File path of .tflite file.', required=True)
  parser.add_argument(
      '--labels', help='File path of labels file.', required=True)
  parser.add_argument('--image', help='Image to be classified.', required=True)
  parser.add_argument(
      '--top_k', help='Number of classifications to list', type=int, default=1)
  parser.add_argument(
      '--count', help='Number of times to run inference', type=int, default=5)
  args = parser.parse_args()

  print('Initializing TF Lite interpreter...')
  interpreter = tf.compat.v2.lite.Interpreter(
      model_path=args.model,
      experimental_delegates=[
          tf.compat.v2.lite.experimental.load_delegate('libedgetpu.so.1.0')])
  interpreter.allocate_tensors()

  _, height, width, _ = interpreter.get_input_details()[0]['shape']
  image = Image.open(args.image).resize((width, height), Image.ANTIALIAS)

  print('----INFERENCE TIME----')
  print('Note: The first inference on Edge TPU is slow because it includes',
        'loading the model into Edge TPU memory.')
  for _ in range(args.count):
    start_time = time.monotonic()
    results = classify_image(interpreter, image, args.top_k)
    elapsed_ms = (time.monotonic() - start_time) * 1000
    print('%.1fms' % elapsed_ms)

  labels = load_labels(args.labels)
  print('-------RESULTS--------')
  for label_id, prob in results:
    print('%s: %.5f' % (labels[label_id], prob))


if __name__ == '__main__':
  main()
```
@Namburger thanks for a quick reply. I am using pre-built TF 1.14 (downloaded from the official release channel). However, the load_delegate function was not found.
```
python3 test.py --model models/mobilenet_v2_1.0_224_inat_bird_quant.tflite --labels models/inat_bird_labels.txt --image images/parrot.jpg
TF VERSION: 1.14.0
Initializing TF Lite interpreter...
Traceback (most recent call last):
  File "test.py", line 63, in <module>
    main()
  File "test.py", line 45, in main
    experimental_delegates=[tf.compat.v2.lite.experimental.load_delegate('libedgetpu.so.1.0')])
AttributeError: module 'tensorflow._api.v1.compat.v2.lite' has no attribute 'experimental'
```
@mrazekv TensorFlow and tflite_runtime versioning are a little different. I believe it requires at minimum tf-nightly v1.15 (the nightly build of TensorFlow) to get load_delegate. This is most likely why you don't have load_delegate with tf1.14. I'm actually using tf2.0 for this script; maybe try upgrading tf?
@Namburger I installed the tf-nightly package using pip3 in a virtual environment and your script runs fine.
```
TF VERSION: 2.1.0-dev20191024
Initializing TF Lite interpreter...
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
12.9ms
4.6ms
4.3ms
4.4ms
4.8ms
-------RESULTS--------
923 Ara macao (Scarlet Macaw): 0.76562
```
Thank you very much for your help.
@mrazekv No problem, thanks for helping me diagnose the issue. The only difference here is that we're using tf's:
```python
tf.compat.v2.lite.Interpreter
tf.compat.v2.lite.experimental.load_delegate
```
instead of
```python
from tflite_runtime.interpreter import Interpreter
from tflite_runtime.interpreter import load_delegate
```
I believe tflite_runtime is the issue, but we're unable to reproduce this on our end o_0
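(A hedged compatibility shim, purely illustrative: prefer the tf.compat.v2 names when a new-enough TensorFlow is installed, and fall back to the standalone runtime otherwise.)

```python
try:
    import tensorflow as tf
    # These names exist in TF 1.15+ and 2.x; on TF 1.14 the attribute
    # lookup raises AttributeError, which the except clause catches.
    Interpreter = tf.compat.v2.lite.Interpreter
    load_delegate = tf.compat.v2.lite.experimental.load_delegate
except (ImportError, AttributeError):
    # Fall back to the standalone tflite_runtime wheel.
    from tflite_runtime.interpreter import Interpreter, load_delegate
```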
Any chance of getting the instructions at https://www.tensorflow.org/lite/guide/python fixed? I just followed them and hit this issue when setting up the Coral TPU using these instructions: https://coral.withgoogle.com/docs/accelerator/get-started/
Very frustrating!
How exactly do I install tf-nightly or 1.15, whatever it takes to fix it?
I tried sudo -H pip3 install tf-nightly (after it failed without the sudo -H) and now get different errors :(
```
tflite/python/examples/classification$ python3 classify_image.py --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels models/inat_bird_labels.txt --input images/parrot.jpg
2019-11-21 12:43:46.933849: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2019-11-21 12:43:46.933866: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "classify_image.py", line 118, in <module>
    main()
  File "classify_image.py", line 95, in main
    interpreter = make_interpreter(args.model)
  File "classify_image.py", line 69, in make_interpreter
    {'device': device[0]} if device else {})
  File "/usr/local/lib/python3.5/dist-packages/tflite_runtime/interpreter.py", line 206, in __init__
    model_path))
NotImplementedError: Wrong number or type of arguments for overloaded function 'InterpreterWrapper_CreateWrapperCPPFromFile'.
  Possible C/C++ prototypes are:
    tflite::interpreter_wrapper::InterpreterWrapper::CreateWrapperCPPFromFile(char const *,std::vector< std::string > const &,std::string *)
    tflite::interpreter_wrapper::InterpreterWrapper::tflite_interpreter_wrapper_InterpreterWrapper_CreateWrapperCPPFromFile__SWIG_1(char const *,PyObject *)
```
@wb666greene Can you share the output of:
python3 -c 'print(__import__("tensorflow").__version__)'
I just un-installed tf-nightly. But I get this:
```
$ python3 -c 'print(__import__("tensorflow").__version__)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'tensorflow' has no attribute '__version__'
```
The instructions say to install: pip3 install tflite_runtime-1.14.0-cp35-cp35m-linux_x86_64.whl
If I open python3 interactively, both import tflite_runtime and import tensorflow succeed, but neither seems to have a version:
```
>>> tflite_runtime.__version__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'tflite_runtime' has no attribute '__version__'
```
I'm not planning to use the tflite runtime anytime soon, but I hate to leave an installation uncompleted.
A week ago I installed edgetpu_api_2.11.1.tar.gz on a different machine, except that a sample I ran used run_inference (which seems to be a 2.11.2 thing), so I had to change it to RunInference to get it to work.
Starting fresh on a different machine using the https://coral.withgoogle.com/docs/accelerator/get-started/ instructions got me an apt repo and an apt-get installation of 2.11.2, and this mess when trying to run the example at the bottom of the page.
My other sample (edgetpu_api) code runs fine on the new machine, except it throws tons of "RunInference is deprecated" warnings that ruin the console output; if I change back to run_inference all seems well.
What is the point of trivial name changes like this?
Words to live by that the computer industry seems hell-bent to ignore: "Different is not better, better is better. How is this change better?"
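(For anyone else hit by the rename, a hedged shim that tolerates both spellings; engine here stands for a BasicEngine-style object from the edgetpu API:)

```python
# Use whichever method name this edgetpu release provides:
# run_inference (2.11.2+) or RunInference (older releases).
infer = getattr(engine, 'run_inference', None) or engine.RunInference
```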
@wb666greene
So we've updated our runtime library, which has a ton of improvements; this is just a normal software upgrade. However, it looks like tflite_runtime has not caught up with the new runtime, and that's what's causing this issue. The weird thing about this is that I'm not able to reproduce it on my side. This is why I suggested above to:
a) use the edgetpu API instead; the repo is provided here: https://github.com/google-coral/edgetpu
b) use the full tensorflow lite API, which is documented here: https://coral.withgoogle.com/docs/edgetpu/tflite-python/
With option b, you'll have to install tensorflow 1.15 or later. The normal process for doing this is just:
```
pip3 install tensorflow==1.15 --user
```
It looks to me like you're having installation issues with tensorflow.
Now I get the 1.15 version of tensorflow, but the sample code fails:
```
tflite/python/examples/classification$ python3 classify_image.py --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels models/inat_bird_labels.txt --input images/parrot.jpg
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tflite_runtime/interpreter.py", line 165, in load_delegate
    delegate = Delegate(library, options)
  File "/usr/local/lib/python3.5/dist-packages/tflite_runtime/interpreter.py", line 119, in __init__
    raise ValueError(capture.message)
ValueError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "classify_image.py", line 118, in <module>
```
@wb666greene Using the tflite API instead of tflite_runtime, you'll have to change the script to what I mentioned above.
I mentioned this to you also, but the tflite API is documented here:
https://coral.withgoogle.com/docs/edgetpu/tflite-python/
P.S.
```
python3 test.py --model models/mobilenet_v2_1.0_224_inat_bird_quant.tflite --labels models/inat_bird_labels.txt --image images/parrot.jpg
TF VERSION: 1.15.0
Initializing TF Lite interpreter...
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
44.2ms
44.6ms
43.5ms
43.8ms
43.3ms
-------RESULTS--------
923 Ara macao (Scarlet Macaw): 0.77344
```
I cut and pasted your script and named it x.py; when I run it, it errors:
```
tflite/python/examples/classification$ python3 x.py --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels models/inat_bird_labels.txt --image images/parrot.jpg
TF VERSION: 1.15.0
Initializing TF Lite interpreter...
Traceback (most recent call last):
  File "/home/wally/.local/lib/python3.5/site-packages/tensorflow_core/lite/python/interpreter.py", line 165, in load_delegate
    delegate = Delegate(library, options)
  File "/home/wally/.local/lib/python3.5/site-packages/tensorflow_core/lite/python/interpreter.py", line 119, in __init__
    raise ValueError(capture.message)
ValueError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "x.py", line 63, in <module>
```
I think part of the problem is that there are multiple versions of these example scripts floating around with the same name. I had to change --input (as in the web page's figure 1 command) to --image on the command line for this.
If the problem is really something about libedgetpu.so.1.0, then the problem may be "upstream" in the sudo apt-get install libedgetpu1-max step earlier.
My /usr/lib/x86_64-linux-gnu/libedgetpu.so.1.0 has modified date: Monday 16 Sep 2019 03:27:18 PM CDT
@wb666greene lol, I had this issue before and thought it was a tensorflow issue (turned out I didn't have my accelerator plugged in ¯\_(ツ)_/¯ ). Other reasons could be your user not being in the plugdev group, or the USB device not being detected on your system. Try this:
```
$ sudo usermod -aG plugdev [your username]
```
and reboot the system.
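(Another quick check, hedged: the USB Accelerator typically enumerates as "Global Unichip Corp." before the first inference and re-enumerates as "Google Inc." afterwards, so either string appearing in lsusb means the host can see the device.)

```
$ lsusb | grep -iE 'Global Unichip|Google'
$ groups | grep plugdev   # confirm the group change took effect after the reboot
```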
Other edgetpu_api TPU code is running fine on this system, so it's not that. But I've certainly made this mistake multiple times in the past!
I managed to get the broken tflite_runtime removed and now your cut and pasted script runs:
```
tflite/python/examples/classification$ python3 x.py --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels models/inat_bird_labels.txt --image images/parrot.jpg
TF VERSION: 1.15.0
Initializing TF Lite interpreter...
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
12.9ms
3.6ms
3.4ms
3.5ms
3.4ms
-------RESULTS--------
923 Ara macao (Scarlet Macaw): 0.78516
```
But I can't figure out how to change the downloaded tflite/python/examples/classification program to use tensorflow instead of tflite_runtime (the one that needs --input instead of --image).
Thanks for your help.
@wb666greene woot woot! try replacing the classify_image.py from this repo with this: https://gist.github.com/Namburger/20788172fccf1ca0c9e13b7b14d1b70a
@Namburger Thanks, I merged the changes from your gist with the downloaded example from the web-page instructions and it runs fine now.
I never would have figured out replacing tflite.load_delegate() with tf.compat.v2.lite.experimental.load_delegate().
How will I know if/when the tflite_runtime gets fixed for 2.11.2? Or will it be when 2.11.3 comes out?
I have an "if it ain't broke, don't fix it!" mentality, so I'd stayed with 1.92.2 until I tried Posenet, which required 2.11.1, and then the problems started. I got Posenet working fine and then discovered that all my other TPU code was now broken.
Everything is back to working fine now.
@wb666greene my apologies, I understand. It's really hard to keep up with older references to the same library when the documentation continues to change. Our library is still evolving and we're trying to expand the scope of operations that we can support, as mentioned here, which is one of the main reasons why we're coming out with new releases.
We have an internal bug open to fix tflite_runtime, but I cannot commit to a time frame for when that will be done.
And tf.compat is a tensorflow feature that allows you to change tensorflow's behavior to be compatible with different versions. It's worth checking out, or at least being aware of, when working with tensorflow; the doc is here: https://www.tensorflow.org/api_docs/python/tf/compat
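(A tiny illustration, hedged: both API generations stay reachable through tf.compat regardless of the installed major version, which is why the script above can use the tf.compat.v2.lite names even on TF 1.15.)

```python
import tensorflow as tf

# The 2.x-style TFLite interpreter, reachable even on TF 1.15:
Interpreter = tf.compat.v2.lite.Interpreter

# The 1.x-style graph API, reachable even on TF 2.x the same way:
graph = tf.compat.v1.Graph()
```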
Nothing to apologize for. You are trying to support two extremes, experts that have used tensorflow before the "edge" AI co-processors were available and newbies like myself who are jumping in to see what value existing "public" models can add to systems.
My specific interest at the moment is to use "person detection" to push the false notification (alarm) rate towards zero for existing video security camera systems.
Using the TPU and MobilenetSSD-v2 with a detect, zoom in, and re-detect algorithm on an i7-4500U "Mini PC" (<60W), I'm getting one false positive about every 10 million frames. With 15 outdoor cameras and the AI processing ~40 fps (a bit under 3 fps per camera), this is one false notification every two or three days. The main issue is that they tend to come in bursts, which makes the false notifications even more annoying.
My idea was to use Posenet as the verification. Worked great, rejecting every single false positive image I'd collected. Problem is, it greatly increases the false negative rate by rejecting 30-90% of the valid detections depending on camera angle -- high downward looking angles reject the most.
In my case, the Docker container hit a ValueError: Didn't find custom op for name 'edgetpu-custom-op' with version 1 error.
Create the container with the following options:
```
sudo docker run -d -it --privileged -v /dev:/dev -v /etc/udev:/etc/udev
```
In the container, install tensorflow==2.0.0 to resolve the "Didn't find custom op for name 'edgetpu-custom-op' with version 1" error:
```
pip install tensorflow==2.0.0
```
Finished!!! (^____^)
I am trying to run my first test of the USB Accelerator using the classify_image.py script, and I'm getting an error trying to initialize the Interpreter. I believe I have installed everything following this tutorial: https://coral.withgoogle.com/docs/accelerator/get-started/ including the installation of the runtime from https://www.tensorflow.org/lite/guide/python. I chose tflite_runtime-1.14.0-cp35-cp35m-linux_x86_64.whl for my system.
Here are some system details: Ubuntu 16.04.6 LTS 64-bit, python 3.5.2