aeon0 closed this issue 3 years ago
@j-o-d-o The Dev Board Mini interfaces with the TPU over USB 2, which will be much slower than USB 3, which is the port on the Dev Board, so this is sort of expected. How are you printing out the TPU type on the Coral Dev Board? It should look like this:
>>> from pycoral.utils.edgetpu import list_edge_tpus
>>> list_edge_tpus()
[{'type': 'pci', 'path': '/dev/apex_0'}]
I am using the C++ interface to print it. But I ran your python code on the devboard mini and get this:
>>> from pycoral.utils.edgetpu import list_edge_tpus
>>> list_edge_tpus()
[{'type': 'usb', 'path': '/sys/bus/usb/devices/2-1'}]
I am a bit confused why the integrated TPU on the dev board would show up as "usb".
@j-o-d-o yes, that's expected for the dev board mini, at some point you asked about the "Coral Dev Board" which got me a little confused :)
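To make the difference between the two outputs quoted above easy to check in a script, here is a minimal Python sketch. It assumes only the list-of-dicts shape shown in this thread (`type` and `path` keys); the helper names `tpu_interfaces` and `uses_usb` are hypothetical and not part of pycoral. On a real board you would pass it the result of `list_edge_tpus()` instead of the sample lists.

```python
def tpu_interfaces(devices):
    """Map each device path to its reported interface type ('pci' or 'usb')."""
    return {d["path"]: d["type"] for d in devices}


def uses_usb(devices):
    """Return True if any listed Edge TPU is attached over USB."""
    return any(d["type"] == "usb" for d in devices)


# Sample values copied from the two outputs quoted in this thread.
dev_board = [{"type": "pci", "path": "/dev/apex_0"}]
dev_board_mini = [{"type": "usb", "path": "/sys/bus/usb/devices/2-1"}]

for name, devices in (("Dev Board", dev_board), ("Dev Board Mini", dev_board_mini)):
    print(name, tpu_interfaces(devices), "USB-attached:", uses_usb(devices))
```

A check like this could be logged at startup so that a USB-attached TPU is noticed before any timing is compared.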
Ah ok. Sorry, I didn't know there was a difference in terms of TPU interface. Yes, the issue is with the DevBoard Mini. Can USB-2 vs USB-3 really account for such a big runtime difference (around 25ms vs 200ms)?
@j-o-d-o yes it could, unfortunately, and as a matter of fact, if you have a usb-c -> usb3 dongle and attach a usb accelerator on the dev board mini, I expect that the external tpu would be faster than the internal tpu :/
Hmm, that is unfortunate and not really what I expected. But alright. I think the next step will be to try to get the USB Accelerator connected to the DevBoard Mini and test it. Just to make sure I am not missing something else. Closing this for now. Thanks for the help.
@Namburger I finally managed to get the USB Accelerator connected to the DevBoard Mini and I see the same "bad" performance with the USB Accelerator as well. Not sure if I should be happy or sad about it :D I think I'll go with happy because it means I can fix this somehow. Sorry for the false alarm before properly testing it...
@j-o-d-o so sorry, I gave you incorrect info :( The usb-c on the mini also only supports usb2: https://coral.ai/products/dev-board-mini/#tech-specs
@Namburger It really seems like the usb2 is the bottleneck here. I tested with a model:
Input: 320x128x3
Conv2D layer with 1x1 kernel and 16 channels
Output: 320x128x16
Runtime with USB3 and Stick: ~8ms
Runtime on DevBoard Mini: ~115ms
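As a rough sanity check on the bus argument, the sketch below estimates the minimum time needed just to move the test model's input and output tensors over each bus. It is back-of-the-envelope arithmetic under stated assumptions, not a measurement: it assumes one byte per element (an int8-quantized model), uses the nominal signaling rates of USB 2.0 High Speed (480 Mbit/s) and USB 3.0 SuperSpeed (5 Gbit/s), and ignores protocol overhead, latency per transfer, and any model parameter traffic. The function names are made up for illustration.

```python
import math

USB2_BITS_PER_S = 480e6  # USB 2.0 High Speed, nominal signaling rate
USB3_BITS_PER_S = 5e9    # USB 3.0 SuperSpeed, nominal signaling rate


def tensor_bytes(shape, bytes_per_element=1):
    """Size of a dense tensor in bytes (assumption: 1 byte per element)."""
    return math.prod(shape) * bytes_per_element


def min_transfer_ms(num_bytes, bits_per_s):
    """Lower bound on transfer time in milliseconds at the nominal bus rate."""
    return num_bytes * 8 / bits_per_s * 1e3


# Tensor shapes taken from the test model described above.
in_bytes = tensor_bytes((320, 128, 3))
out_bytes = tensor_bytes((320, 128, 16))
total_bytes = in_bytes + out_bytes

usb2_ms = min_transfer_ms(total_bytes, USB2_BITS_PER_S)
usb3_ms = min_transfer_ms(total_bytes, USB3_BITS_PER_S)

print(f"bytes per inference: {total_bytes}")
print(f"USB 2 lower bound:   {usb2_ms:.2f} ms")
print(f"USB 3 lower bound:   {usb3_ms:.2f} ms")
```

Both measured runtimes are well above these lower bounds, so the estimate does not explain the absolute numbers. It only shows that the nominal bus rates differ by roughly a factor of ten, which is in the same range as the measured ~8ms vs ~115ms gap.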
Bummer. Did not expect that, but ok, will have to work with that for now. I am looking forward to your next-gen products. Hope they will hit the shelves in 2021. Any heads up appreciated :)
And thanks for the good and quick support here in the github issues!
It has been a while. But I can now finally say for sure: yes, it is the USB 2 vs USB 3 issue. I am now using a "normal" Coral DevBoard and it has the same (slightly faster, actually) speed as the Coral USB stick. So, lesson learned: for real-time image processing tasks the Mini is probably the wrong piece of hardware; rather go with the regular DevBoard.
According to the datasheets, both should have the Edge TPU available. But when I run my code with the USB Accelerator, inference is around 5x faster than on the DevBoard Mini. I measured only the call
interpreter->Invoke();
to evaluate the runtime, and the Edge TPU is found in both cases. It is the same code and the same model. The only difference is the edgetpu lib file that is used:
k8/libedgetpu.so
for the USB Accelerator and
aarch64/libedgetpu.so
on the DevBoard. I am using the latest edgetpu lib and the required tflite build. Any idea what could cause this huge performance difference?
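For comparing the two setups, it can help to time the invoke call the same way on both, with a few warm-up runs discarded, since the first inference on an Edge TPU is typically slower than later ones. The following is a minimal Python sketch of such a timing helper, not the code used in this issue; `time_call_ms` is a hypothetical name, and the dummy workload stands in for the real call (with the Python TFLite interpreter that would be `interpreter.invoke`).

```python
import statistics
import time


def time_call_ms(fn, warmup=3, runs=20):
    """Call fn repeatedly and return timing statistics in milliseconds.

    The first `warmup` calls are executed but not timed.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


def dummy_workload():
    # Placeholder for the real inference call (assumption: interpreter.invoke).
    sum(range(10_000))


stats = time_call_ms(dummy_workload)
print(stats)
```

Reporting the median alongside min and max makes it easier to tell a consistently slow bus from occasional outliers.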
When printing the TPU type I get this for the Coral Dev Board:
Is it normal that it also shows up as a USB device on the dev board?
Also here is the inference code: https://github.com/j-o-d-o/SomeSense-App/blob/master/components/algo/inference/inference.cpp
The log shows that everything should be mapped to the Edge TPU, so that should also not make any difference:
model_quant_edgetpu.log
model_quant_edgetpu.tflite.zip