Closed stanleyoz closed 1 year ago
Could you please share the logs for the below commands:
Thanks, super 911 response mate ....
(py38_venv) root@nodeG5:~/py38# python3 -m pip install --extra-index-url https://google-coral.github.io/py-repo/ pycoral~=2.0
Looking in indexes: https://pypi.org/simple, https://google-coral.github.io/py-repo/
Collecting pycoral~=2.0
Downloading https://github.com/google-coral/pycoral/releases/download/v2.0.0/pycoral-2.0.0-cp38-cp38-linux_aarch64.whl (352 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 352.7/352.7 kB 1.9 MB/s eta 0:00:00
Collecting Pillow>=4.0.0 (from pycoral~=2.0)
Downloading Pillow-9.5.0-cp38-cp38-manylinux_2_28_aarch64.whl (3.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 1.7 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.16.0 in ./py38_venv/lib/python3.8/site-packages (from pycoral~=2.0) (1.24.3)
Collecting tflite-runtime==2.5.0.post1 (from pycoral~=2.0)
Downloading https://github.com/google-coral/pycoral/releases/download/v2.0.0/tflite_runtime-2.5.0.post1-cp38-cp38-linux_aarch64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 2.3 MB/s eta 0:00:00
Installing collected packages: tflite-runtime, Pillow, pycoral
Attempting uninstall: pycoral
Found existing installation: pycoral 0.1.0
Uninstalling pycoral-0.1.0:
Successfully uninstalled pycoral-0.1.0
Successfully installed Pillow-9.5.0 pycoral-2.0.0 tflite-runtime-2.5.0.post1
(py38_venv) root@nodeG5:~/py38# python3 -c "import tflite_runtime as tflite; print('tflite runtime version:', tflite.__version__);import pycoral; print('pycoral version:', pycoral.__version__)"
Traceback (most recent call last):
File "
Installing collected packages: tflite-runtime, Pillow, pycoral Attempting uninstall: pycoral Found existing installation: pycoral 0.1.0 Uninstalling pycoral-0.1.0: Successfully uninstalled pycoral-0.1.0 Successfully installed Pillow-9.5.0 pycoral-2.0.0 tflite-runtime-2.5.0.post1
Now you have the correct versions of pycoral and tflite runtime. Is your script working fine now?
python3.10 -c "import tflite_runtime as tflite; print('tflite runtime version:', tflite.__version__);import pycoral; print('pycoral version:', pycoral.__version__)"
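A slightly more defensive variant of that version check, as a sketch: it reads the installed distribution metadata instead of importing the packages, so it still reports something when an import itself fails. The distribution names pycoral and tflite-runtime are taken from the pip log in this thread; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Names as they appear in the pip output ("Installing collected packages: ...").
    for name in ("pycoral", "tflite-runtime"):
        print(name, installed_version(name) or "not installed")
```

Run it with the same interpreter as the test script (for example inside the venv), since each Python environment has its own site-packages.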
Wow! We've got past the pycoral.utils hurdle, BUT we now have
ValueError: Failed to load delegate from libedgetpu.so.1
We did the "Setup Device" part and
root@nodeG5:~# lspci
00:00.0 PCI bridge: Synopsys, Inc. DWC_usb3 / PCIe bridge (rev 01)
01:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU
Please install the Edge TPU runtime.
I believe we did that earlier. Retrying:
ValueError: Failed to load delegate from libedgetpu.so.1
(py38_venv) root@nodeG5:/tf# echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update
deb https://packages.cloud.google.com/apt coral-edgetpu-stable main
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
100 1210 100 1210 0 0 2737 0 --:--:-- --:--:-- --:--:-- 2743
OK
Hit:1 http://security.debian.org/debian-security bullseye-security InRelease
Hit:2 http://deb.debian.org/debian bullseye InRelease
Hit:3 https://packages.microsoft.com/debian/11/prod bullseye InRelease
Hit:4 https://packages.cloud.google.com/apt coral-cloud-stable InRelease
Hit:5 https://packages.cloud.google.com/apt coral-edgetpu-stable InRelease
Hit:6 http://deb.debian.org/debian bullseye-updates InRelease
Hit:7 http://deb.debian.org/debian bullseye-backports InRelease
Hit:8 https://deb.nodesource.com/node_12.x bullseye InRelease
Reading package lists... Done
(py38_venv) root@nodeG5:/tf# sudo apt-get install libedgetpu1-std
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libedgetpu1-std is already the newest version (16.0).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
The test script still reports that the .so is missing :(
(py38_venv) root@nodeG5:/tf# python predict_tpu.py
Traceback (most recent call last):
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 160, in load_delegate
delegate = Delegate(library, options)
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 119, in __init__
raise ValueError(capture.message)
ValueError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "predict_tpu.py", line 14, in <module>
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 162, in load_delegate
raise ValueError('Failed to load delegate from {}\n{}'.format(
ValueError: Failed to load delegate from libedgetpu.so.1
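One way to separate "the library cannot be found" from "the library loads but no device opens" is to ask the dynamic loader directly. This is only a sketch; the helper name can_dlopen is hypothetical, and a True result says nothing about whether an Edge TPU is actually reachable.

```python
import ctypes


def can_dlopen(library="libedgetpu.so.1"):
    """Return True if the dynamic loader can find and load the shared library."""
    try:
        ctypes.CDLL(library)
    except OSError:
        # Raised when the file is missing or cannot be loaded on this platform.
        return False
    return True


if __name__ == "__main__":
    print("libedgetpu.so.1 loadable:", can_dlopen())
```

If this prints True while load_delegate still raises ValueError, the problem is more likely on the driver or device side than a missing file.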
Please add the lines below to the import section of your test script and share the logs:
from pycoral.pybind._pywrap_coral import SetVerbosity as set_verbosity
set_verbosity(10)
Traceback (most recent call last):
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 160, in load_delegate
delegate = Delegate(library, options)
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 119, in __init__
raise ValueError(capture.message)
ValueError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "predict_tpu.py", line 14, in <module>
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 162, in load_delegate
raise ValueError('Failed to load delegate from {}\n{}'.format(
ValueError: Failed to load delegate from libedgetpu.so.1
(py38_venv) root@nodeG5:/tf# nano predict_tpu.py
(py38_venv) root@nodeG5:/tf# nano predict_tpu.py
(py38_venv) root@nodeG5:/tf# python predict_tpu.py
I tflite/edgetpu_manager_direct.cc:453] No matching device is already opened for shared ownership.
I driver/driver_factory_default.cc:31] Failed to open /sys/class/apex: No such file or directory
I driver/usb/local_usb_device.cc:944] EnumerateDevices: vendor:0x1a6e, product:0x89a
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[4] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[3]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[1]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[2] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[1] port[0]
I driver/usb/local_usb_device.cc:944] EnumerateDevices: vendor:0x18d1, product:0x9302
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[4] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[3]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[1]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[2] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[1] port[0]
I tflite/edgetpu_manager_direct.cc:471] No device of type Apex (PCIe) is available.
I tflite/edgetpu_manager_direct.cc:471] No device of type Apex (USB) is available.
I tflite/edgetpu_manager_direct.cc:471] No device of type Apex (Reference) is available.
I tflite/edgetpu_manager_direct.cc:502] Failed allocating Edge TPU device for shared ownership.
Traceback (most recent call last):
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 160, in load_delegate
delegate = Delegate(library, options)
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 119, in __init__
raise ValueError(capture.message)
ValueError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "predict_tpu.py", line 17, in <module>
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 162, in load_delegate
raise ValueError('Failed to load delegate from {}\n{}'.format(
ValueError: Failed to load delegate from libedgetpu.so.1
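For anyone reading a verbose log like this one, a small sketch that pulls out the two signals that matter here: which device types the runtime reports as unavailable, and whether the apex sysfs class is missing. The message wording is copied from the log in this thread and may differ in other libedgetpu versions; the function name summarize_edgetpu_log is hypothetical.

```python
import re

# Pattern copied from the "No device of type Apex (...) is available." lines above.
_UNAVAILABLE = re.compile(r"No device of type Apex \((\w+)\) is available")


def summarize_edgetpu_log(log_text):
    """Summarize a verbose Edge TPU runtime log captured as plain text."""
    return {
        "unavailable": _UNAVAILABLE.findall(log_text),
        "apex_sysfs_missing": "Failed to open /sys/class/apex" in log_text,
    }


if __name__ == "__main__":
    sample = (
        "I driver/driver_factory_default.cc:31] Failed to open /sys/class/apex: "
        "No such file or directory\n"
        "I tflite/edgetpu_manager_direct.cc:471] No device of type Apex (PCIe) is available.\n"
    )
    print(summarize_edgetpu_log(sample))
```

A missing /sys/class/apex together with "PCIe" in the unavailable list points at the kernel driver rather than at the Python packages.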
Could you please share your host machine details? Have you installed the gasket-dkms package?
Please share the gasket-dkms installation logs and the output of the commands below:
sudo dmesg |grep apex
sudo lspci -vvv | grep MSI-X
sudo lspci -vvv
I did run all the installations:
root@nodeG5:~# sudo apt-get install gasket-dkms libedgetpu1-std
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
gasket-dkms is already the newest version (1.0-18).
libedgetpu1-std is already the newest version (16.0).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Logs ...
root@nodeG5:~# sudo dmesg |grep apex
root@nodeG5:~# sudo lspci -vvv | grep MSI-X
Capabilities: [d0] MSI-X: Enable- Count=128 Masked-
root@nodeG5:~# sudo lspci -vvv
00:00.0 PCI bridge: Synopsys, Inc. DWC_usb3 / PCIe bridge (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
01:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU (prog-if ff)
Subsystem: Global Unichip Corp. Coral Edge TPU
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
It's an IMX8 single-board computer running Debian 11; the venv is Python 3.8.12 and the base is 3.9.x.
root@nodeG5:~# sudo dmesg |grep apex
Issue 1: the apex driver is not loading.
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [d0] MSI-X: Enable- Count=128 Masked-
Issue 2: MSI-X is not enabled for your hardware. See https://www.kernel.org/doc/html/latest/PCI/msi-howto.html#:~:text=Using%20'lspci%20%2Dv'%20(,%E2%80%9C%2D%E2%80%9D%20(disabled).
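To check the MSI-X state per device without grepping by eye, here is a minimal sketch that parses saved `lspci -vvv` text. It assumes the short `bus:dev.fn` address form shown in this thread (no PCI domain prefix) and that capability lines are indented under their device header, as in real lspci output; msix_states is a hypothetical helper name.

```python
import re

# Device header lines start at column 0, e.g. "01:00.0 System peripheral: ...".
_DEVICE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s+(.*)$")
# "Enable+" means MSI-X is in use, "Enable-" means it is not.
_MSIX = re.compile(r"MSI-X: Enable([+-])")


def msix_states(lspci_text):
    """Map each PCI address that advertises MSI-X to True (enabled) or False."""
    states = {}
    current = None
    for line in lspci_text.splitlines():
        header = _DEVICE.match(line)
        if header:
            current = header.group(1)
            continue
        found = _MSIX.search(line)
        if found and current is not None:
            states[current] = found.group(1) == "+"
    return states


if __name__ == "__main__":
    sample = (
        "01:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU (prog-if ff)\n"
        "\tCapabilities: [d0] MSI-X: Enable- Count=128 Masked-\n"
    )
    print(msix_states(sample))
```

Note that MSI-X is normally switched on by the driver that binds to the device, so Enable- on a device with no driver loaded is expected rather than a separate fault.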
Thanks mate. Before I go try hunt down the issue (1) and (2), as they look pretty hard to replicate in production boards ...
Is this script ok?
(py38_venv) root@nodeG5:/tf# cat predict_tpu.py
import time
import pycoral as pycoral
from pycoral.utils import edgetpu

import tflite_runtime.interpreter as tflite
from tensorflow import keras
from pycoral.pybind._pywrap_coral import SetVerbosity as set_verbosity
set_verbosity(10)

interpreter = tflite.Interpreter(model_path='/tf/inspect.tflite', experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
....
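A sketch of how the delegate loading could be wrapped so that a failure produces a readable message instead of a chained traceback. The loader is passed in as an argument so the helper can be exercised without tflite_runtime installed; try_load_delegate is a hypothetical name, and the only behaviour assumed from the library is what the tracebacks in this thread show, namely that load_delegate raises ValueError on failure.

```python
def try_load_delegate(load_delegate, library="libedgetpu.so.1"):
    """Return (delegate, None) on success or (None, error_message) on failure."""
    try:
        return load_delegate(library), None
    except ValueError as exc:
        return None, str(exc)


# Typical use (assumes tflite_runtime is installed and imported as in the script):
#   import tflite_runtime.interpreter as tflite
#   delegate, err = try_load_delegate(tflite.load_delegate)
#   if delegate is None:
#       raise SystemExit("Edge TPU delegate not available: " + err)
#   interpreter = tflite.Interpreter(model_path='/tf/inspect.tflite',
#                                    experimental_delegates=[delegate])
```

Stopping with a clear message is usually safer than silently falling back to the CPU, because a model compiled for the Edge TPU is not expected to run without the delegate.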
Output was ...
(py38_venv) root@nodeG5:/tf# python predict_tpu.py
I tflite/edgetpu_manager_direct.cc:453] No matching device is already opened for shared ownership.
I driver/driver_factory_default.cc:31] Failed to open /sys/class/apex: No such file or directory
I driver/usb/local_usb_device.cc:944] EnumerateDevices: vendor:0x1a6e, product:0x89a
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[4] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[3]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[1]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[2] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[1] port[0]
I driver/usb/local_usb_device.cc:944] EnumerateDevices: vendor:0x18d1, product:0x9302
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[4] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[3]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[1]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[3] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[2] port[0]
I driver/usb/local_usb_device.cc:979] EnumerateDevices: checking bus[1] port[0]
I tflite/edgetpu_manager_direct.cc:471] No device of type Apex (PCIe) is available.
I tflite/edgetpu_manager_direct.cc:471] No device of type Apex (USB) is available.
I tflite/edgetpu_manager_direct.cc:471] No device of type Apex (Reference) is available.
I tflite/edgetpu_manager_direct.cc:502] Failed allocating Edge TPU device for shared ownership.
Traceback (most recent call last):
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 160, in load_delegate
delegate = Delegate(library, options)
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 119, in __init__
raise ValueError(capture.message)
ValueError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "predict_tpu.py", line 17, in <module>
File "/root/py38/py38_venv/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 162, in load_delegate
raise ValueError('Failed to load delegate from {}\n{}'.format(
ValueError: Failed to load delegate from libedgetpu.so.1
Also I missed these logs you required last night
(py38_venv) root@nodeG5:/tf# sudo dmesg |grep apex
(py38_venv) root@nodeG5:/tf#
(py38_venv) root@nodeG5:/tf# sudo lspci -vvv | grep MSI-X
Capabilities: [d0] MSI-X: Enable- Count=128 Masked-
(py38_venv) root@nodeG5:/tf#
(py38_venv) root@nodeG5:/tf# sudo lspci -vvv
00:00.0 PCI bridge: Synopsys, Inc. DWC_usb3 / PCIe bridge (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
01:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU (prog-if ff)
Subsystem: Global Unichip Corp. Coral Edge TPU
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
Also
Linux version 5.15.32+g613bd464a2ed (oe-user@oe-host) (aarch64-poky-linux-gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.38.20220313) #1 SMP PREEMPT Tue Jun 7 02:34:46 UTC 2022
Debian 11 Python 3.8.12 (inside venv)
Thanks
Please check the sample script for running inference at: https://github.com/hjonnala/snippets/blob/main/coral_inference.py
Before I go try hunt down the issue (1) and (2), as they look pretty hard to replicate in production boards ...
Are you having this issue only with some boards?
For the apex driver loading issue, please check whether secure boot is enabled. If it is, please try disabling it. If apex has loaded properly, you should see output similar to that shown at step 6: https://coral.ai/docs/m2/get-started/#2a-on-linux
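As a sketch, the same "is apex loaded?" check can be scripted by looking for the paths the driver is expected to create. /sys/class/apex is the path named in the runtime log in this thread; /dev/apex_0 is an assumption based on the device name used in the Coral guide linked above. The helper name apex_status is hypothetical.

```python
import os


def apex_status(sysfs_class="/sys/class/apex", dev_node="/dev/apex_0"):
    """Report whether the paths expected from a loaded apex driver exist."""
    return {
        "sysfs_class": os.path.exists(sysfs_class),
        "device_node": os.path.exists(dev_node),
    }


if __name__ == "__main__":
    status = apex_status()
    print(status)
    if not all(status.values()):
        print("apex does not appear to be loaded; check dmesg and the gasket-dkms build")
```

Both entries being False matches the empty `dmesg | grep apex` output reported earlier.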
We are still stuck at MSI-X being disabled, and therefore the apex_0 driver cannot be installed. We are waiting for an answer from either NXP or our board OEM in Israel.
Seems it's holiday week in Israel :( .. Anyway, I discovered something: when an Intel WIFI/BT card was installed in the same M.2 PCIe connector, its MSI-X was enabled. I hope it's not too tricky for our OEM firmware engineer to sort this out for us.
So, this MSI-X is likely the problem blocking the installation of the apex_0 driver?
00:00.0 PCI bridge: Synopsys, Inc. DWC_usb3 / PCIe bridge (rev 01) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 236
Memory at 18000000 (32-bit, non-prefetchable) [size=1M]
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: [disabled]
Memory behind bridge: 18100000-181fffff [size=1M]
Prefetchable memory behind bridge: [disabled]
Expansion ROM at 18200000 [virtual] [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit-
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Secondary PCI Express
Capabilities: [158] L1 PM Substates
Kernel driver in use: pcieport
With Intel AX210 WIFI/BT card
01:00.0 Network controller: Intel Corporation Device 2725 (rev 1a)
Subsystem: Intel Corporation Device 0024
Flags: bus master, fast devsel, latency 0, IRQ 235
Memory at 18100000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [c8] Power Management version 3
Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [40] Express Endpoint, MSI 00
Capabilities: [80] MSI-X: Enable+ Count=16 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [14c] Latency Tolerance Reporting
Capabilities: [154] L1 PM Substates
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi
root@nodeG5:~# lspci -v
00:00.0 PCI bridge: Synopsys, Inc. DWC_usb3 / PCIe bridge (rev 01) (prog-if 00 [Normal decode])
Flags: fast devsel, IRQ 236
Memory at 18000000 (32-bit, non-prefetchable) [disabled] [size=1M]
Bus: primary=00, secondary=00, subordinate=00, sec-latency=0
I/O behind bridge: 00000000-00000fff [size=4K]
Memory behind bridge: 00000000-000fffff [size=1M]
Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
Expansion ROM at 18200000 [virtual] [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit-
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Secondary PCI Express
Capabilities: [158] L1 PM Substates
Kernel driver in use: pcieport
with EDGE TPU Card ...
01:00.0 Network controller: Intel Corporation Device 2725 (prog-if ff)
Subsystem: Global Unichip Corp. Device 089a
Flags: fast devsel, IRQ 235
Memory at 18100000 (64-bit, non-prefetchable) [virtual] [size=16K]
Memory at
So, this MSI-X is likely the problem blocking the installation of the apex_0 driver?
I think apex is already installed but is not loading. Can you please share the output of the commands below:
- groups $USER
- sudo modinfo apex
root@nodeG5:~# groups $USER
root : root apex
root@nodeG5:~# modinfo apex
modinfo: ERROR: Module apex not found.
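The group check can also be done from Python, which is handy when the inference script runs under a service account. This is a sketch using the standard grp and pwd modules (Unix only); the group name apex comes from the `groups` output above, and user_in_group is a hypothetical helper.

```python
import grp
import pwd


def user_in_group(user, group):
    """Return True if the user belongs to the group, as a member or via primary GID."""
    try:
        entry = grp.getgrnam(group)
    except KeyError:
        return False  # the group does not exist on this system
    if user in entry.gr_mem:
        return True
    try:
        return pwd.getpwnam(user).pw_gid == entry.gr_gid
    except KeyError:
        return False  # the user does not exist on this system


if __name__ == "__main__":
    import getpass
    me = getpass.getuser()
    print(me, "in apex group:", user_in_group(me, "apex"))
```

Group membership only matters once the driver is loaded; with `modinfo apex` reporting "not found", the module itself is still the first thing to fix.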
Please uninstall the gasket-dkms package, reinstall it, and share the installation logs.
root@nodeG5:~# apt install gasket-dkms
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
python3-tflite-runtime
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
gasket-dkms
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/48.0 kB of archives.
After this operation, 256 kB of additional disk space will be used.
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_AU.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
Selecting previously unselected package gasket-dkms.
(Reading database ... 81362 files and directories currently installed.)
Preparing to unpack .../gasket-dkms_1.0-18_all.deb ...
Unpacking gasket-dkms (1.0-18) ...
Setting up gasket-dkms (1.0-18) ...
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
Loading new gasket-1.0 DKMS files...
It is likely that 5.15.32+g613bd464a2ed belongs to a chroot's host
Building for 5.10.0-21-arm64 and 5.15.32+g613bd464a2ed
Building initial module for 5.10.0-21-arm64
Done.
gasket.ko: Running module version sanity check. Error! Module version 1.1.4 for gasket.ko is not newer than what is already found in kernel 5.10.0-21-arm64 (1.2). You may override by specifying --force.
apex.ko: Running module version sanity check.
depmod...
DKMS: install completed. Module build for kernel 5.15.32+g613bd464a2ed was skipped since the kernel headers for this kernel does not seem to be installed.
Running module version sanity check. Error! Module version 1.1.4 for gasket.ko is not newer than what is already found in kernel 5.10.0-21-arm64 (1.2).
It's not installed properly. Please remove it and try sudo apt-get install gasket-dkms. The logs look like this: https://github.com/google-coral/edgetpu/issues/723#issuecomment-1428861577
Seems the same ..
root@nodeG5:~# apt-get install gasket-dkms
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
python3-tflite-runtime
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
gasket-dkms
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/48.0 kB of archives.
After this operation, 256 kB of additional disk space will be used.
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_AU.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
Selecting previously unselected package gasket-dkms.
(Reading database ... 81362 files and directories currently installed.)
Preparing to unpack .../gasket-dkms_1.0-18_all.deb ...
Unpacking gasket-dkms (1.0-18) ...
Setting up gasket-dkms (1.0-18) ...
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
Loading new gasket-1.0 DKMS files...
It is likely that 5.15.32+g613bd464a2ed belongs to a chroot's host
Building for 5.10.0-21-arm64 and 5.15.32+g613bd464a2ed
Building initial module for 5.10.0-21-arm64
Done.
gasket.ko: Running module version sanity check. Error! Module version 1.1.4 for gasket.ko is not newer than what is already found in kernel 5.10.0-21-arm64 (1.2). You may override by specifying --force.
apex.ko: Running module version sanity check.
depmod...
DKMS: install completed. Module build for kernel 5.15.32+g613bd464a2ed was skipped since the kernel headers for this kernel does not seem to be installed. root@nodeG5:~#
In any case, I have asked the firmware engineers at our OEM Compulab in Israel to have a look (esp. at the MSI-X) and they are waiting for the M.2 Edge TPU to arrive late next week. On my side, my MSI-X is always Enable- (not enabled).
Module build for kernel 5.15.32+g613bd464a2ed was skipped since the kernel headers for this kernel does not seem to be installed.
Please install the kernel headers, then uninstall gasket-dkms and install it again.
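Before reinstalling, it may help to confirm whether headers for the running kernel are present at all. The sketch below checks the conventional /lib/modules/<release>/build location that out-of-tree module builds normally use; that path convention is an assumption about this board's layout, and headers_present is a hypothetical helper name.

```python
import os
import platform


def headers_present(release=None, modules_root="/lib/modules"):
    """Return True if a build directory exists for the given kernel release."""
    if release is None:
        release = platform.release()  # same string as `uname -r`
    return os.path.isdir(os.path.join(modules_root, release, "build"))


if __name__ == "__main__":
    print("running kernel:", platform.release())
    print("headers present:", headers_present())
```

In the DKMS log above the module was built for 5.10.0-21-arm64 but skipped for the running 5.15.32+g613bd464a2ed kernel, which is the case this check is meant to catch.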
I think that because the kernel is a custom build by Compulab, its headers are not in the repository.
root@nodeG5:~# sudo apt update
Hit:1 http://security.debian.org/debian-security bullseye-security InRelease
Get:2 https://packages.microsoft.com/debian/11/prod bullseye InRelease [3629 B]
Hit:3 http://deb.debian.org/debian bullseye InRelease
Hit:4 https://packages.cloud.google.com/apt coral-cloud-stable InRelease
Get:5 https://packages.microsoft.com/debian/11/prod bullseye/main all Packages [1028 B]
Hit:6 https://packages.cloud.google.com/apt coral-edgetpu-stable InRelease
Get:7 https://packages.microsoft.com/debian/11/prod bullseye/main amd64 Packages [83.3 kB]
Get:8 http://deb.debian.org/debian bullseye-updates InRelease [44.1 kB]
Get:9 http://deb.debian.org/debian bullseye-backports InRelease [49.0 kB]
Get:10 https://packages.microsoft.com/debian/11/prod bullseye/main arm64 Packages [14.6 kB]
Get:11 https://packages.microsoft.com/debian/11/prod bullseye/main armhf Packages [13.4 kB]
Hit:12 https://deb.nodesource.com/node_12.x bullseye InRelease
Fetched 209 kB in 3s (81.5 kB/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
All packages are up to date.
root@nodeG5:~# sudo apt install linux-headers-$(uname -r)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package linux-headers-5.15.32+g613bd464a2ed
E: Couldn't find any package by glob 'linux-headers
OK, please try building the package from source and installing it: https://github.com/google/gasket-driver
Hi mate! Thanks to the hard work of Team Benjamin at Compulab in modding the kernel of our IMX8PLUS board firmware, we managed to install the required drivers! Thanks for your help in identifying the MSI-X issue :)
root@iot-gate-imx8plus:/coral/pycoral# python3 examples/classify_image.py \
--model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
--labels test_data/inat_bird_labels.txt \
--input test_data/parrot.jpg
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
14.0ms
3.7ms
3.9ms
4.0ms
4.0ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781
Now we carry on to port our work files over to our nodeG5 gateway and push ahead to realise the customer's requirements
Hi. OK, the parrot example works, but when we mod our TensorFlow Lite script as instructed, we get ..
interpreter = tflite.Interpreter(model_path="mnist.tflite", experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
Traceback (most recent call last):
File "/home/compulab/tf/gpt_tflite_MNIST_TPU.py", line 9, in <module>
compulab@iot-gate-imx8plus:~/tf$ sudo find / -name "libedgetpu.so.1"
[rw,errors=remount-ro]
/usr/lib/aarch64-linux-gnu/libedgetpu.so.1
/coral/pycoral/libedgetpu_bin/throttled/aarch64/libedgetpu.so.1
/coral/pycoral/libedgetpu_bin/throttled/armv7a/libedgetpu.so.1
/coral/pycoral/libedgetpu_bin/throttled/k8/libedgetpu.so.1
/coral/pycoral/libedgetpu_bin/direct/aarch64/libedgetpu.so.1
/coral/pycoral/libedgetpu_bin/direct/armv7a/libedgetpu.so.1
/coral/pycoral/libedgetpu_bin/direct/k8/libedgetpu.so.1
OK, when I ran the script as root, it worked, but for my simple single-digit PNG file inference against the model trained on the MNIST database, the TPU invoke() took longer than the TensorFlow CPU (IMX8) process, e.g. 2.4 msec (TPU) vs. 0.97 msec (CPU). Keen to investigate? We wanted a simple benchmark test to show that using the TPU accelerator "option" of our gateway on a popular demo will convince some developers to try the TPU version. I attach a link to the files if you have time to check why the TPU is slower at inference. https://drive.google.com/drive/folders/1nAgx5kbogx4Li-fQBCbo8x2rtnwQFg7P?usp=sharing Thanks again mate.
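For a fairer TPU vs. CPU comparison, here is a minimal timing sketch that discards warm-up calls before measuring, since the classify_image.py output earlier in this thread notes that the first inference on the Edge TPU includes loading the model. The helper name time_calls is hypothetical; it times any zero-argument callable, for example an interpreter's invoke method.

```python
import time


def time_calls(fn, runs=5, warmup=1):
    """Call fn warmup times untimed, then return per-call durations in milliseconds."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


if __name__ == "__main__":
    # Stand-in workload; replace with interpreter.invoke on the target board.
    samples = time_calls(lambda: sum(range(10000)), runs=5, warmup=1)
    print("mean ms:", sum(samples) / len(samples))
```

Timing only invoke() for both interpreters, with identical input preparation outside the timed region, keeps the comparison like for like.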
Please create a new issue for this if you need any further help, as it is not relevant to this thread. The issue is that none of the operations are running on the TPU. Please go through the links below and fix the tflite conversion issue. Thanks!!
Description
We have a tested, working industrial safety monitoring solution on an RPi and now need to port our TFLite model and inference solution to an industrial IMX8 gateway. After chasing the various 'hints' on Stack Overflow etc., we are now stuck at a PyCoral library error ...
(py38_venv) root@nodeG5:/tf# python predict_tpu.py
Traceback (most recent call last):
File "predict_tpu.py", line 4, in <module>
from pycoral.utils import edgetpu
ModuleNotFoundError: No module named 'pycoral.utils'
Python 3.8.12 GCC 10.2.1 Debian 11 on I.MX8
We have also tried ...
sudo apt-get update
sudo apt-get install python3-pycoral
This is a potential large scale installation (if we can order enough EDGE TPUs! hhh) but we are really short of time to get this error ironed out and install a few units onsite (Lithium mine site). Please advise team TPU.
### Issue Type
Build/Install
### Operating System
Linux
### Coral Device
_No response_
### Other Devices
_No response_
### Programming Language
Python 3.8
### Relevant Log Output
_No response_