google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

TPU is no longer available - '/dev/apex_0': No such file or directory #820

Open luxapana opened 8 months ago

luxapana commented 8 months ago

Description

I installed the TPU driver on a fresh Ubuntu installation by following the instructions here: https://coral.ai/docs/m2/get-started/#2a-on-linux

Everything worked as expected, and I then used the TPU with Frigate CCTV for detection tasks with zero issues for a few days. Today, for no apparent reason, Frigate started to complain. I found that the command 'ls /dev/apex_0' returns ls: cannot access '/dev/apex_0': No such file or directory.

The output of a few more commands is given below.

During this time I did not make any changes to the system; I was just observing how Frigate does object detection with the TPU, which worked perfectly. The only thing that may have happened is that the laptop shut down abruptly due to a power failure (it has no battery).

Both gasket-dkms and libedgetpu1-std are already installed. The apex group exists and my user is part of that group.

I would appreciate some help diagnosing this further.

Thank you.

uname -a: Linux cctvserverlap 6.5.0-14-generic #14~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Nov 20 18:15:30 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

lspci -nn | grep 089a: 12:00.0 System peripheral [0880]: Global Unichip Corp. Coral Edge TPU [1ac1:089a]

modinfo gasket: modinfo: ERROR: Module gasket not found.

modinfo apex: modinfo: ERROR: Module apex not found.
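
The checks above can be gathered into one pass. This is only a minimal sketch: the function names (check_device_node, check_module, check_pci, run_diagnostics) are made up for illustration and are not part of any Coral tooling. It assumes the device ID 089a and the module names gasket and apex, as they appear in the output above.

```shell
# Sketch: repeat the diagnostics from this report in one pass.
# All function names here are hypothetical helpers, not Coral tools.

# Report whether a device node exists (defaults to /dev/apex_0).
check_device_node() {
    local node="${1:-/dev/apex_0}"
    if [ -e "$node" ]; then
        echo "present: $node"
    else
        echo "missing: $node"
        return 1
    fi
}

# Report whether modinfo can find a module for the running kernel.
check_module() {
    local mod="$1"
    if modinfo "$mod" >/dev/null 2>&1; then
        echo "module found: $mod"
    else
        echo "module not found: $mod"
        return 1
    fi
}

# Report whether the Edge TPU (PCI device ID 089a) is on the bus.
check_pci() {
    if command -v lspci >/dev/null 2>&1 && lspci -nn | grep -q 089a; then
        echo "PCI device detected"
    else
        echo "PCI device not detected"
        return 1
    fi
}

# Run every check; a failing check does not stop the remaining ones.
run_diagnostics() {
    echo "kernel: $(uname -r)"
    check_pci || true
    check_module gasket || true
    check_module apex || true
    check_device_node /dev/apex_0 || true
}

run_diagnostics
```

A PCI device that is detected while both modules are not found, as in this report, suggests the hardware is fine and the driver is simply not built for the running kernel.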

Issue Type: Bug
Operating System: Ubuntu
Coral Device: Mini PCIe
Other Devices: No response
Programming Language: No response
Relevant Log Output: No response

rbnswartz commented 8 months ago

I've run into the same issue. Maybe a problem with a recent kernel update?

libussa commented 8 months ago

Definitely a kernel version issue. I hit it after upgrading to 6.5.0-14; I went back to 6.2.0-39 and it's working fine.
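
One way to see which installed kernels actually have the driver is to look for a built gasket module under each kernel's module directory. The sketch below assumes the usual /lib/modules/<kernel-version>/ layout; list_module_builds is a hypothetical helper name, and the second argument exists only so the root can be pointed elsewhere.

```shell
# Sketch: for each kernel under a modules root, report whether a given
# module has been built. list_module_builds is a hypothetical helper.
list_module_builds() {
    local mod="$1" root="${2:-/lib/modules}" dir kver
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        kver="$(basename "$dir")"
        # Match gasket.ko as well as compressed variants such as gasket.ko.zst.
        if [ -n "$(find "$dir" -name "${mod}.ko*" 2>/dev/null)" ]; then
            echo "$kver: $mod built"
        else
            echo "$kver: $mod missing"
        fi
    done
}

list_module_builds gasket
```

Running dkms status gives a similar per-kernel view for modules managed by DKMS. If the module is listed for 6.2.0-39 but not for 6.5.0-14, that would match the behaviour described here.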

keptin commented 8 months ago

I'm also in this boat. Ubuntu 22.04 with Frigate NVR ran flawlessly in docker-compose for over a year, and as of about a week ago it suddenly throws errors about the TPU not being found at pcie:0 in the config.

Using the grub menu to boot from 6.2.0-39 did not fix the issue. ls /dev/apex_0 still returns a "No such file or directory" error. lspci -nn | grep 089a shows that the device is detected.

Anything else I can try?

feranick commented 8 months ago

Installing gasket-dkms fails on kernel 6.5, which no longer appears to be supported. The log of apt install gasket-dkms is attached.

log.txt

feranick commented 8 months ago

Basically, the Edge TPU is no longer supported, and its drivers are outdated.

feranick commented 8 months ago

Actually, gasket-dkms is open-source and support for kernel 6.4+ has been added. One can recompile gasket-dkms from the source below.

https://github.com/google/gasket-driver
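
A rough sketch of what such a rebuild might look like follows. It assumes a standard Debian packaging workflow; the package list, the debuild flags and the final modprobe are assumptions, not steps confirmed in this thread, and the exact instructions that worked for people here are in the linked issue. The run and rebuild_gasket helpers are hypothetical, and the script only prints the steps unless DRY_RUN=0 is set.

```shell
# Sketch: rebuild gasket-dkms from source as a .deb and install it.
# Package names and build flags are assumptions; check them for your release.
# By default the steps are only printed. Set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_gasket() {
    # Remove the packaged driver that fails to build on the newer kernel.
    run sudo apt remove -y gasket-dkms &&
    # Build tools (assumed set; some releases ship dh-dkms separately).
    run sudo apt install -y git devscripts debhelper dkms &&
    run git clone https://github.com/google/gasket-driver.git &&
    run cd gasket-driver &&
    # Unsigned, binary-only package build; the .deb lands in the parent dir.
    run debuild -us -uc -tc -b &&
    run cd .. &&
    run sudo dpkg -i gasket-dkms_*_all.deb &&
    # Load the driver; a reboot should have the same effect.
    run sudo modprobe apex
}

rebuild_gasket
```

After a successful install, ls /dev/apex_0 should list the device node again.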

keptin commented 8 months ago

I was in this same boat, with a Coral TPU running Frigate NVR on Ubuntu 22.04.

This thread helped me solve it: https://github.com/google-coral/edgetpu/issues/808

luxapana commented 8 months ago

These instructions worked for me to rebuild the driver: https://github.com/google-coral/edgetpu/issues/808#issuecomment-1909019568