google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

Error in device opening /dev/apex_0 on kernel 6.6.28 #846

Open luchmedia opened 6 months ago

luchmedia commented 6 months ago

Description

I upgraded my Raspberry Pi 5 to kernel Linux raspberrypi 6.6.28+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.28-1+rpt1 (2024-04-22) aarch64 GNU/Linux. Launching an edgetpu example in Docker that accesses the /dev/apex_0 device now fails. I built gasket-dkms as described in #808; the same setup worked fine with the 6.5 kernel.

Run the container:

sudo docker run -it --device /dev/apex_0:/dev/apex_0 coral /bin/bash

Run the test in the container:

python3 /usr/share/edgetpu/examples/classify_image.py --model /usr/share/edgetpu/examples/models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --label /usr/share/edgetpu/examples/models/inat_bird_labels.txt --image /usr/share/edgetpu/examples/images/bird.bmp

Crash:

Traceback (most recent call last):
  File "/usr/share/edgetpu/examples/classify_image.py", line 54, in <module>
    main()
  File "/usr/share/edgetpu/examples/classify_image.py", line 44, in main
    engine = ClassificationEngine(args.model)
  File "/usr/lib/python3/dist-packages/edgetpu/classification/engine.py", line 48, in __init__
    super().__init__(model_path)
  File "/usr/lib/python3/dist-packages/edgetpu/basic/basic_engine.py", line 92, in __init__
    self._engine = BasicEnginePythonWrapper.CreateFromFile(model_path)
RuntimeError: Error in device opening (/dev/apex_0)!

I also tested with Frigate NVR, and /dev/apex_0 can't be reached there either, even though the device is mapped.

version: "3"
services:
  frigate:
    container_name: frigate
    privileged: true
    restart: unless-stopped
    image: ghcr.io/blakeblackshear/frigate:stable
    shm_size: "256mb"
    devices:
      - /dev/apex_0:/dev/apex_0
      - /dev/video19:/dev/video19
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /home/pi/frigate-nvr/config.yml:/config/config.yml
      - /media/Hdd/recordings:/media/frigate
      - type: tmpfs
        target: /tmp/cache
        tmpfs:
          size: 1500000000
    ports:
      - "5000:5000"
      - "8554:8554"
      - "8555:8555/tcp"
      - "8555:8555/udp"

frigate.detectors.plugins.edgetpu_tfl ERROR : No EdgeTPU was detected

As you can see, /dev/apex_0 is available and visible to the pi user:

lsmod | grep apex

apex 20480 0
gasket 102400 1 apex
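
The module check above can also be scripted. The following is a minimal sketch, not part of the original report: `parse_lsmod` is a hypothetical helper that parses lsmod-style text and assumes the usual three-column layout (name, size, use count followed by an optional comma-separated list of users).

```python
def parse_lsmod(text):
    """Map each module name to the list of modules that use it.

    Expects lsmod-style lines such as "gasket 102400 1 apex".
    """
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header row.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        users = []
        if len(fields) > 3:
            users = [u for u in fields[3].split(",") if u]
        modules[fields[0]] = users
    return modules


if __name__ == "__main__":
    sample = "apex 20480 0\ngasket 102400 1 apex"
    mods = parse_lsmod(sample)
    # Both driver modules should be loaded, with gasket in use by apex.
    print("apex" in mods and "apex" in mods.get("gasket", []))
```

A result showing both modules loaded only confirms the driver is present; as this issue shows, the device can still fail to open.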

lspci -nn | grep 089a

0000:01:00.0 System peripheral [0880]: Global Unichip Corp. Coral Edge TPU [1ac1:089a]

ls /dev/apex_0

/dev/apex_0
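
Note that `ls` only shows that the device node exists, not that it can be opened. A minimal sketch for separating the two cases follows; `probe_device` is a hypothetical helper, and opening the node read/write is an assumption about what a runtime would need, not a description of what the Edge TPU library actually does.

```python
import errno
import os


def probe_device(path):
    """Return 'ok', 'missing', 'permission-denied', or 'error: <ERRNO>'."""
    try:
        fd = os.open(path, os.O_RDWR)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "permission-denied"
    except OSError as exc:
        # Any other failure: report the symbolic errno name when known.
        return "error: {}".format(errno.errorcode.get(exc.errno, exc.errno))
    os.close(fd)
    return "ok"


if __name__ == "__main__":
    print(probe_device("/dev/apex_0"))
```

Running this both on the host and inside the container may help narrow down whether the failure comes from the container mapping or from the driver itself.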

Issue Type: Bug
Operating System: Linux
Coral Device: M.2 Accelerator with dual Edge TPU
Other Devices: No response
Programming Language: Python 3.9
Relevant Log Output: No response

Stefan2483 commented 6 months ago

I have the same issue after upgrading to 6.6.28+rpt-rpi-v8.

lspci -nn | grep 089a

0000:01:00.0 System peripheral [0880]: Global Unichip Corp. Coral Edge TPU [1ac1:089a]

ls /dev/apex*

/dev/apex_0

The device is visible in Docker after mounting it, but Frigate is not able to access it. Python 3.9.

TheNoobInventor commented 2 months ago

I had the same issue, but after hours of reading issues and forum posts I think I have found a fix for it.

I'm using this dual NVMe shield by Geekworm. I'm currently using only one slot, for the Coral PCIe TPU, and booting from an SD card. I will test booting from an SSD and update my post with any findings.

UPDATE: It still works when I boot from an SSD on one of the NVMe slots with the Coral TPU in the other one.

My kernel version is: Linux pi5 6.6.31+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.31-1+rpt1 (2024-05-29) aarch64 GNU/Linux

I followed the suggestion from jdb in his forum post and added this line to the /boot/firmware/config.txt file:

dtoverlay=pciex1-compat-pi5,no-mip
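
To confirm that the overlay line is actually active in config.txt, a small check like the one below may help. It is only a sketch: `find_overlay` is a hypothetical helper, it treats only lines beginning with `#` as comments, and it ignores conditional section filters such as `[pi5]` or `[all]`, so a line under a non-matching section would still be reported.

```python
def find_overlay(config_text, name):
    """Return the parameter string of each uncommented dtoverlay=<name> line."""
    matches = []
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip comments and anything that is not a dtoverlay entry.
        if line.startswith("#") or not line.startswith("dtoverlay="):
            continue
        value = line[len("dtoverlay="):]
        overlay, _, params = value.partition(",")
        if overlay.strip() == name:
            matches.append(params.strip())
    return matches


if __name__ == "__main__":
    # Path taken from the comment above; adjust if your config lives elsewhere.
    try:
        with open("/boot/firmware/config.txt") as handle:
            print(find_overlay(handle.read(), "pciex1-compat-pi5"))
    except OSError as exc:
        print("could not read config.txt:", exc)
```

An empty list means no active line was found; a list containing `no-mip` matches the line suggested here.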

After rebooting the Pi5, I was able to run the example PyCoral container to test the TPU using this Pineboards blog post. Then I tested Frigate and it worked well.

I hope this helps someone out.