google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

Unable to run edge tpu compiled models in docker container #170

Closed ItsMeTheBee closed 1 year ago

ItsMeTheBee commented 4 years ago

Hey there!

I'm using the Coral Mini PCIe Accelerator with an aarch64 Debian Buster based system. To make things complicated, I'm running everything in an Ubuntu 20.04 LTS Docker container with Python version 3.8.2.

I managed to install the Coral drivers, so ls /dev/apex_0 works in the Docker container, and I've built and installed a tflite_runtime wheel for my system according to the tensorflow repo like this

These examples work fine, but when I try to run the classification example from this repo with the mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite model I get the same error:

RuntimeError: Internal: Unsupported data type in custom op handler: 0Node number 1 (EdgeTpuDelegateForCustomOp) failed to prepare.

Running the mobilenet_v2_1.0_224_inat_bird_quant.tflite model works fine.
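
For context, an `_edgetpu.tflite` model contains a custom op that only the Edge TPU delegate can execute, so the interpreter has to be created with that delegate, while the plain quantized model runs on the CPU without it. Below is a minimal sketch of how this is commonly wired up with `tflite_runtime`, assuming the package and the `libedgetpu` runtime are installed. The helper names `edgetpu_lib_name` and `make_interpreter` are illustrative and not taken from this thread.

```python
import platform

# Shared-library names of the Edge TPU runtime per operating system.
_EDGETPU_LIBS = {
    "Linux": "libedgetpu.so.1",
    "Darwin": "libedgetpu.1.dylib",
    "Windows": "edgetpu.dll",
}


def edgetpu_lib_name(system=None):
    """Return the Edge TPU runtime library name for the given OS."""
    return _EDGETPU_LIBS[system or platform.system()]


def make_interpreter(model_path):
    """Create a TFLite interpreter that offloads work to the Edge TPU."""
    # Imported lazily so this file loads even where tflite_runtime is absent.
    import tflite_runtime.interpreter as tflite

    delegate = tflite.load_delegate(edgetpu_lib_name())
    return tflite.Interpreter(model_path=model_path,
                              experimental_delegates=[delegate])
```

Calling `make_interpreter(...)` followed by `allocate_tensors()` is the point at which a runtime/delegate mismatch would surface as a "failed to prepare" error.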

I tried using the latest tflite_runtime version as well as the 2.1.0 version. The output of dpkg -l | grep edgetpu is ii libedgetpu1-std:arm64 14.1 arm64 Support library for Edge TPU

I did create and install the wheel file for the Edge TPU Python API, so I'm not sure why it doesn't show up.

Originally posted by @ItsMeTheBee in https://github.com/google-coral/edgetpu/issues/44#issuecomment-656626248

Additional information: output of uname -a (in docker) : Linux 0de8b7b01cc7 4.19.59 #1 SMP PREEMPT Mon Jun 8 16:19:01 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

output of cat /etc/os-release (in docker):

NAME="Ubuntu"
VERSION="20.04 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

output of dpkg -l | grep edgetpu (in docker): ii libedgetpu1-std:arm64 14.1 arm64 Support library for Edge TPU

output of python3 -c 'print(__import__("tflite_runtime").__version__)' (in docker): 2.1.0

I also tried using 2.1.1 but the behavior remained the same.
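
The same kind of diagnostics can also be gathered from inside Python, which is handy when the shell and the interpreter might disagree about the environment. This is only a sketch: the function name `environment_report` is made up for illustration, and the `/dev/apex_*` pattern assumes the PCIe driver exposes its device nodes under that name, as `/dev/apex_0` above suggests.

```python
import glob
import platform
import sys


def environment_report(dev_pattern="/dev/apex_*"):
    """Summarize the facts that matter when matching wheels and drivers."""
    return {
        "machine": platform.machine(),                   # e.g. aarch64
        "python": "%d.%d" % sys.version_info[:2],        # e.g. 3.8
        "apex_devices": sorted(glob.glob(dev_pattern)),  # PCIe Edge TPU nodes
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(key, "=", value)
```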

The Docker container has been built based on the ros:foxy image; OpenCV has been manually installed, as well as a few other packages. I also tried installing the edgetpu api with this wheel file here but it didn't make any difference.

I can create a Dockerfile without any unnecessary stuff if that helps.

Namburger commented 4 years ago

@ItsMeTheBee Thanks for opening. Your issue is a little more specific and there isn't a straightforward answer. Is Ubuntu 20.04/Python 3.8 an absolute requirement?

The problem, it seems, is that you've built the tflite_runtime package yourself, since we don't have a released package for python3.8 yet. The tflite_runtime package has to be this version in order to work with libedgetpu:

 » python3 -c 'print(__import__("tflite_runtime").__git_version__)'     
0.6.0-76902-gd855adfc5a
 » python3 -c 'print(__import__("tflite_runtime").__version__)'              
2.1.0.post1
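
The two one-liners above can be folded into a small check that also copes with the package being absent. A minimal sketch, using only the standard library; the function names are illustrative, and the expected strings are simply the two values quoted above.

```python
import importlib


def runtime_versions(module_name="tflite_runtime"):
    """Return (version, git_version) of a module, or None if it is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return (getattr(module, "__version__", None),
            getattr(module, "__git_version__", None))


def matches_expected(found,
                     expected=("2.1.0.post1", "0.6.0-76902-gd855adfc5a")):
    """True only when both strings equal the versions quoted above."""
    return found == expected


if __name__ == "__main__":
    found = runtime_versions()
    print("installed:", found, "matches:", matches_expected(found))
```

A self-built wheel that reports 2.1.0 without the `.post1` suffix would fail this check, which is the situation described in this thread.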

I also tried installing the edgetpu api with this wheel file here but it didn't make any difference.

The edgetpu API is different from the tflite API; for that, you should use this example, and that should work! The difference is that the edgetpu API is our own wrapper code over TensorFlow's source code, which makes it more user friendly.

A quick fix suggestion is to use an older Python until we start supporting python3.8. Would that interfere with any other component of your system?

Namburger commented 4 years ago

Here is a Dockerfile for ubuntu18, with opencv and all necessary edgetpu libraries. I tried 20 but it has python3.8 by default, which is a little unfortunate:

FROM ubuntu:18.04
ENV TZ=US/Central
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

WORKDIR /home
ENV HOME /home
VOLUME /data
EXPOSE 8888
RUN cd ~
RUN apt-get update
RUN apt-get install -y git pkg-config wget usbutils curl apt-transport-https ca-certificates
RUN apt-get install -y python3-pip
RUN pip3 install --upgrade pip

RUN echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" \
| tee /etc/apt/sources.list.d/coral-edgetpu.list
RUN curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
RUN apt-get update
RUN apt-get install -y libedgetpu1-std python3-edgetpu python3-opencv

RUN wget https://dl.google.com/coral/python/tflite_runtime-2.1.0-cp36-cp36m-linux_x86_64.whl
RUN pip3 install tflite_runtime-2.1.0-cp36-cp36m-linux_x86_64.whl

RUN git clone https://github.com/google-coral/tflite.git

Build:

docker build -t "coral-ubuntu18" .

Run:

docker run -it --privileged -v /dev/bus/usb:/dev/bus/usb coral-ubuntu18 /bin/bash
cd tflite/python/examples/classification
./install_requirements.sh
python3 classify_image.py \
  --model models/mobilenet_v2_1.0_224_inat_bird_quant.tflite \
  --labels models/inat_bird_labels.txt \
  --input images/parrot.jpg

ItsMeTheBee commented 4 years ago

Thanks a lot for your fast answer! Unfortunately yes, since I need to use ROS Foxy (I could build it with a different Python version, but that would probably lead to issues when I try to include other ROS packages).

If I check out the right tensorflow version and build tflite_runtime version 2.1.0.post1, it should work, right?

Is there any way to compile python3-edgetpu for Python 3.8? If not, I'll just use cpp because that seems to work fine =)

By the way, is there any speed difference between the edgetpu API and the tflite API?

Namburger commented 4 years ago

Unfortunately yes, since I need to use ROS Foxy (I could build it with a different Python version, but that would probably lead to issues when I try to include other ROS packages).

I see, sorry about that, ROS is one thing in this equation that I have no experience with :/

If I check out the right tensorflow version and build tflite_runtime version 2.1.0.post1, it should work, right?

Yes, it should in theory, but I've not tried it for python3.8 yet. I'd try from this commit also. If you do give it a shot, please do update me on the outcome :) I'm hoping that we'll support python3.8 soon.

Is there any way to compile python3-edgetpu for Python 3.8?

This wheel package should already work with python3.8 :) https://dl.google.com/coral/edgetpu_api/edgetpu-2.14.0-py3-none-any.whl

By the way, is there any speed difference between the edgetpu API and the tflite API?

I think the only difference is that we attempt to do zero copy when possible. For instance, here is how the tflite API passes the input tensor from Python code to the C++ backend, while the edgetpu API just passes a pointer. This has very minimal impact on performance for a single inference, especially with smaller inputs, but for larger images the copy becomes quite a bottleneck. I must mention, though, that the edgetpu API currently has tons of bloat, passing from python -> swig wrapper -> classification wrapper -> basic engine -> basic engine native, while the tflite API only goes from python -> pybind11 -> tflite_wrapper -> interpreter, so there are some extra steps that we are taking with the edgetpu API. The tflite API is also much more generic and is more in harmony with tflite users who've never heard of the edgetpu, so it is my preferred API. I don't think I've actually benchmarked the difference, but those are the facts. FYI, we're going through a big refactoring stage with the edgetpu API to make it more efficient :)
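
The copy-versus-pointer distinction can be illustrated in plain Python, independent of either API. The sketch below is not code from the tflite or edgetpu bindings; the function names are made up, and `bytes()` and `memoryview()` merely stand in for "copy the tensor" and "pass a pointer".

```python
def pass_by_copy(frame):
    """Hand over a duplicate: the callee owns separate memory."""
    return bytes(frame)        # copies every byte of the input


def pass_by_reference(frame):
    """Hand over a view: the callee reads the caller's memory directly."""
    return memoryview(frame)   # no bytes are copied


frame = bytearray(224 * 224 * 3)   # one RGB input for a 224x224 model
copied = pass_by_copy(frame)
shared = pass_by_reference(frame)

frame[0] = 255                     # the caller changes the buffer afterwards
# The copy still holds the old value, while the view sees the new one.
print(copied[0], shared[0])
```

The cost of the copy grows with the input size, which is why it matters little for a 224x224 image and more for large frames.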

Hope this helps

ItsMeTheBee commented 4 years ago

Thanks a lot for all this information!

I checked out the commit you linked and added this little patch. Then I modified the Dockerfile in Tensorflow/lite/tools/pip_package like this:

ARG IMAGE
FROM ${IMAGE}

COPY update_sources.sh /
RUN /update_sources.sh

RUN apt-get update && \
    apt-get install -y software-properties-common && \
    apt-add-repository universe && \
    apt-get update

RUN dpkg --add-architecture armhf
RUN dpkg --add-architecture arm64
RUN apt-get update && \
    apt-get install -y \
      python2 \
      python-setuptools \
      python-numpy \
      libpython-all-dev \
      libpython-all-dev:armhf \
      libpython-all-dev:arm64

RUN apt-get update && \
    apt-get install -y \
      debhelper \
      dh-python \
      python-all \
      python3-all \
      python3-setuptools \
      python3-wheel \
      python3-numpy \
      libpython3-dev \
      libpython3-dev:armhf \
      libpython3-dev:arm64 \
      crossbuild-essential-armhf \
      crossbuild-essential-arm64 \
      zlib1g-dev \
      zlib1g-dev:armhf \
      zlib1g-dev:arm64 \
      swig \
      curl \
      unzip \
      git && \
    apt-get clean

RUN curl https://bootstrap.pypa.io/get-pip.py --output get-pip.py && python2 get-pip.py

To be able to create the wheel file with this command: make BASE_IMAGE=ros:foxy PYTHON=python3 TENSORFLOW_TARGET=aarch64 docker-build

I installed the wheel file and the tflite_runtime version still seems to be 2.1.0, but now the example files from the getting started page result in a different error: ValueError: Model provided has model identifier 'CTYP', should be 'TFL3'

On the topic of the edgetpu API: I'm running into this error: ModuleNotFoundError: No module named '_edgetpu_cpp_wrapper'

I did find this issue and tried sudo ln -s _edgetpu_cpp_wrapper.cpython-35m-aarch64-linux-gnu.so _edgetpu_cpp_wrapper.cpython-38m-aarch64-linux-gnu.so but it didn't fix my issue.

I did try a few other things, but I thought this wheel file works with all Python versions, so I'm not sure where to go from here.
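
A note on what the wheel name does and does not promise: the last three dash-separated fields of a wheel filename are its Python, ABI and platform tags, and installers use them only to decide whether the wheel may be installed. A `py3-none-any` tag therefore does not by itself guarantee that native modules bundled inside exist for every interpreter version. The sketch below just splits the tags out of a filename; `wheel_tags` is a hypothetical helper, not part of pip.

```python
def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("too few fields in wheel filename: %r" % filename)
    # The last three fields are always the compatibility tags.
    return tuple(parts[-3:])


if __name__ == "__main__":
    for name in ("tflite_runtime-2.1.0-cp36-cp36m-linux_x86_64.whl",
                 "edgetpu-2.14.0-py3-none-any.whl"):
        print(name, "->", wheel_tags(name))
```

The first filename is tied to CPython 3.6 on x86_64, while the second installs anywhere, including on interpreters for which it may carry no matching compiled wrapper.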

Namburger commented 4 years ago

@ItsMeTheBee Hi! Well getting a different error is definitely a good sign :)

ValueError: Model provided has model identifier 'CTYP', should be 'TFL3'

Could you check the model that you loaded during inputs? Usually this only happens if you loaded a file that isn't a tflite model. For instance, when I load a log file instead of tflite file:

python3 run_inference.py qat_quantized_edgetpu.log
.........
ValueError: Model provided has model identifier ' TPU', should be 'TFL3'
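
The identifier in that message is the 4-byte FlatBuffers file identifier stored at byte offset 4 of the file, so it can be inspected without loading the model at all. A minimal sketch with illustrative function names:

```python
def flatbuffer_identifier(path):
    """Return the 4-byte FlatBuffers file identifier stored at offset 4."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    return header[4:8]


def looks_like_tflite(path):
    """True when the file carries the 'TFL3' identifier TFLite expects."""
    return flatbuffer_identifier(path) == b"TFL3"
```

Note that a file beginning with `<!DOCTYPE` has `CTYP` at bytes 4 to 7, and one beginning with `Edge TPU` has ` TPU` there. An identifier of 'CTYP' would therefore be consistent with an HTML page having been saved under the model's filename, though this check alone cannot confirm how the file got there.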

ModuleNotFoundError: No module named '_edgetpu_cpp_wrapper'

Ahh, apologies for the misinformation. After further inspection, our swig wrappers were only built for python3.5-3.7, so that won't work with 3.8 :/

ItsMeTheBee commented 4 years ago

Could you check the model that you loaded during inputs? Usually this only happens if you loaded a file that isn't a tflite model.

I'm using the getting started tutorial for testing purposes, so I'm in the directory ~/google-coral/tflite/python/examples/classification and executing

python3 classify_image.py \
  --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
  --labels models/inat_bird_labels.txt \
  --input images/parrot.jpg

I tried it again using tab completion to make sure the path is right, but the result stays the same.

Ahh, apologies for the misinformation. After further inspection, our swig wrappers were only built for python3.5-3.7, so that won't work with 3.8 :/

No problem! Any chance I could use the build_non_swig script for python 3.8?

ItsMeTheBee commented 4 years ago

@Namburger

I played around with your files a bit and I'm either missing something completely obvious or the whole thing is a lot more complicated than I thought :D

I changed the files (code below since I can't attach them directly) to be able to run build_swig.sh in an Ubuntu 20.04 Docker container with Python 3.8.

The build was successful and generated _edgetpu_cpp_wrapper.cpython-38m-aarch64-linux-gnu.so, _edgetpu_cpp_wrapper.cpython-38m-aarch64-linux-gnu.so and _edgetpu_cpp_wrapper.cpython-38m-aarch64-linux-gnu.so. But after installing the wheel file I still get the error No module named '_edgetpu_cpp_wrapper'

I checked, and /usr/local/lib/python3.8/dist-packages/edgetpu/swig/ contains all three _edgetpu_cpp_wrapper.cpython-38m-[arch].so files.

Did I miss anything here?
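
One thing worth checking, offered as a possibility rather than a confirmed diagnosis: Python 3.8 dropped the `m` ABI flag from extension-module file names, so the interpreter only considers names such as `_edgetpu_cpp_wrapper.cpython-38-aarch64-linux-gnu.so`, and a file tagged `cpython-38m` is never picked up by the import system. The sketch below lists the names the running interpreter would accept; the helper names are illustrative.

```python
import importlib.machinery


def importable_extension_names(module_name):
    """File names this interpreter would accept for a native module."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


def is_importable_name(filename, module_name):
    """True if `filename` is one of the accepted native-module names."""
    return filename in importable_extension_names(module_name)


if __name__ == "__main__":
    built = "_edgetpu_cpp_wrapper.cpython-38m-aarch64-linux-gnu.so"
    print(importable_extension_names("_edgetpu_cpp_wrapper"))
    print(built, "accepted:", is_importable_name(built, "_edgetpu_cpp_wrapper"))
```

If the built file is not in the printed list, renaming it (or adjusting the build so it emits the suffix the interpreter expects) would be the next thing to try.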

Dockerfile:

ARG IMAGE
FROM ${IMAGE}

COPY update_sources.sh /
RUN /update_sources.sh

RUN dpkg --add-architecture armhf
RUN dpkg --add-architecture arm64

ENV TZ=Europe/Berlin
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

RUN apt-get update && apt-get install -y \
  sudo \
  debhelper \
  python \
  #python-future \
  python3-all \
  python3-numpy \
  python3-setuptools \
  python3-six \
  python3-wheel \
  python3-dev \
  libpython3-dev \
  libpython3-dev:armhf \
  libpython3-dev:arm64 \
  build-essential \
  crossbuild-essential-armhf \
  crossbuild-essential-arm64 \
  libusb-1.0-0-dev \
  libusb-1.0-0-dev:arm64 \
  libusb-1.0-0-dev:armhf \
  zlib1g-dev \
  zlib1g-dev:armhf \
  zlib1g-dev:arm64 \
  pkg-config \
  zip \
  unzip \
  curl \
  wget \
  git \
  vim

RUN git clone https://github.com/raspberrypi/tools.git && \
    cd tools && \
    git reset --hard 4a335520900ce55e251ac4f420f52bf0b2ab6b1f

ARG BAZEL_VERSION=2.1.0
RUN wget -O /bazel https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh && \
    bash /bazel && \
    rm -f /bazel

third_party/python/linux/BUILD

config_setting(
    name = "py35",
    define_values = {"PY3_VER": "35"}
)

config_setting(
    name = "py36",
    define_values = {"PY3_VER": "36"}
)

config_setting(
    name = "py37",
    define_values = {"PY3_VER": "37"}
)

config_setting(
    name = "py38",
    define_values = {"PY3_VER": "38"}
)

cc_library(
  name = "python3-headers",
  hdrs = select({
    "py35": glob(["python3.5m/*.h",
                  "python3.5m/numpy/*.h",
                  "aarch64-linux-gnu/python3.5m/*.h",
                  "arm-linux-gnueabihf/python3.5m/*.h"]),
    "py36": glob(["python3.6m/*.h",
                  "python3.6m/numpy/*.h",
                  "aarch64-linux-gnu/python3.6m/*.h",
                  "arm-linux-gnueabihf/python3.6m/*.h"]),
    "py37": glob(["python3.7m/*.h",
                  "python3.7m/numpy/*.h",
                  "aarch64-linux-gnu/python3.7m/*.h",
                  "arm-linux-gnueabihf/python3.7m/*.h"]),
    "py38": glob(["python3.8/*.h",
                  "python3.8/numpy/*.h",
                  "python3.8/cpython/*.h",
                  "aarch64-linux-gnu/python3.8/*.h",
                  "arm-linux-gnueabihf/python3.8/*.h"]),
  }, no_match_error = "PY3_VER is not specified"),
  includes = select({
    "py35": [".", "python3.5m"],
    "py36": [".", "python3.6m"],
    "py37": [".", "python3.7m"],
    "py38": [".", "python3.8", "python3.8/cpython",],
  }, no_match_error = "PY3_VER is not specified"),
  visibility = ["//visibility:public"],
)

build_swig.sh

#!/bin/bash
#
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
set -x

readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly MAKEFILE="${SCRIPT_DIR}/../Makefile"
PYTHON_VERSIONS="38"

while [[ $# -gt 0 ]]; do
  case "$1" in
    --clean)
      make -f "${MAKEFILE}" clean
      shift
      ;;
    --python_versions)
      PYTHON_VERSIONS=$2
      shift
      shift
      ;;
    *)
      shift
      ;;
  esac
done

function docker_image {
  case $1 in
    35) echo "ubuntu:16.04" ;;
    36) echo "ubuntu:18.04" ;;
    37) echo "debian:buster" ;;
    38) echo "ubuntu:20.04" ;;
    *) echo "Unsupported python version: $1" 1>&2; exit 1 ;;
  esac
}

for python_version in ${PYTHON_VERSIONS}; do
  make DOCKER_IMAGE=$(docker_image "${python_version}") DOCKER_TARGETS=swig -f "${MAKEFILE}" docker-build
done

Namburger commented 4 years ago

@ItsMeTheBee can you try using the experimental edgetpu API wheel package we built from here? https://github.com/google-coral/edgetpu/issues/132#issuecomment-671509187 Also, can you share the output of md5sum models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite, or of cat-ing the file? I wonder if the download link may have expired and you ended up with a fake file.
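
Where `md5sum` is not available in the container, the same checksum can be computed with the Python standard library. A minimal sketch; `md5_of_file` is an illustrative name and the chunk size is an arbitrary choice.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys
    for model_path in sys.argv[1:]:
        print(md5_of_file(model_path), model_path)
```

Comparing the digest with one computed on a known-good copy of the model shows quickly whether the download was replaced by something else.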

Namburger commented 4 years ago

@ItsMeTheBee Here is a complete Dockerfile that shows the edgetpu wheel working :) https://gist.github.com/Namburger/fc1e697f9415a1cb98cd95c74facd4de

(the tflite_runtime is still in the works)

hjonnala commented 1 year ago

Feel free to check the example on how to run edge tpu models in docker container at: https://github.com/blakeblackshear/frigate/discussions/2599#discussioncomment-3621686
