google-coral / libedgetpu

Source code for the userspace-level runtime driver for Coral.ai devices.
Apache License 2.0
181 stars 62 forks

Update libedgetpu repo for compatibility of recent versions of Tensorflow. #60

Closed feranick closed 6 months ago

feranick commented 7 months ago

This PR:

This is in relation to issues: https://github.com/tensorflow/tensorflow/issues/62371 and https://github.com/google-coral/libedgetpu/issues/53

feranick commented 6 months ago

@namburger I tested it with a local fork of TF 2.17.0, with neon_fully_connected_arm32.cc reverted to be in line with the other architectures, and compilation still fails with the same error:

external/org_tensorflow/tensorflow/lite/kernels/internal/optimized/4bit/neon_fully_connected_arm32.cc: In function 'void tflite::optimized_4bit::NeonRunKernelNoSDot(const uint8_t*, const int8_t*, int32_t*, int, int, int, int, int, int) [with int RowsLeft = 4; int RowsRight = 1; int Cols = 32]':
external/org_tensorflow/tensorflow/lite/kernels/internal/optimized/4bit/neon_fully_connected_arm32.cc:192:7: error: 'asm' operand has impossible constraints
  192 |       asm volatile(KERNEL_4x1
      |       ^~~
INFO: Elapsed time: 34.950s, Critical Path: 17.60s

Note that this file (and the whole 4bit folder) was introduced with TF 2.14.

So maybe the commit was intended to address this problem, but didn't...

feranick commented 6 months ago

@Namburger Disclaimer: I know close to nothing about asm. Still, I tried to address this error, which seems to stem from a limitation of the architecture. Following some comments from here, I replaced "r" with "g", so that in external/org_tensorflow/tensorflow/lite/kernels/internal/optimized/4bit/neon_fully_connected_arm32.cc line 192 what is currently:

:
                   : [lhs_val] "r"(lhs_val), [rhs_val] "r"(rhs_val),
                     [element_ptr] "r"(element_ptr), [bit_shift] "r"(bit_shift),
                     [run_depth] "r"(run_depth)

is changed to:

:
                   : [lhs_val] "g"(lhs_val), [rhs_val] "g"(rhs_val),
                     [element_ptr] "g"(element_ptr), [bit_shift] "g"(bit_shift),
                     [run_depth] "g"(run_depth)

Compilation proceeds beyond this point, but it then stops with a similar error in another external library:

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
In file included from external/ruy/ruy/pack_arm.cc:16:
external/ruy/ruy/pack_arm.h:492:9: warning: multi-line comment [-Wcomment]
  492 | #endif  // (RUY_PLATFORM_NEON_64 || RUY_PLATFORM_NEON_32) && \
      |         ^
external/ruy/ruy/pack_arm.cc: In function 'void ruy::Pack8bitColMajorForNeon4Cols(const ruy::PackParams8bit&)':
external/ruy/ruy/pack_arm.cc:264:3: error: 'asm' operand has impossible constraints
  264 |   asm volatile(
      |   ^~~
INFO: Elapsed time: 32.683s, Critical Path: 20.21s

Not encouraging.

Namburger commented 6 months ago

hi @feranick with all of the changes, would you be able to send a set of commands to reproduce the very first issue? I believe it looks something like this (but it may require you to push some commits to your fork of tensorflow and then sync your fork of libedgetpu to it)?

git clone git@github.com:google-coral/libcoral.git && cd libcoral
git fetch origin pull/36/head:pull-36
git checkout pull-36
git submodule init && git submodule update
make DOCKER_IMAGE=debian:buster DOCKER_CPUS="armv7a" DOCKER_TARGETS=tests docker-build

I'm talking to some folks, trying to get an easy process to repro the original issues and continue on. It appears that element_ptr needs to be "+r", and there could be more issues, as you pointed out, so we'll need to work this out one step at a time, unfortunately :(

Namburger commented 6 months ago

A procedure for making code changes to tensorflow and testing them against this build would also be nice to have; not sure whether this can be done easily with bazel, though.

mihaimaruseac commented 6 months ago

One idea would be to fork TensorFlow, make changes in your own fork (and sync from time to time) and then change https://github.com/google-coral/libedgetpu/blob/b5820ad9437548febcf58f593a665541e7c5e561/workspace.bzl#L68-L76 to point to the fork and the last commit there.

Then, once it all works, you can make a PR from the fork back to TF and once that lands you can change the workspace.bzl file to point back to upstream TF
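A hedged sketch of what that workspace.bzl change might look like (the commit hash, sha256, and username below are placeholders, and the real file may use a different macro or helper around `http_archive`):

```python
# Hypothetical fragment for libedgetpu's workspace.bzl: point the
# org_tensorflow external repository at a personal fork instead of
# upstream. All concrete values are placeholders.
TENSORFLOW_COMMIT = "<last-commit-on-your-fork>"
TENSORFLOW_SHA256 = "<sha256-of-the-downloaded-archive>"

http_archive(
    name = "org_tensorflow",
    urls = [
        "https://github.com/<your-user>/tensorflow/archive/%s.tar.gz" % TENSORFLOW_COMMIT,
    ],
    sha256 = TENSORFLOW_SHA256,
    strip_prefix = "tensorflow-%s" % TENSORFLOW_COMMIT,
)
```

Once the fix lands upstream, the same fields are simply pointed back at tensorflow/tensorflow.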

feranick commented 6 months ago

@mihaimaruseac Thanks. This is effectively what I did for testing. An advantage of this approach is that libedgetpu could still be built against a stable version of TF (2.16.0, forked and with the visibility patch). I'd note that libedgetpu can be used independently from libcoral/pycoral (in fact, that is how I use the edgeTPU). This would allow testing and building of libcoral/pycoral to continue, while libedgetpu stays "stable".

feranick commented 6 months ago

> hi @feranick with all of the changes, would you be able to send a set of commands to reproduce the very first issue? I believe it looks something like this (but it may require you to push some commits to your fork of tensorflow and then sync your fork of libedgetpu to it)?

Hi @Namburger, here's a slightly revised version of your commands (I can't clone via git@github.com:google-coral/libcoral.git, only via https). Also, you should build using debian:bookworm, as it has support for python3.11:

git clone https://github.com/google-coral/libcoral.git && cd libcoral
git fetch origin pull/36/head:pull-36
git checkout pull-36
git submodule init && git submodule update
make DOCKER_IMAGE=debian:bookworm DOCKER_CPUS="armv7a" DOCKER_TARGETS=tests docker-build

These commands reliably trigger the issue (I just tested them). As for libedgetpu, it currently builds against TF 2.16.0-rc0, but the issue here happens regardless of the TF version libedgetpu is built against. I can change it to TF 2.17.0+visibility if needed (but see my comment above).

Namburger commented 6 months ago

By way of an update, the author of tensorflow/tensorflow@8419c70 has promised to take a look sometime this week!

feranick commented 6 months ago

> By way of an update, the author of tensorflow/tensorflow@8419c70 has promised to take a look sometime this week!

Hi @Namburger, just wondering whether there is an update on this... Thanks!

rushowr commented 1 month ago

Joining the fray here, wondering what the future of the Coral TPU is going to look like. Many of us bought these devices hoping for affordable facial and object recognition with software like Frigate...

feranick commented 1 month ago

To be honest with you, development is pretty much dead. Google is happy to sell you the boards (ASUS will do so for bulk sales), and yet the platform is dead. Shameful, if you ask me. This is why I embarked on the effort of keeping it alive, with some success in merging changes into the main repo. I also maintain a fork with the latest updates. I am pretty sure that, apart from some Googler with time to spare, Google has basically disbanded the dev team. So I would not invest too much in the platform, unless you are willing to put the work in.

rushowr commented 1 month ago

Luckily I only purchased a single PCIe Coral. Unfortunately for me, I'm on a fixed income, so the loss hurts just a smidge more than it would have if I were still earning a six-figure income.