ctuning / ck-tensorflow

Collective Knowledge components for TensorFlow (code, data sets, models, packages, workflows):
http://cKnowledge.org
BSD 3-Clause "New" or "Revised" License

Tensorflow Lite and/or cross compilation #59

Closed mohamedadaly closed 5 years ago

mohamedadaly commented 6 years ago

Hi

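I am new to CK, and have been using Tensorflow Lite on an embedded ARM-based device. I have two questions:

1. Does this package also build Tensorflow Lite, or just the whole of Tensorflow?
2. Does it also support cross-compilation? My device is quite resource limited and slow, and I would prefer to avoid a native build on it, as I do now.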

Thanks a lot.

psyhtest commented 6 years ago

Hi @mohamedadaly

Thanks for your interest in CK! To answer your questions:

Does this package also build Tensorflow Lite, or just the whole of Tensorflow?

CK-TensorFlow supports building TensorFlow from source (CPU, CPU+XLA, CUDA, CUDA+XLA; v1.4.0 .. v1.8.0), as well as TensorFlow_CC. As we are looking to run (the TensorFlow part of) our ReQuEST submission on Android devices, we may need to start supporting TensorFlow Lite as well. Please stay tuned!

/cc @Chunosov @gfursin

Does it also support cross-compilation? My device is quite resource limited and slow, and I would prefer to avoid a native build on it, as I do now.

I totally agree with you. Here are some recent stats for building TensorFlow 1.8.0 on several aarch64 platforms:

Unfortunately, cross-compiling TensorFlow is not an easy feat:

it can be challenging to run directly on devices that have limited resources, such as the Raspberry Pi. It’s also not easy to set up cross-compilation if you’re compiling on a different machine than you’re deploying to

(according to @petewarden in "Building Mobile Applications with TensorFlow")

If we could get some reasonable instructions for cross-compiling TensorFlow, we could incorporate them into CK-TensorFlow. Do you perhaps know of any such instructions to start with?

psyhtest commented 6 years ago

In fact, some instructions are referenced from https://github.com/ctuning/ck-tensorflow/issues/44.

mohamedadaly commented 6 years ago

We had a very hard time cross-compiling Tensorflow, but TF Lite is a lot easier. For example, to cross-compile for the Pi, you could do:

# Set CC and CXX to your toolchain binaries

# compile TF Lite
make -f tensorflow/contrib/lite/Makefile CXX="${CXX}" CC="${CC}" LIBS="-lstdc++ -lpthread -lm -ldl" all -j4 TARGET=RPI TARGET_ARCH=armv7
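
As a minimal sketch of the "set CC and CXX" step above: the snippet assumes a GCC cross toolchain on the PATH with the prefix arm-linux-gnueabihf- (an assumption, not something stated in this thread). TOOLCHAIN_PREFIX and build_tflite_rpi are hypothetical names, and the make command is the one quoted above, wrapped in a function so that it only runs when called.

```shell
# Assumption: a GCC cross toolchain for 32-bit hard-float ARM is on the PATH.
# TOOLCHAIN_PREFIX is a hypothetical variable; override it to match your toolchain.
TOOLCHAIN_PREFIX="${TOOLCHAIN_PREFIX:-arm-linux-gnueabihf-}"

# Point CC and CXX at the cross-compiler binaries.
export CC="${TOOLCHAIN_PREFIX}gcc"
export CXX="${TOOLCHAIN_PREFIX}g++"

# Hypothetical helper: call it from the root of a TensorFlow source checkout.
build_tflite_rpi() {
  make -f tensorflow/contrib/lite/Makefile CXX="${CXX}" CC="${CC}" \
    LIBS="-lstdc++ -lpthread -lm -ldl" all -j4 TARGET=RPI TARGET_ARCH=armv7
}
```

After changing into the TensorFlow source directory, calling build_tflite_rpi would start the build with the selected cross-compiler.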

psyhtest commented 6 years ago

Thanks, @mohamedadaly! It's probably going to be trickier for Android, but we'll look into it.

mohamedadaly commented 6 years ago

Thanks!

In case it's helpful, they seem to have instructions for building an Android demo app here.

gfursin commented 6 years ago

Thanks a lot, @mohamedadaly, for the details. If it's indeed possible to build TFLite with just one line while explicitly specifying all compilers and libs, it should be possible to build it via CK. We will have a look at it as soon as we complete a few urgent things. We will keep in touch!

Chunosov commented 6 years ago

We now have a TFLite package supporting cross-compilation for Android: https://github.com/ctuning/ck-tensorflow/tree/master/package/lib-tflite-1.7.0-src-static

gfursin commented 6 years ago

@mohamedadaly - you can try the following now:

$ ck pull all
$ ck install package:lib-tflite-1.7.0-src-static --target_os=android21-arm64
$ ck compile program:tflite-classification --target_os=android21-arm64
$ ck run program:tflite-classification --target_os=android21-arm64

I just tested it and it works. I also added the same scenarios to our Android app:

Public benchmarking results are then collected here:

Hope this helps!

mohamedadaly commented 6 years ago

@gfursin Awesome. Thanks!

I was wondering how easy/difficult it would be to have similar packages for Raspberry Pi 2 (arm32 with soft/hard float) and Pi 3 (aarch64 with soft/hard float)? Where do I need to make changes etc. to support that?

psyhtest commented 6 years ago

@mohamedadaly You are welcome! As for arm32 and aarch64 platforms, do you mean building natively or cross-compiling? The former should already work. Would you mind trying the same on your platforms without the --target_os=android21-arm64 flag?

$ ck pull all
$ ck install package:lib-tflite-1.7.0-src-static
$ ck compile program:tflite-classification
$ ck run program:tflite-classification

Please report any problems!

psyhtest commented 6 years ago

Hmm, it appears that TFLite is currently at version v0.1.7:

TensorFlow Lite 0.1.7 is based on tag tflite-v0.1.7 (git commit fa1db5eb0da85b5baccc2a46d534fdeb3bb473d0).
To reproduce the iOS library, it's required to cherry pick git commit f1f1d5172fe5bfeaeb2cf657ffc43ba744187bee to fix a dependency issue.
The code is based on TensorFlow 1.8.0 release candidate and it's very close to TensorFlow 1.8.0 release.

So should the TFLite package version be 0.1.7 instead of 1.7.0 or is it so called because it's close to TF 1.7.0?

mohamedadaly commented 6 years ago

@psyhtest Yes, I mean cross compiling. I will try building natively on my RPi 2 and let you know how it goes. Please let me know if it's possible to cross-compile for the Pi 2/3.

mohamedadaly commented 6 years ago

@psyhtest I am actually not sure how the version numbers work for TF Lite :) The tflite repo seems to have been updated since May 7, the date of the last update to the RELEASE.md file.

mohamedadaly commented 6 years ago

It builds fine on the Raspberry Pi 2, and gives the following timing information:

  (reading fine grain timers from tmp-ck-timer.json ...)

{
  "execution_time": 0.003857, 
  "execution_time_kernel_0": 0.003857, 
  "execution_time_kernel_1": 0.114619, 
  "execution_time_kernel_2": 2.752184
}

Execution time: 0.004 sec.

I am wondering what the different timings mean, e.g. execution_time_kernel_1?

Chunosov commented 6 years ago

execution_time_kernel_0 - model loading time
execution_time_kernel_1 - image loading and preparation time
execution_time_kernel_2 - classification time

But I'm not sure why execution_time only accounts for the first of them. We should consult with @gfursin here.

Chunosov commented 6 years ago

So should the TFLite package version be 0.1.7 instead of 1.7.0 or is it so called because it's close to TF 1.7.0?

Currently, we take the sources at the tag v1.7.0; neither the tag tflite-v0.1.7 nor the file RELEASE.md existed at that time. So maybe it had not been released at all when we created our package :) Now we can switch to the tag tflite-v0.1.7 and rename the package.

gfursin commented 6 years ago

As for execution_time_kernel_0 being reported as execution_time: it's just a matter of convention. We originally made execution_time=execution_time_kernel_0 when timers are exposed from an external program via xOpenME (I no longer remember why), and we keep it like this for compatibility reasons. However, when higher-level CK modules and analysis scripts see execution_time_kernel_X, they do not look at the overall execution_time but at the individual kernel times (and sum them as needed). If you look at the stats in cKnowledge.org/repo, you will see the 3 timings and an overall time, which is summed when preparing the table ...
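
To illustrate the summing described here, a rough sketch follows. It is not the code used by the CK modules; sum_kernel_times is a hypothetical helper, and it assumes the timer file is pretty-printed with one "key": value pair per line, as in the tmp-ck-timer.json output shown earlier in this thread.

```shell
# Sum all execution_time_kernel_N values in a CK timer file.
# Assumption: one "key": value pair per line; this is not a general JSON parser.
# The plain "execution_time" entry is skipped because the pattern requires "_kernel_".
sum_kernel_times() {
  awk -F'[:,]' '
    /"execution_time_kernel_[0-9]+"/ { total += $2 }
    END { printf "%.6f\n", total }
  ' "$1"
}
```

For the Raspberry Pi 2 timings reported above, sum_kernel_times tmp-ck-timer.json would print 2.870660, i.e. the overall time is dominated by classification rather than the 0.004 sec shown as execution_time.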

gfursin commented 6 years ago

It's also possible to make a post-processing script that prints "user-friendly" timer names, but we just haven't had time. So all our image classifiers return these three timers as a standard (this allows us to add such scenarios to our crowd-benchmarking Android app) ...

BTW, @mohamedadaly, it's cool that TFLite and image classification worked on RPi!
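
A post-processing step of the kind mentioned above might look roughly like this. It is only a sketch: print_friendly_timers is a hypothetical name, the labels are taken from the explanation of the three timers earlier in this thread, and the timer file is again assumed to have one "key": value pair per line.

```shell
# Print CK kernel timers with descriptive labels.
# Assumption: one "key": value pair per line; the labels follow the
# timer descriptions given earlier in this thread.
print_friendly_timers() {
  awk -F'[:,]' '
    /"execution_time_kernel_0"/ { printf "model loading: %.6f sec\n", $2 }
    /"execution_time_kernel_1"/ { printf "image loading and preparation: %.6f sec\n", $2 }
    /"execution_time_kernel_2"/ { printf "classification: %.6f sec\n", $2 }
  ' "$1"
}
```

It would be called as print_friendly_timers tmp-ck-timer.json after a ck run.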

mohamedadaly commented 6 years ago

@gfursin @Chunosov Thanks! That makes sense.

Is it possible to share the commands/steps used to create the library, program, scripts, etc. for TF Lite? I would like to try out cross-compiling for RPi 2/3.

Chunosov commented 6 years ago

We just use the Makefile provided in tensorflow/contrib/lite/ and have only patched it slightly to let it know about the NDK path. See the dir in our repo: https://github.com/ctuning/ck-tensorflow/tree/master/script/install-lib-tflite-src-static. But it seems to already know about RPi and has the build target RPI. See tensorflow/contrib/lite/rpi_makefile.inc in the TF sources. So maybe it is enough just to set some paths to the RPi cross-compiler.

gfursin commented 6 years ago

Most of the time we specialize compilation via environment, i.e.

$ ck install package:lib-tflite-1.7.0-src-static --env.USE_RPI=ON

and then specialize the install scripts or Makefile (as @Chunosov said) in the CK package to use this flag ... If a major change in the build process is required, we create a new package, but I don't think this is the case here.
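
As a hedged illustration of what "using this flag" could mean inside an install script: the fragment below is hypothetical and not taken from the CK package. It assumes that a value passed as --env.USE_RPI=ON is visible to the script as the environment variable USE_RPI, and it reuses the TARGET and TARGET_ARCH settings from the RPi make command quoted earlier in this thread.

```shell
# Hypothetical fragment of a package install script.
# Assumption: --env.USE_RPI=ON reaches the script as the environment variable USE_RPI.
tflite_make_target_args() {
  if [ "${USE_RPI:-OFF}" = "ON" ]; then
    # Build target and architecture from the RPi make command quoted earlier.
    echo "TARGET=RPI TARGET_ARCH=armv7"
  fi
}
```

The script could then append $(tflite_make_target_args) to its make invocation, so that a build without the flag stays unchanged.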

BTW, I recently updated the CK documentation and added some basic info about how to add new soft/packages: https://github.com/ctuning/ck/wiki/Adding-new-workflows . However, if it's complex, we can still help either here or via our CK mailing list or Slack channel:

mohamedadaly commented 6 years ago

Thanks! I will have a go at it, and will let you know if I have questions.

psyhtest commented 5 years ago

Closing due to inactivity. Feel free to reopen.