Closed: @mohamedadaly closed this issue 5 years ago.
Hi @mohamedadaly
Thanks for your interest in CK! To answer your questions:
Does this package also build TensorFlow Lite, or just the whole of TensorFlow?
CK-TensorFlow supports building TensorFlow from source (CPU, CPU+XLA, CUDA, CUDA+XLA; v1.4.0 .. v1.8.0), as well as TensorFlow_CC. As we are looking to run (the TensorFlow part of) our ReQuEST submission on Android devices, we may need to start supporting TensorFlow Lite as well. Please stay tuned!
/cc @Chunosov @gfursin
Does it also support cross-compilation? My device is quite resource limited and slow, and I would prefer to avoid a native build on it, as I do now.
I totally agree with you. Here are some recent stats for building TensorFlow 1.8.0 on several aarch64 platforms:
Firefly RK3399 (4 GB RAM, with swap):
$ ck install package:lib-tensorflow-1.8.0-src-cpu --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=3
...
Installation time: 24036.3912299 sec.
Linaro HiKey960 (3 GB RAM, with swap):
$ ck install package:lib-tensorflow-1.8.0-src-cpu --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=2
...
Installation time: 28039.245172 sec.
Jetson TX1 (4 GB RAM, without swap):
$ ck install package:lib-tensorflow-1.8.0-src-cpu --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=1
...
Installation time: 51031.3029599 sec.
(Apparently, it takes about 3 hours on Jetson TX2.)
Unfortunately, cross-compiling TensorFlow is not an easy feat:
it can be challenging to run directly on devices that have limited resources, such as the Raspberry Pi. It’s also not easy to set up cross-compilation if you’re compiling on a different machine than you’re deploying to
(according to @petewarden in "Building Mobile Applications with TensorFlow")
If we would get some reasonable instructions for cross-compiling TensorFlow, we could incorporate them into CK-TensorFlow. Do you perhaps know of any such instructions to start with?
In fact, some instructions are referenced from https://github.com/ctuning/ck-tensorflow/issues/44.
We had a very hard time cross-compiling TensorFlow, but TF Lite is a lot easier. For example, to cross-compile for the Pi, you could do:
# Set CC and CXX to your toolchain binaries
# compile TF Lite
make -f tensorflow/contrib/lite/Makefile CXX="${CXX}" CC="${CC}" LIBS="-lstdc++ -lpthread -lm -ldl" all -j4 TARGET=RPI TARGET_ARCH=armv7
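To make the placeholder comments above concrete: for a Pi 2/3 (armv7, hard-float), the toolchain binaries typically follow the arm-linux-gnueabihf naming convention. A minimal sketch (the prefix is an assumption; adjust to your toolchain's actual name and install location):

```shell
# Hypothetical cross-toolchain prefix for armv7 hard-float (Pi 2/3);
# adjust to where your toolchain is actually installed.
TOOLCHAIN_PREFIX=arm-linux-gnueabihf
export CC=${TOOLCHAIN_PREFIX}-gcc
export CXX=${TOOLCHAIN_PREFIX}-g++
echo "CC=${CC} CXX=${CXX}"
# Then invoke the TFLite Makefile as above, passing CC and CXX through.
```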
Thanks @mohamedadaly ! It's probably going to be more tricky for Android, but we'll look into it.
Thanks!
In case it's helpful, they seem to have instructions for building an Android demo app here.
Thanks a lot @mohamedadaly for details. If it's indeed possible to build TFLite with just one line while explicitly specifying all compilers and libs, it should be possible to build it via CK. We will have a look at it as soon as we complete a few urgent things. We will keep in touch!
Now we have TFLite package supporting cross-compilation for Android: https://github.com/ctuning/ck-tensorflow/tree/master/package/lib-tflite-1.7.0-src-static
@mohamedadaly - you can try the following now:
$ ck pull all
$ ck install package:lib-tflite-1.7.0-src-static --target_os=android21-arm64
$ ck compile program:tflite-classification --target_os=android21-arm64
$ ck run program:tflite-classification --target_os=android21-arm64
I just tested it and it works. I also added the same scenarios to our Android app:
Public benchmarking results are then collected here:
Hope it's of any help!
@gfursin Awesome. Thanks!
I was wondering how easy/difficult it would be to have similar packages for Raspberry Pi 2 (arm32 with soft/hard float) and Pi 3 (aarch64 with soft/hard float)? Where do I need to make changes etc. to support that?
@mohamedadaly You are welcome! As for the arm32 and aarch64 platforms, do you mean building natively or cross-compiling? The former should already work. Would you mind trying the same on your platforms without the --target_os=android21-arm64 flag?
$ ck pull all
$ ck install package:lib-tflite-1.7.0-src-static
$ ck compile program:tflite-classification
$ ck run program:tflite-classification
Please report any problems!
Hmm, it appears that TFLite is currently at version v0.1.7:
TensorFlow Lite 0.1.7 is based on tag tflite-v0.1.7 (git commit fa1db5eb0da85b5baccc2a46d534fdeb3bb473d0).
To reproduce the iOS library, it's required to cherry pick git commit f1f1d5172fe5bfeaeb2cf657ffc43ba744187bee to fix a dependency issue.
The code is based on TensorFlow 1.8.0 release candidate and it's very close to TensorFlow 1.8.0 release.
So should the TFLite package version be 0.1.7 instead of 1.7.0, or is it so called because it's close to TF 1.7.0?
@psyhtest Yes, I mean cross compiling. I will try building natively on my RPi 2 and let you know how it goes. Please let me know if it's possible to cross-compile for the Pi 2/3.
@psyhtest I am actually not sure how the version numbers work for TF Lite :) The tflite repo seems to have been updated after May 7, the date of the last change to the RELEASE.md file.
It builds fine on the Raspberry Pi 2, and gives the following timing information:
(reading fine grain timers from tmp-ck-timer.json ...)
{
"execution_time": 0.003857,
"execution_time_kernel_0": 0.003857,
"execution_time_kernel_1": 0.114619,
"execution_time_kernel_2": 2.752184
}
Execution time: 0.004 sec.
I am wondering what the different timings are, e.g. execution_time_kernel_1?
execution_time_kernel_0 - model loading time
execution_time_kernel_1 - image loading and preparation time
execution_time_kernel_2 - classification time
But I'm not sure why execution_time only accounts for the first of them. We should consult with @gfursin here.
So should the TFLite package version be 0.1.7 instead of 1.7.0 or is it so called because it's close to TF 1.7.0?
Currently, we take the sources by the tag v1.7.0, and the tag tflite-v0.1.7 and the file RELEASE.md did not exist at that moment. So maybe TFLite had not been released at all when we created our package :)
Now we can switch to the tag tflite-v0.1.7 and rename the package.
As for execution_time_kernel_0 being attributed to execution_time - it's just a matter of convention. For various reasons, we originally made execution_time=execution_time_kernel_0 when it is exposed from an external program via xOpenME (I no longer remember why), and we keep it like this for compatibility. However, when higher-level CK modules and analysis scripts see execution_time_kernel_X timers, they do not look at the overall execution_time, but at the individual kernel times (and sum them as needed). If you look at the stats on cKnowledge.org/repo, you will see the 3 timings and an overall time, which is summed when preparing a table.
It's also possible to make a post-processing script and print "user-friendly" timer names, but we just didn't have time. So all our image classifiers return these three timers as a standard (this allows us to add such scenarios to our crowd-benchmarking Android app).
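A minimal sketch of such a post-processing step, summing the per-kernel timers the way the higher-level CK modules do. It assumes the tmp-ck-timer.json layout from the RPi 2 run above (the file is recreated here so the snippet is self-contained):

```shell
# Recreate the timer file from the run above, then sum the
# execution_time_kernel_* entries to get the true overall time.
cat > tmp-ck-timer.json <<'EOF'
{
  "execution_time": 0.003857,
  "execution_time_kernel_0": 0.003857,
  "execution_time_kernel_1": 0.114619,
  "execution_time_kernel_2": 2.752184
}
EOF
python3 -c "
import json
with open('tmp-ck-timer.json') as f:
    d = json.load(f)
total = sum(v for k, v in d.items() if k.startswith('execution_time_kernel_'))
print('Model load: %.6f s' % d['execution_time_kernel_0'])
print('Image prep: %.6f s' % d['execution_time_kernel_1'])
print('Classify:   %.6f s' % d['execution_time_kernel_2'])
print('Total:      %.6f s' % total)
"
```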
BTW, @mohamedadaly , it's cool that TFLite and image classification worked on the RPi!
@gfursin @Chunosov Thanks! That makes sense.
Is it possible to share the commands/steps used to create the library, program, scripts, etc. for TF Lite? I would like to try out cross-compiling for RPi 2/3.
We just use the Makefile provided in tensorflow/contrib/lite/ and have only patched it slightly to let it know about the NDK path. See the dir in our repo: https://github.com/ctuning/ck-tensorflow/tree/master/script/install-lib-tflite-src-static.
But it already seems to know about the RPi and has the build target RPI. See tensorflow/contrib/lite/rpi_makefile.inc in the TF sources. So maybe it's enough to just set some paths to the RPi cross-compiler.
Most of the time we specialize compilation via environment, i.e.
$ ck install package:lib-tflite-1.7.0-src-static --env.USE_RPI=ON
and then specialize the install scripts or Makefile (as @Chunosov said) in the CK package to use this flag. If a major change in the build process is required, we create a new package, but I don't think this is the case here.
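For illustration, a minimal sketch of how an install script could branch on that flag (the variable names other than USE_RPI are hypothetical; CK passes --env.* values through the environment, and the flag values mirror the RPi make invocation earlier in this thread):

```shell
# USE_RPI would normally arrive via "--env.USE_RPI=ON"; set it here
# so the sketch is self-contained.
USE_RPI=ON

if [ "${USE_RPI}" = "ON" ]; then
  # Hypothetical extra flags matching the RPi make invocation above
  EXTRA_MAKE_FLAGS="TARGET=RPI TARGET_ARCH=armv7"
else
  EXTRA_MAKE_FLAGS=""
fi
echo "${EXTRA_MAKE_FLAGS}"
```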
BTW, I updated the CK documentation recently and added some basic info about how to add new soft/packages: https://github.com/ctuning/ck/wiki/Adding-new-workflows . However, if it's complex, we can still help either here or via our CK mailing list or Slack channel.
Thanks! I will have a go at it, and will let you know if I have questions.
Closing due to inactivity. Feel free to reopen.
Hi
I am new to CK, and have been using Tensorflow Lite on an embedded ARM-based device. I have two questions:
Thanks a lot.