Closed mrgloom closed 7 years ago
Hi!
We are actually working on that. We plan to provide new crowd-benchmarking capability for PC GPUs before May. In fact, we already have some stats in the same repository (cKnowledge.org/repo) under different crowd-scenario "crowd-benchmark Caffe library" but it's quite outdated (we have been recently focusing on embedded devices). We plan to update it before May.
As for the forward-pass benchmark, I will let @psyhtest reply - he is dealing with different CNN architectures ...
@mrgloom A very nice table! With CK-Caffe and Jupyter, it would be easy to obtain results for your table on another machine, as well as display results from several machines. If you volunteer :), I can help you convert your repository into the CK format.
Seems you already have forward-pass measurement functionality via caffe-time
https://github.com/dividiti/ck-caffe/wiki/Installation#linux
So how can we add other deploy.prototxt network architectures to the benchmark?
@mrgloom
Seems you already have forward-pass measurement functionality via caffe-time
That's right. To start with, you can simply run:
$ ck compile program:caffe-time
$ ck run program:caffe-time
@mrgloom
So how can we add other deploy.prototxt network architectures to the benchmark?
It's rather straightforward. We've recently added quite a few model descriptions, e.g. JacintoNet11, VGG16, VGG19, ResNet50, ResNet101 and ResNet152.
You start by adding a new package to a CK repository (ck-caffe in our case), e.g.:
$ ck add ck-caffe:package:caffemodel-tidsp-jacintonet11-non-sparse
This command automatically creates a directory in your repository (e.g. package/caffemodel-tidsp-jacintonet11-non-sparse), a pair of alias files (under package/.cm) and three metadata files (under package/caffemodel-tidsp-jacintonet11-non-sparse/.cm). You just need to customise the meta.json file (starting from a copy of an existing meta.json from one of the descriptions mentioned above) and add your deploy.prototxt (and any other files you wish) to this directory.
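As a rough sketch of the layout these steps produce (assuming a Unix shell; the stand-in files below only simulate what `ck add` and your copies would create - in a real repository you would copy a meta.json from an existing model package instead):

```shell
# Illustrative sketch only: simulate the package layout described above.
# In reality `ck add` creates the directory and the .cm metadata for you.
PKG=package/caffemodel-tidsp-jacintonet11-non-sparse
mkdir -p "$PKG/.cm"

# Start from an existing model's metadata rather than writing it from scratch:
#   cp package/<existing-model-package>/.cm/meta.json "$PKG/.cm/meta.json"
echo '{}' > "$PKG/.cm/meta.json"   # stand-in for the copied meta.json

# Add your network description (and any other files) to the package directory:
#   cp /path/to/your/deploy.prototxt "$PKG/"
touch "$PKG/deploy.prototxt"       # stand-in

ls "$PKG"
```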
So while this is easy, the more difficult question to answer is why CNNs for solving the Kaggle cats and dogs challenge would make a good PC benchmark? Is this workload relevant for PC users?
I don't want to focus on the Kaggle cats vs. dogs challenge itself, but on measuring the forward/backward pass of the CNN models that I used in this challenge.
I will try to run program:caffe-time and add my own models, using the existing models as examples.
Note that I very recently added a simpler way to download and install various packages in CK across different platforms, so we are gradually moving the current Caffe/TF packages to this format (to be able to install them universally on Linux, Windows and Android). You can find an example of a universal LLVM 4.0.0 package here: https://github.com/ctuning/ck-env/tree/master/package/compiler-llvm-4.0.0-universal (see meta.json and custom.py). For models, you probably just need meta.json without custom.py ...
Do you have a forward-pass benchmark of different CNN architectures? It's not very clear from the documentation.
I just want to measure the forward-pass time, something like here, but the table should contain different GPU models.
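For a quick number outside CK, a minimal Python timing harness could look like the sketch below; the pycaffe calls in the comments use the standard Caffe Python API, but the model path is a placeholder, and the harness itself works with any callable:

```python
import time

def time_forward(forward_fn, iterations=50, warmup=5):
    """Return the average wall-clock seconds per call of forward_fn."""
    for _ in range(warmup):          # warm up caches / lazy initialisation
        forward_fn()
    start = time.perf_counter()
    for _ in range(iterations):
        forward_fn()
    return (time.perf_counter() - start) / iterations

# With pycaffe (placeholder path), you would time the real forward pass:
#   import caffe
#   caffe.set_mode_gpu()
#   net = caffe.Net('deploy.prototxt', caffe.TEST)
#   print('forward: %.2f ms' % (time_forward(net.forward) * 1000.0))

# Stand-in demo with a trivial callable:
avg = time_forward(lambda: sum(range(1000)), iterations=10, warmup=2)
print('avg per call: %.6f s' % avg)
```

Note this measures wall-clock time; for GPU runs the numbers are only meaningful after warm-up, which is why a few untimed iterations are run first.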
Also, do you have results for PC GPUs? It seems there are only some results for mobile devices here.