** =predictor/=
A simple C++ library that wraps the common pattern of running a =caffe::Net= in multiple threads while sharing weights. It also provides a slightly more convenient usage API for the inference case.
#+BEGIN_SRC C++
// In your setup phase
predictor_ = Predictor::paths(FLAGS_prototxt_path,
                              FLAGS_weights_path,
                              FLAGS_optimization);

// When calling in a worker thread
static thread_local caffe::Blob<float> input_blob;
input_blob.set_cpu_data(input_data); // avoids copying the input
const auto& output_blobs = predictor_->forward({&input_blob});
#+END_SRC
Of note is =predictor/Optimize.{h,cpp}=, which optimizes memory usage by automatically reusing intermediate activations where this is safe. This reduces the memory required for intermediate activations by around 50% for AlexNet-style models, and around 75% for GoogLeNet-style models.
We can plot each set of activations in the topological ordering of the network, using a unique color for each reused activation buffer, with the height of each blob proportional to the size of the buffer.
For example, in an AlexNet-like model, the allocation looks like
[[./doc/caffenet.png]]
A corresponding allocation for GoogLeNet looks like
[[./doc/googlenet.png]]
The idea is essentially linear scan register allocation; see the sketch after this list. We
- compute a live range for each =caffe::SyncedMemory= (due to sharing, we can't do this at the =caffe::Blob= level),
- schedule the =caffe::SyncedMemory= instances with non-overlapping live ranges onto a shared set of canonical buffers,
- size each canonical buffer to the largest activation assigned to it, and
- update the blobs' internal pointers to point at their canonical buffer.
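To make the idea concrete, here is a minimal, self-contained sketch of greedy live-interval packing. It is not the library's actual implementation (that lives in =predictor/Optimize.{h,cpp}= and operates on =caffe::SyncedMemory=); =LiveRange= and =assignBuffers= are hypothetical names for this example.

#+BEGIN_SRC C++
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
struct LiveRange {
  int first_use;      // first layer (topological order) that writes the blob
  int last_use;       // last layer that reads it
  std::size_t bytes;  // size of the activation
  int buffer = -1;    // canonical buffer assigned by assignBuffers()
};

// Greedy linear-scan packing: two activations may share a canonical buffer
// iff their live ranges do not overlap.
std::size_t assignBuffers(std::vector<LiveRange>& ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const LiveRange& a, const LiveRange& b) {
              return a.first_use < b.first_use;
            });
  std::vector<int> freeAt;            // per buffer: step at which it frees up
  std::vector<std::size_t> highWater; // per buffer: largest activation so far
  for (auto& r : ranges) {
    int chosen = -1;
    for (std::size_t b = 0; b < freeAt.size(); ++b) {
      if (freeAt[b] < r.first_use) { // previous user is dead: safe to reuse
        chosen = static_cast<int>(b);
        break;
      }
    }
    if (chosen < 0) {                // no reusable buffer: allocate a new one
      chosen = static_cast<int>(freeAt.size());
      freeAt.push_back(0);
      highWater.push_back(0);
    }
    freeAt[chosen] = r.last_use;
    highWater[chosen] = std::max(highWater[chosen], r.bytes);
    r.buffer = chosen;
  }
  return freeAt.size(); // number of canonical buffers actually needed
}
#+END_SRC

Each canonical buffer is sized to its high-water mark, so the total allocation shrinks from "one buffer per activation" to "one buffer per live interval", which is where savings of the magnitude quoted above come from.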
Depending on the model, the buffer reuse can also lead to some non-trivial performance improvements at inference time.
To enable this optimization, pass =Predictor::Optimization::MEMORY= to the =Predictor= constructor.
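For example (a minimal sketch, assuming the three-argument =Predictor::paths= factory from the setup snippet above accepts the optimization mode as its last argument):

#+BEGIN_SRC C++
// Construct a predictor with activation-buffer reuse enabled.
predictor_ = Predictor::paths(FLAGS_prototxt_path,
                              FLAGS_weights_path,
                              Predictor::Optimization::MEMORY);
#+END_SRC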
=predictor/PooledPredictor{.h,cpp}= maintains a thread-pool with thread-local instances of =caffe::Net=. Calls to =PooledPredictor::forward()= are enqueued on a =folly::MPMCQueue= and dequeued by the thread-pool for processing. Calls to =forward()= are non-blocking and return a =folly::Future= that will be satisfied when the forward pass job finishes.

=PooledPredictor= also supports running multiple models over the same thread-pool. That is, if you load two models, each thread in the thread-pool will maintain two instances of =caffe::Net= (one for each model), and the =netId= param in =forward()= specifies the model to run. =PinnedPooledPredictor= is an abstraction over =PooledPredictor= for the multiple-model case that pins =forward()= calls to a specific model.
#+BEGIN_SRC C++
// In your setup phase
caffe::fb::PooledPredictor::Config config;
config.numThreads = 10;
config.optimization = caffe::fb::Predictor::Optimization::MEMORY;
config.protoWeightPaths_.emplace_back(FLAGS_prototxt_path,
                                      FLAGS_weights_path);
pooledPredictor = caffe::fb::PooledPredictor::makePredictor(config);

// When calling the predictor
caffe::fb::PooledPredictor::OutputLayers output_blobs;
pooledPredictor->forward({&input_blob}, &output_blobs)
    .then([&] {
      const auto& output_blob = output_blobs[FLAGS_output_layer_name];
      // Do something with output_blob
    });
#+END_SRC
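For the multi-model case, a sketch along the following lines should apply. The exact =forward()= overload that takes the =netId= is an assumption here; check =PooledPredictor.h= for the real signature, and note that the =FLAGS_model_*= names are hypothetical:

#+BEGIN_SRC C++
// Load two models over the same thread-pool. Each worker thread will hold
// one caffe::Net per model; netIds follow the order of insertion below.
caffe::fb::PooledPredictor::Config config;
config.numThreads = 10;
config.protoWeightPaths_.emplace_back(FLAGS_model_a_prototxt, // hypothetical flags
                                      FLAGS_model_a_weights);
config.protoWeightPaths_.emplace_back(FLAGS_model_b_prototxt,
                                      FLAGS_model_b_weights);
auto pooledPredictor = caffe::fb::PooledPredictor::makePredictor(config);

// Run the second model. Passing the netId as a leading argument is an
// assumed signature for this sketch; see PooledPredictor.h.
caffe::fb::PooledPredictor::OutputLayers output_blobs;
pooledPredictor->forward(/* netId = */ 1, {&input_blob}, &output_blobs)
    .then([&] { /* consume output_blobs */ });
#+END_SRC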
** =torch2caffe/=
A library for converting pre-trained Torch models to the equivalent Caffe models.
=torch_layers.lua= describes the set of layers that we can automatically convert, and =test.lua= shows some examples of more complex models being converted end to end.
Examples include complex CNNs ([[http://arxiv.org/abs/1409.4842][GoogLeNet]], etc.), deep LSTMs (created in [[https://github.com/torch/nngraph][nngraph]]), and models with tricky parallel/split connectivity structures ([[http://arxiv.org/abs/1103.0398][Natural Language Processing (almost) from Scratch]]).
This can be invoked as

#+BEGIN_EXAMPLE
∴ th torch2caffe/torch2caffe.lua --help
--input         (default "")    Input model file
--preprocessing (default "")    Preprocess the model
--prototxt      (default "")    Output prototxt model file
--caffemodel    (default "")    Output model weights file
--format        (default "lua") Format: lua | luathrift
--input-tensor  (default "")    (Optional) Predefined input tensor
--verify        (default "")    (Optional) Verify existing
#+END_EXAMPLE