Hi, great to see progress made. These layers are not yet recognized by the CNTK importer, which focuses more on convolutional networks as a starting point.
For logistic regression, you can try updating tools/importers/CNTK/cntk_to_ell.py to handle the Times and Plus op_names for your model's needs.
We are in the process of documenting how to update the importer and refactoring it to make it easier to update, but here are a few notes to get you started in the meantime:
Okay, thanks, I will try to fix it. But I am still a little bit puzzled: how does the neural network implementation work if there are no plus and times operations? Basically, a logistic regression does the same thing as a perceptron, so I would assume it does nearly the same as your linear layer in the library. Is there maybe a possible workaround to create a model in CNTK that does not need times and plus? Unfortunately I could not find something like a linear layer in CNTK.
I have one more question regarding the expected workflow with ELL: right now I interpret it more as a compiler for CNTK models, but basically I could just design the whole model with ELL, right? Is there a recommended workflow for using ELL?
Hi there!
I tried implementing the processing for the layers. I added this code:
# Converts a CNTK 'Plus' node into an ELL bias layer.
def process_plus_layer(layer, ellLayers):
    biasParameter = findParameterByName(layer.parameters, 'b', 0)
    biasVector = get_float_vector_from_cntk_trainable_parameter(biasParameter)
    layerParameters = ELL.LayerParameters(layer.ell_outputShapeMinusPadding, ELL.NoPadding(),
                                          layer.ell_outputShape, layer.ell_outputPaddingParameters)
    ellLayers.append(ELL.FloatBiasLayer(layerParameters, biasVector))
    return

# Converts a CNTK 'Times' node into an ELL fully connected layer.
def process_times_layer(layer, ellLayers):
    weightsParameter = findParameterByName(layer.parameters, 'W', 0)
    weightsTensor = get_float_tensor_from_cntk_dense_weight_parameter(weightsParameter)
    layerParameters = ELL.LayerParameters(layer.ell_inputShape, layer.ell_inputPaddingParameters,
                                          layer.ell_outputShapeMinusPadding, ELL.NoPadding())
    ellLayers.append(ELL.FloatFullyConnectedLayer(layerParameters, weightsTensor))
    return
and added this to convert_cntk_layers_to_ell_layers:
elif (cntkLayer.op_name == 'Times'):
    process_times_layer(cntkLayer, ellLayers)
elif (cntkLayer.op_name == 'Plus'):
    process_plus_layer(cntkLayer, ellLayers)
and this to get_filtered_layers_list:
elif ((currentLayer.op_name == 'Dense') or
      ...
      (currentLayer.op_name == 'Times') or
      (currentLayer.op_name == 'Plus')):
After this, the ELL pre-processing and the compiling work fine! This is the output:
Finished loading.
Pre-processing...
Times : 1x1x2 -> 1x1x2 | padding 0
Plus : 1x1x2 -> 1x1x2 | padding 0
Softmax : 1x1x2 -> 1x1x2 | padding 0
Finished pre-processing.
Constructing equivalent ELL layers from CNTK...
Converting layer Times(Tensor[2]) -> Tensor[2]
Converting layer Plus(Tensor[2]) -> Tensor[2]
Converting layer Softmax(Tensor[2]) -> Tensor[2]
...Finished constructing ELL layers.
Compiling and cross-compiling are done via:
compile -imap mymodel.map --header --ir
llc-3.9 -mtriple=armv7m-unknown-none-eabi -march=thumb -mcpu=cortex-m3
-mattr=+armv7-m,+v7 -float-abi=soft -filetype=obj mymodel.ll
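For context, the llc step above only produces an object file (mymodel.o by default); it still has to be linked into the firmware project. Purely as a hypothetical illustration (the toolchain, main.c, and output name are assumptions, not from this thread, and a Bosch XDK project would normally build through its own SDK), a bare-metal link step could look roughly like:

arm-none-eabi-gcc -mcpu=cortex-m3 -mthumb main.c mymodel.o -o firmware.elf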
Then I get this error when injecting the code into my project and compiling it:
In function `_Node__MatrixVectorMultiplyNode_float__in_4_2_out_2':
mymodel.ll:(.text+0xe4): undefined reference to `cblas_sgemv'
This is the model I export from CNTK (I added a plus layer compared to the original from the tutorial):
import cntk as C

# input_var, input_dim and output_dim come from the tutorial's data setup
weight_param = C.parameter(shape=(input_dim, output_dim), name='W')
bias_param = C.parameter(shape=(output_dim), name='b')
C.softmax(C.plus(C.times(input_var, weight_param), bias_param))
Any ideas what's going wrong?
Okay, immediately found my error. The compile command has to be called with --blas false, so that there are no function calls to BLAS.
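Putting the two together, the compile invocation from above with BLAS disabled becomes:

compile -imap mymodel.map --header --ir --blas false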
Glad you found the error and were able to implement the layers you need!
As for your questions:
How does the neural network implementation work if there are no plus and times operations?
A: We do support ElementTimes and Bias. Some CNTK layers do similar operations (depending on whether you are calling a function or a block), and the convolutional networks that we started with for vision happen not to use Times and Plus. We are actually working on adding "Plus" in the next release because we have encountered other networks that need it. The choice of CNTK layer is made at model design time, and we happened to be using ElementTimes instead of Times.
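To make that design-time choice concrete, here is a minimal CNTK sketch (the variable names and the 2-wide sizes are assumptions, mirroring the tutorial snippet above) contrasting the explicit Times/Plus formulation with CNTK's Dense layer block; 'Dense' is one of the op_names the importer's filter already lists, though you would still want to verify it covers your exact model:

import cntk as C

input_dim, output_dim = 2, 2                    # assumed sizes, matching the 1x1x2 shapes above
input_var = C.input_variable(input_dim)

# Explicit ops: the graph ends up with separate 'Times' and 'Plus' nodes.
W = C.parameter(shape=(input_dim, output_dim), name='W')
b = C.parameter(shape=(output_dim,), name='b')
z_ops = C.softmax(C.plus(C.times(input_var, W), b))

# Dense block: weights and bias live inside a single 'Dense' layer.
z_dense = C.softmax(C.layers.Dense(output_dim)(input_var))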
I have one more question regarding the expected workflow with ELL: right now I interpret it more as a compiler for CNTK models, but basically I could just design the whole model with ELL, right? Is there a recommended workflow for using ELL?
A: You can design a model with ELL from the ground up, especially if the model is fairly simple like what you tried. However, since ELL doesn't support training (yet), it is often useful to author and train more complex models using CNTK, and then import them into ELL. The "import into ELL" workflow also applies if you have existing trained models (e.g. Darknet).
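As a rough sketch of that recommended workflow, using the commands and file names already shown in this thread (the exact importer entry point and model file name are assumptions and may differ between ELL versions):

1. Author and train the model in CNTK (Python) and save it, e.g. z.save('mymodel.cntk').
2. Convert the saved model to an ELL map (mymodel.map) with the CNTK importer under tools/importers/CNTK.
3. Compile, cross-compile and link for the target, as earlier in this thread:
   compile -imap mymodel.map --header --ir --blas false
   llc-3.9 -mtriple=armv7m-unknown-none-eabi -march=thumb -mcpu=cortex-m3 -mattr=+armv7-m,+v7 -float-abi=soft -filetype=obj mymodel.ll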
Hi there!
So far I am happy to have a working CNTK -> Cortex-M3 pipeline. But I was wondering why my results on the hardware differ so much from the test results in the simulation in CNTK. I used the simple logistic regression example and deployed it to my hardware (Bosch XDK). When I saw that the results differ, I noticed the message of the ELL compiler stating:
Obviously, leaving out the times and plus just leads to an evaluation of the input data by the softmax. How can it be that these layers are marked as irrelevant? How can I influence this? Is this a bug or a feature?
Best, Hagen