Closed: syedaffanhamdani closed this issue 5 years ago
I'm afraid that in this version, if you actually use the fragment in the NNEF file, then yes, you need to define a shape inference function.
The conversion of custom operations is not yet documented, but let me write you an example to get this working. I'm not sure exactly what you would like to achieve, but here is an example:
I assume you have this custom_quantize.nnef file:
extension KHR_enable_fragment_definitions;
extension KHR_enable_operator_expressions;

fragment custom_quantize( x: tensor<scalar>, min: tensor<scalar>, max: tensor<scalar>, bits: integer ) -> ( y: tensor<scalar> )
{
    r = scalar(2 ^ bits - 1);
    z = clamp(x, min, max);
    q = round((z - min) / (max - min) * r);
    y = q / r * (max - min) + min;
}

graph network(i) -> (o)
{
    i = external(shape=[1, 2, 3]);
    o = custom_quantize(i, min=0.0, max=1.0, bits=3);
}
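For intuition, here is a quick NumPy sketch of my own (not part of the tools) that mirrors the fragment's arithmetic; lo/hi stand in for the fragment's min/max, renamed to avoid shadowing Python builtins:

import numpy as np

def custom_quantize_ref(x, lo, hi, bits):
    # fake-quantization: snap values in [lo, hi] onto 2^bits - 1 uniform steps
    r = 2 ** bits - 1                          # 7 steps for bits=3
    z = np.clip(x, lo, hi)
    q = np.round((z - lo) / (hi - lo) * r)
    return q / r * (hi - lo) + lo

print(custom_quantize_ref(np.array([0.1, 0.24, 0.51, 0.9]), 0.0, 1.0, 3))
# -> approximately [0.143 0.286 0.571 0.857], i.e. values snapped to multiples of 1/7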
To convert this to TensorFlow Python code, for example, you have to create a custom module (for example, custom_nnef_ops.py):
# no need to define the fragment here if it is present in the NNEF file
NNEF_OP_DEFINITIONS = ""

# we have to lower the operation if we don't want to write a custom converter
NNEF_LOWERED_OPS = ["custom_quantize"]

def custom_quantize_prop(x, min, max, bits):
    # the output shape is the same as the input shape
    return x

NNEF_SHAPE_PROPAGATORS = {
    "custom_quantize": custom_quantize_prop,
}
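As an aside, the comment above suggests that if the fragment were not present in the NNEF file itself, its definition would go into NNEF_OP_DEFINITIONS instead; this is my untested reading:

# untested sketch (my assumption based on the comment above):
NNEF_OP_DEFINITIONS = """
fragment custom_quantize( x: tensor<scalar>, min: tensor<scalar>, max: tensor<scalar>, bits: integer ) -> ( y: tensor<scalar> )
{
    # ... same body as in custom_quantize.nnef above ...
}
"""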
And you have to use this command to convert:
./nnef_tools/convert.py \
    --input-format nnef \
    --output-format tensorflow-py \
    --input-model custom_quantize.nnef \
    --custom-converters custom_nnef_ops
This is the output of the converter:
from __future__ import division, print_function, absolute_import

from collections import OrderedDict

import tensorflow as tf


def network():
    t_Sub = tf.subtract(x=1.0, y=0.0)
    t_Sub_1 = tf.subtract(x=1.0, y=0.0)
    t_i = tf.placeholder(shape=[1, 2, 3], dtype=tf.float32, name='i')
    t_clip_by_value = tf.clip_by_value(t=t_i, clip_value_min=0.0, clip_value_max=1.0)
    t_Sub_2 = tf.subtract(x=t_clip_by_value, y=0.0)
    t_truediv = tf.divide(x=t_Sub_2, y=t_Sub_1)
    t_Mul = tf.multiply(x=t_truediv, y=7.0)
    t_Round = tf.round(x=t_Mul)
    t_truediv_1 = tf.divide(x=t_Round, y=7.0)
    t_Mul_1 = tf.multiply(x=t_truediv_1, y=t_Sub)
    t_Add = tf.add(x=t_Mul_1, y=0.0)
    return OrderedDict([
        ("o", t_Add)
    ])
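The generated function builds a TF1-style graph, so, assuming TensorFlow 1.x and that network() is imported from the generated module, it could be run roughly like this (a sketch, not part of the converter output):

import numpy as np
import tensorflow as tf

outputs = network()                            # builds the graph; returns the output dict
with tf.Session() as sess:
    # feed the placeholder named 'i' by its tensor name
    result = sess.run(outputs["o"],
                      feed_dict={"i:0": np.random.rand(1, 2, 3).astype(np.float32)})
    print(result)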
A bundle of thanks, but I only want to quantize the weights using different algorithms (not with a static scale). My understanding was that NNEF quantizes the binary weights stored in the .dat files and then just exports them to TensorFlow. Having the quantization algorithm in the exported TensorFlow model would make it clumsier.
I wish to read the min and max of each tensor and choose the scale accordingly.
Is there a way I can just quantize the weights stored in the .dat files using the NNEF tools? Many thanks in advance.
NNEF itself is a storage format, not a toolset for manipulating models. The tools only convert models so that they can be stored in NNEF format; they do not manipulate models or do training for you. If you want quantized networks, you first have to train them that way, or apply post-training quantization, and then convert them to NNEF format.
Possibly in the future, we will introduce tools for manipulating models, such as post-training quantization, but that work is not yet done.
It is not clear to us exactly what you want to achieve; could you elaborate on the process you have in mind?
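For reference, here is a rough, hypothetical sketch of such a post-training step done outside the converter, using the read_tensor/write_tensor helpers of the nnef Python package, choosing the scale from each tensor's own min and max as described above. The exact signatures are an assumption on my part and may differ between package versions:

import numpy as np
import nnef

def quantize_dat_file(path, bits=8):
    with open(path, 'rb') as f:
        tensor = nnef.read_tensor(f)           # assumption: returns a numpy array
    lo, hi = float(tensor.min()), float(tensor.max())   # per-tensor scale
    r = 2 ** bits - 1
    q = np.round((tensor - lo) / (hi - lo) * r)
    with open(path, 'wb') as f:
        nnef.write_tensor(f, (q / r * (hi - lo) + lo).astype(tensor.dtype))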
Any notes on this? Can it be closed?
Do we need to define a shape inference function as well if we define a custom fragment in the graph.nnef file? I am getting the following error when trying to define a custom quantization fragment.
[code fragment]
[stack trace]
Many thanks in advance!