I tested the four quantized Inception NNEF networks which were translated from TFLite. One interesting thing was that their concat operations also do rescaling, so that the output tensor has the same scale as the input with the largest range. One of the Inception models had concats with 2, 3 and 4 inputs.
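For readers unfamiliar with that behaviour, here is a minimal sketch of what such a rescaling concat does, assuming uint8 asymmetric quantization (my own illustration, not the converter's code; the name quantized_concat is hypothetical):

```python
import numpy as np

# Sketch of a rescaling concat over uint8 tensors: every input is requantized
# to the scale/zero-point of the input covering the widest real-value range,
# and that quantization becomes the output's quantization.
def quantized_concat(inputs, axis=0):
    # each input is (q, scale, zero_point), with real value = scale * (q - zero_point)
    ranges = [scale * 255.0 for (_, scale, _) in inputs]
    widest = int(np.argmax(ranges))
    _, out_scale, out_zp = inputs[widest]

    requantized = []
    for q, scale, zp in inputs:
        real = scale * (q.astype(np.float32) - zp)        # dequantize
        out_q = np.round(real / out_scale) + out_zp       # requantize to output params
        requantized.append(np.clip(out_q, 0, 255).astype(np.uint8))
    return np.concatenate(requantized, axis=axis), out_scale, out_zp
```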
The latest version of the converter tools in the Khronos repo support bidirectional conversion between TFLite and NNEF. Let me know if you have a chance to test it and if it works as needed for your use cases.
I only see tflite_to_tf_py and tf_py_to_tflite, and no quantized operations supported in either. My needs were to convert all the tflite quantized models to nnef graph and quant files and the associated weight and bias files.
If you run the main convert.py with tflite as input format and give it quantized TFLite files, it will convert them to quantized weights and quant files.
The tflite_to_tf_py and tf_py_to_tflite are only helpers; the conversion actually happens 'through TF', i.e. TFLite is converted to TF first (keeping the quant info) and then to NNEF.
https://www.tensorflow.org/api_docs/cc/group/nn-ops
Tensorflow has about 12 quantized ops, so I assumed you would be converting to tensorflow quantized ops from tflite ops with quantized data. I didn't see any quantized ops in your conversion files.
I'll give it a try later.
It's not actually converting to TF; it's just renaming and reparameterizing the ops in the in-memory structure so that the tf_to_nnef converter can be invoked.
In terms of parameterization there is no difference between quantized and non-quantized ops in TF; it's the actual implementation that differs (quantized ops do integer computation, which of course requires quantized tensors as inputs), but that is not relevant from the viewpoint of conversion.
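As an illustration of that point (a hypothetical mapping table, not the converter's actual one): the same TFLite operator can be renamed to a single TF-style op whether its tensors are float or quantized, because the quantization info travels with the tensors rather than with the op.

```python
# Hypothetical example: TFLite's CONV_2D is the same operator for float32 and
# uint8 tensors, so the in-memory renaming step needs no separate quantized variants.
TFLITE_TO_TF_OP_NAME = {
    "CONV_2D": "Conv2D",
    "DEPTHWISE_CONV_2D": "DepthwiseConv2dNative",
    "CONCATENATION": "ConcatV2",
    "AVERAGE_POOL_2D": "AvgPool",
}
```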
The parameters can be the same for NNEF. For the generated code I pass in downscale multipliers as parameters to the operations ... the info is derived from the quant file. Maybe that quantization downscale should be treated as a separate op.
I don't understand why it should be treated as a separate op. The actual info is present in the quant file, and you can translate that into calls in your code that take the downscale as a parameter to the ops; it's an implementation detail. So is there something missing that prevents you from doing that?
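For context, a sketch of how such a downscale multiplier can be derived from the scales recorded in the quant file (my own illustration; conv_uint8 and its downscale parameter are hypothetical names for the generated code):

```python
# The int32 accumulator of a quantized conv/matmul is effectively scaled by
# input_scale * weight_scale; dividing by the output scale gives the multiplier
# that maps accumulators onto the output's quantized domain.
def downscale_multiplier(input_scale, weight_scale, output_scale):
    return (input_scale * weight_scale) / output_scale

# e.g. pass it straight into the generated call:
# conv_uint8(x, w, bias, downscale=downscale_multiplier(0.02, 0.005, 0.1))
```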
I tried to install, but the Python install script gets errors with gcc on Ubuntu 18.04, for example:
include/flat/../common/parser.h:121:64: error: ‘std::function’ has not been declared
    static extensions_t readExtensions( Lexer& lexer, std::function<bool( const std::string& )> handler )
I'm attaching the errors. There are several.
err_nnef.zip
All errors seem to stem from the same missing <functional> header include. I added it, can you retry?
Yes, that worked for gcc. I didn't try clang.
I am using clang so it should be good
should there be a dependency for flatbuffers in the converters? I didn't see it listed, but tflite uses it.
Yes, that's true, it should be included in the list
Added that to the readme
I see a way to run the activation tests with:
python -m unittest discover -s 'tests/activation' -p '*layer_test_cases.py'
but I didn't see an instruction for the conversion utilities. This seems to work:
python -m unittest discover -s 'tests/functional' -p '*test.py'
I got back a result ... ran 30 tests
I had many failures running the activation tests. Initially I had tensorflow installed, but after the failures I installed tensorflow-gpu, and then got failures for missing libcuda.
Is all this really necessary? It seems much more complicated than what it takes to just read the .pb or .tflite file, which only requires the .pb or .tflite file schemas and protobuf or flatbuffers.
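To illustrate that point, a minimal sketch of reading a .tflite file with nothing but flatbuffers and the schema (assumes Python bindings generated from the TFLite schema with flatc --python and importable as a 'tflite' package; the exact module layout depends on how the bindings were generated):

```python
from tflite.Model import Model   # generated by: flatc --python schema.fbs

with open("mobilenet_v1_0.25_128_quant.tflite", "rb") as f:
    buf = f.read()

model = Model.GetRootAsModel(buf, 0)
subgraph = model.Subgraphs(0)
print("subgraphs:", model.SubgraphsLength(),
      "tensors:", subgraph.TensorsLength(),
      "operators:", subgraph.OperatorsLength())
```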
Well, TF is like that; installing the GPU version is more complicated because of the CUDA and cuDNN dependencies. If you want to run the unit tests, you need this, because they test whether the conversion results in functionally equivalent networks, and for that you have to run the networks on sample inputs.
But for the conversion itself, you don't need TF installed, it is enough what's in the repo, which contains .pb and .tflite schemas as you say.
So why are you running the activation tests? They are useful when you want to develop a new feature for the converter.
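For what it's worth, the idea behind those tests, paraphrased as a sketch (run_original and run_converted are hypothetical callables wrapping the two networks; this is not the repo's actual test code):

```python
import numpy as np

def outputs_match(run_original, run_converted, input_shape, tol=1e-5):
    # feed the same random input to both networks and compare the outputs
    x = np.random.rand(*input_shape).astype(np.float32)
    return np.allclose(run_original(x), run_converted(x), atol=tol)
```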
I'm running the only tests that were described in the README. If there are different dependency requirements for just doing conversions, then it would be helpful if those were described separately in the README, along with how to run the conversion tests separately.
The activation tests are the only test cases we have right now. You cannot really test conversion properly otherwise. You could test that some conversions run, but that would not really tell whether they are correct, although it might be useful to some degree.
I believe the README describes the dependency requirements for all tools quite clearly. It states what is required for all tools and what is additionally required if you want to run tests; if you don't want that, you obviously don't need those dependencies.
Hi! I have just read the conversation and I think it might be helpful to share a full tflite to quantized nnef conversion that I have just done.
As Viktor has mentioned, TensorFlow should not be needed for the conversion, but I have just realized that there is in fact an unwanted dependency right now. We will remove the dependency early next week.
pip install typing numpy flatbuffers six
pip install tensorflow # sorry it will be removed (no gpu needed for this)
git clone https://github.com/KhronosGroup/NNEF-Tools.git
cd NNEF-Tools/parser/python/
python setup.py install
cd ../..
wget download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_0.25_128_quant.tgz
tar xzvf mobilenet_v1_0.25_128_quant.tgz
./nnef_tools/convert.py --input-framework tensorflow-lite --output-framework nnef --input-model mobilenet_v1_0.25_128_quant.tflite
ls convert.out/model
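As an optional sanity check after the conversion, the output can be loaded back with the nnef parser package installed above (a sketch; assumes load_graph, as exposed by parser/python, accepts the output folder):

```python
import nnef

graph = nnef.load_graph('convert.out/model')   # parses the graph plus quant info and tensor data
print(graph.name, len(graph.operations), len(graph.tensors))
```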
It works for me on Ubuntu 16.04 LTS without a problem.
Kind regards, Tamás Danyluk
But for the conversion itself, you don't need TF installed
I found that you do need TF installed with the current code even when just doing conversions, due to this import:
requires tensorflow
File "/home/jay/nnef8/NNEF-Tools/nnef_tools/io/tensorflow/tf_py/tf_py_definitions.py", line 21, in <module>
    import tensorflow as tf
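Until the announced fix lands, one possible local workaround (just a sketch, not the maintainers' intended change, and whether it suffices depends on how the module uses tf further down) is to make that module-level import optional:

```python
# in tf_py_definitions.py, guard the import (illustrative workaround only)
try:
    import tensorflow as tf
except ImportError:
    tf = None   # only the tf_py (Python-source TF) paths actually need it
```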
Here are some more notes... I used the attached cmake file to download around 10 models, run the conversions, and then run them through the parse validator. They all passed.
I'm using an Anaconda Python 3 installation. I installed these:
conda install six typing
conda install -c conda-forge python-flatbuffers
conda install tensorflow
conda install -c conda-forge onnx
One thing that I couldn't resolve is that the cmake can't be run from an external folder. This is because the conversion scripts use some local module import paths. I couldn't find an install script for the converters, and Anaconda doesn't want you to use environment variables, so I ran cmake from the NNEF-Tools folder.
Another thing is that I didn't get the cmake fetch operations to work with the onnx url, so I downloaded its test case using wget. I searched and couldn't find a resolution for that... maybe just run wget as a command.
wget https://s3.amazonaws.com/download.onnx/models/opset_9/resnet50.tar.gz
The binary file support for quantized values, described in the spec, looks pretty good, and I see handling of quantization in the nnef tensorflow exporter.
https://github.com/KhronosGroup/NNEF-Tools/blob/master/converter/tensorflow/src/tf2nnef.py
However, the sample doesn't cover the case of quantized data. https://github.com/KhronosGroup/NNEF-Tools/blob/master/converter/tensorflow/src/sample_export.py
I'm wondering if there are any additional steps required to specify the export format for quantized data, or if there are any built-in limitations for that type of export, since I will need to be using this soon. Thanks.