simonmaurer opened 2 years ago
The CLI flag `use_xnnpack` of `lce_benchmark_model` essentially decides which `OpResolver` to use: when `use_xnnpack=false` it uses `BuiltinOpResolverWithoutDefaultDelegates`, otherwise it uses `BuiltinOpResolver`. The op resolver has a `GetDelegates()` method, and `BuiltinOpResolver` will choose the XNNPACK delegate when it is enabled via the Bazel build options, which enable XNNPACK by default.
So the simplest option would be to add a `use_xnnpack` bool argument to the Python interpreter and then, on this line, choose `BuiltinOpResolverWithoutDefaultDelegates` whenever `use_xnnpack == false`:

https://github.com/larq/compute-engine/blob/7db969ff13baf63f5c78ef32cb2eff5ac35e1947/larq_compute_engine/tflite/python/interpreter_wrapper_lite.cc#L40
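As a rough sketch (the helper name and factory shape here are illustrative, not the actual LCE wrapper code), the dispatch could look like this:

```cpp
#include <memory>

#include "tensorflow/lite/kernels/register.h"

// Illustrative sketch only: pick the op resolver based on a hypothetical
// use_xnnpack flag. BuiltinOpResolver applies the default delegates
// (including XNNPACK when it is compiled in), while the
// *WithoutDefaultDelegates variant registers the same builtin ops but
// returns no delegates from GetDelegates().
std::unique_ptr<tflite::OpResolver> MakeOpResolver(bool use_xnnpack) {
  if (use_xnnpack) {
    return std::make_unique<tflite::ops::builtin::BuiltinOpResolver>();
  }
  return std::make_unique<
      tflite::ops::builtin::BuiltinOpResolverWithoutDefaultDelegates>();
}
```

Both resolver classes derive from `MutableOpResolver`, so the rest of the wrapper can keep working against the `OpResolver` interface unchanged.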
To have more control over the options passed to XNNPACK (for example, on the TF master branch we can enable/disable int8 kernels through these options), the easiest way seems to be to follow the instructions at "Enable XNNPACK via low level delegate API". I haven't tried it, but I would think the best approach is to always use `BuiltinOpResolverWithoutDefaultDelegates` and then use that low-level delegate API to set the desired XNNPACK options whenever `use_xnnpack == true`.
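A sketch of that low-level route, assuming the TFLite headers from the pinned TF version (the int8 `flags` toggle only exists on newer TF branches, so treat that line as an assumption):

```cpp
#include "tensorflow/lite/delegates/xnnpack/xnnpack_delegate.h"
#include "tensorflow/lite/interpreter.h"

// Sketch: build the interpreter with BuiltinOpResolverWithoutDefaultDelegates
// as usual, then attach XNNPACK manually when use_xnnpack is true, keeping
// its options fully under our control.
void MaybeAttachXnnpack(tflite::Interpreter* interpreter, bool use_xnnpack,
                        int num_threads) {
  if (!use_xnnpack) return;
  TfLiteXNNPackDelegateOptions options = TfLiteXNNPackDelegateOptionsDefault();
  options.num_threads = num_threads;
  // On the TF master branch, int8 kernels can be toggled via the flags
  // field, e.g. (assumption: not available on older TF versions):
  // options.flags |= TFLITE_XNNPACK_DELEGATE_FLAG_QS8;
  TfLiteDelegate* delegate = TfLiteXNNPackDelegateCreate(&options);
  interpreter->ModifyGraphWithDelegate(delegate);
  // The delegate must outlive the interpreter; free it with
  // TfLiteXNNPackDelegateDelete() after the interpreter is destroyed.
}
```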
The LCE interpreter from `lce.testing.Interpreter` is a standalone class and exposes different properties of the quantized model (scale and zero-point, for example). The converter, on the other hand, is built upon the two methods `convert_keras_model`/`convert_saved_model`. Can you elaborate on your design decision? It boils down to these follow-up questions:
- `use_xnnpack=True` (as in the command-line parameters of `lce_benchmark_model`)
- `tf.lite.OpsSet.SELECT_TF_OPS` (as in regular TFLite)

Could you give me a hint on this?