lgeiger opened this issue 3 years ago
@abattery Thanks for taking a look! I am not sure why this issue was transferred to TF-MOT though, since it is not directly related to TF-MOT. The issue lies not in the Python code of this repo, but in the lack of fp16 support in the core fake quantisation op of TensorFlow (in fact, I am personally not even using TF-MOT for the quantisation aware training I was referring to above).
Hi @Xhark, do you think this should be handled by the MOT team or the TF core team?
System information
Describe the feature and the current behavior/state.
Currently `tf.quantization.fake_quant_*` ops do not support `float16` input. E.g. a call like the one sketched below would fail because there is no kernel implementation available for `float16` input on either GPU or CPU. Check out this notebook for a full reproduction.
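For illustration, a minimal sketch of the failure; the specific op variant and values here are assumptions, not the exact code from the notebook:

```python
import tensorflow as tf

# float16 activations, as produced under Keras mixed precision
x = tf.constant([0.1, 0.5, 0.9], dtype=tf.float16)

# Fails: no float16 kernel is available for this op on either CPU or GPU.
y = tf.quantization.fake_quant_with_min_max_args(x, min=0.0, max=1.0, num_bits=8)
```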
Will this change the current api? How?
This won't change the API.
Who will benefit with this feature?
The lack of `float16` support in `tf.quantization.fake_quant_*` ops prevents people doing quantisation aware training from using Keras mixed precision training. This means many performance optimisations are inaccessible for people requiring quantisation aware training. In our specific case it means that training with `tf.quantization.fake_quant_*` ops in the graph is twice as slow as without, due to the need to cast activations back to `float32`, as sketched below.
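For context, a sketch of that casting workaround; the function name and min/max values are illustrative, not taken from our actual training code:

```python
import tensorflow as tf

def fake_quant_fp16(x, min_val=0.0, max_val=1.0):
    """Workaround under mixed precision: cast float16 activations up to
    float32, run the fake quant op, and cast back down."""
    y = tf.quantization.fake_quant_with_min_max_args(
        tf.cast(x, tf.float32), min=min_val, max=max_val, num_bits=8)
    return tf.cast(y, x.dtype)

x = tf.random.uniform([4, 8], dtype=tf.float16)
y = fake_quant_fp16(x)  # works, but pays for two casts and a float32 kernel
```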
Any Other info.
I think it should be fairly straightforward to add `float16` support to the fake quant functor as it is a native Eigen implementation.
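To illustrate why, here is a rough Python reference of the fake quant computation (ignoring the zero-point nudging the real kernel performs), written only with ops that already have `float16` kernels; it is a sketch of the arithmetic, not TensorFlow's actual functor:

```python
import tensorflow as tf

def fake_quant_reference(x, min_val, max_val, num_bits=8):
    """Simplified fake quant: clip, quantise to num_bits levels, dequantise.
    Every op used here already supports float16 input."""
    quant_max = tf.cast(2 ** num_bits - 1, x.dtype)
    scale = (max_val - min_val) / quant_max
    clipped = tf.clip_by_value(x, min_val, max_val)
    return tf.round((clipped - min_val) / scale) * scale + min_val

x = tf.random.uniform([8], dtype=tf.float16)
y = fake_quant_reference(x, tf.constant(0.0, tf.float16),
                         tf.constant(1.0, tf.float16))
```

This suggests the computation itself has no inherent `float32` dependency; the missing piece is only the kernel registration for half precision.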