ysh329 commented 4 years ago

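This [int8 precision] issue is [unavoidable]. Even on the server side, the same int8 model can produce different results on CPU and GPU: this depends on the order of the multiply-accumulate operations, int8 truncation, rounding, and the underlying hardware.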
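As a rough illustration of the factors listed above, here is a minimal Python sketch. It is not taken from any real inference engine: the helper names (`saturate_int8`, `saturating_sum`, `round_half_away`, `round_half_even`, `truncate`) are hypothetical, and the assumption that a backend saturates after every accumulation step is only there to make the order effect visible.

```python
import math

def saturate_int8(v):
    """Clamp an integer to the int8 range [-128, 127]."""
    return max(-128, min(127, v))

def saturating_sum(values):
    """Accumulate with saturation after every step (assumed backend behaviour)."""
    acc = 0
    for v in values:
        acc = saturate_int8(acc + v)
    return acc

def round_half_away(x):
    """Round to nearest, ties away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def round_half_even(x):
    """Round to nearest, ties to even (Python's built-in round)."""
    return round(x)

def truncate(x):
    """Drop the fractional part (round toward zero)."""
    return int(x)

# 1) Same operands, different accumulation order, different result.
order_a = saturating_sum([100, 100, -100])  # 100 -> 127 (saturated) -> 27
order_b = saturating_sum([100, -100, 100])  # 100 -> 0 -> 100

# 2) A value just below 0.5 and a value exactly at 0.5 under three rounding rules.
rounding = {
    x: (round_half_away(x), round_half_even(x), truncate(x))
    for x in (0.4999, 0.5)
}

print(order_a, order_b)
print(rounding)
```

In this sketch 0.4999 maps to 0 under all three rules, while 0.5 maps to 1 under round-half-away-from-zero and to 0 under the other two, so a tiny difference in the value before rounding, or a different rounding rule, is enough to shift the output by one quantization step.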

TensorFlow Model Optimization Toolkit — Post-Training Integer Quantization https://medium.com/tensorflow/tensorflow-model-optimization-toolkit-post-training-integer-quantization-b4964a1ea9ba In this article, start reading from the part shown in the screenshot above, "How models work".

ysh329 commented 4 years ago

TensorFlow Lite 8-bit quantization specification | TensorFlow https://www.tensorflow.org/lite/performance/quantization_spec

TensorFlow Lite 8-bit quantization specification

Specification summary

We are providing a specification, and we can only provide some guarantees on behaviour if the spec is followed. We also understand different hardware may have preferences and restrictions that may cause slight deviations when implementing the spec that result in implementations that are not bit-exact. Whereas that may be acceptable in most cases (and we will provide a suite of tests that to the best of our knowledge include per-operation tolerances that we gathered from several models), the nature of machine learning (and deep learning in the most common case) makes it impossible to provide any hard guarantees.
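Since the spec talks about per-operation tolerances rather than bit-exact results, comparing outputs from two backends is usually done with a tolerance. The following is only a sketch of that idea: `max_abs_diff`, `within_tolerance`, the 1-step threshold, and the sample outputs are all made up for illustration and are not values from the TensorFlow test suite.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two int8 output lists."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0)

def within_tolerance(a, b, max_steps=1):
    """True if the outputs differ by at most `max_steps` quantization steps.

    `max_steps=1` is an assumed threshold, not one taken from the spec.
    """
    return max_abs_diff(a, b) <= max_steps

# Hypothetical quantized outputs of the same layer on two devices.
cpu_out = [12, -7, 0, 127]
gpu_out = [12, -6, 1, 127]

bit_exact = cpu_out == gpu_out
close_enough = within_tolerance(cpu_out, gpu_out)
print(bit_exact, close_enough)
```

With these sample values the outputs are not bit-exact but stay within one quantization step, which is the kind of deviation the 0.4999 vs. 0.5000 case produces.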

ysh329 commented 4 years ago

Regarding 0.4999 vs. 0.5000, I searched the TensorFlow repository on GitHub and Stack Overflow and found no useful related information:

Posts containing 'tensorflow quantization int8' - Stack Overflow https://stackoverflow.com/search?q=tensorflow+quantization+int8
Issues · tensorflow/tensorflow https://github.com/tensorflow/tensorflow/issues?utf8=%E2%9C%93&q=4999++5000

yuenshome / yuenshome.github.io

Inference error of int8 models on different hardware: 0.4999 and 0.5000 #96
