In TFLite, when I execute:
import tensorflow as tf

# saved_model_dir points at an existing SavedModel export
converter = tf.lite.TocoConverter.from_saved_model(saved_model_dir)
converter.post_training_quantize = True
tflite_quantized_model = converter.convert()
open("quantized_model.tflite", "wb").write(tflite_quantized_model)
I get an 8-bit model, which I can then move to mobile devices.
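For context, I can sanity-check the quantized model on desktop with the TFLite interpreter before deploying it (a minimal sketch; the dummy input is just a placeholder for whatever the SavedModel actually expects, and depending on TF version the interpreter may live under tf.contrib.lite instead):

import numpy as np
import tensorflow as tf

# Load the quantized model produced above.
interpreter = tf.lite.Interpreter(model_path="quantized_model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input matching the model's input shape and dtype.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()

output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)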
Are there any docs like this for Caffe2? I can't find any example or tutorial on quantizing a model and running it on mobile devices.
What's more, how do Caffe2 and PyTorch support 8-bit ops?
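To make the question concrete, this is the kind of one-call flow I'm after on the PyTorch/Caffe2 side. Recent PyTorch releases (1.3+) expose torch.quantization.quantize_dynamic, which looks close, but I don't know whether or how it connects to Caffe2's mobile runtime (the toy model below is just a stand-in for whatever I'd actually deploy):

import torch
import torch.nn as nn

# A toy model standing in for a real network.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

# Post-training dynamic quantization: weights are stored as int8,
# activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Inference runs with 8-bit weight kernels on CPU.
x = torch.randn(1, 128)
print(quantized(x).shape)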