shipeng-cv opened this issue 4 years ago
Can you share the test code for the quantized tflite model?
I have the same issue as @shipeng-cv, the difference being that I test the tflite model on the VOC2012 dataset. My test results are below:
with mobilenetv2_coco_voc_trainaug_8bit/frozen_inference_graph.pb, the mIoU is 70.4;
with mobilenetv2_coco_voc_trainaug_8bit/frozen_inference_graph.tflite, the mIoU declines to 57.9.
I suspect this is because the tflite model lacks the image preprocessing and postprocessing you mentioned in https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md. If image processing is the key, can you share the code showing how to process the image?
I also have another question about the mIoU results you published: are they measured on the pb model or the tflite model?
Hmm, this is a bit odd. Could you share the converted model and eval script? Ideally the converted tflite model should have results similar to its quantization-aware trained TF counterpart.
The converted model I used is from http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_train_aug_8bit_2019_04_26.tar.gz. I first run inference with the tflite model and save the predictions as images; code as below:
import os

import numpy as np
import tensorflow as tf
from PIL import Image

tflite_path = 'pretrained_models/deeplabv3_mnv2_pascal_train_aug_8bit/frozen_inference_graph.tflite'
test_dir = 'datasets/pascal_voc_seg/VOCdevkit/VOC2012/JPEGImages'
id_txt = 'datasets/pascal_voc_seg/VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt'
output = 'result/quanti/float_tflite'
if not os.path.exists(output):
    os.makedirs(output)

def create_pascal_label_colormap():
    """Creates a label colormap used in PASCAL VOC segmentation benchmark.

    Returns:
        A colormap for visualizing segmentation results.
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)
    for shift in reversed(list(range(8))):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3
    return colormap

colormap = create_pascal_label_colormap()

interpreter = tf.lite.Interpreter(tflite_path)
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)

all_jpgs = []
with open(id_txt, 'r') as f:
    all_lines = f.readlines()
    for line in all_lines:
        all_jpgs.append(line.strip('\n') + '.jpg')

for jpg_path in all_jpgs:
    print(jpg_path)
    img = Image.open(os.path.join(test_dir, jpg_path))
    w, h = img.size
    # NOTE: this resizes directly to 513x513 and ignores the aspect ratio
    # (resize_ratio / target_size are computed but not used for the resize).
    resize_ratio = 1.0 * 513 / max(w, h)
    target_size = (int(resize_ratio * w), int(resize_ratio * h))
    resize_img = img.convert('RGB').resize((513, 513), Image.ANTIALIAS)
    input_data = np.expand_dims(resize_img, axis=0)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    result = np.squeeze(output_data).astype('uint8')
    img_viz = colormap[result].astype('uint8')
    img_viz = Image.fromarray(img_viz).convert('RGB').resize((w, h), Image.ANTIALIAS)
    # Use nearest neighbour when resizing the label map so class ids are not blended.
    result = Image.fromarray(result).resize((w, h), Image.NEAREST)
    result.save(os.path.join(output, jpg_path[:-4] + '.png'), 'PNG')
    img_viz.save(os.path.join(output, jpg_path[:-4] + '_v.jpg'))
Then I compute the mIoU from the saved predictions; code as below:
import argparse
import os
from os.path import join

import numpy as np
from PIL import Image
from sklearn.metrics import confusion_matrix  # assuming sklearn's confusion_matrix

def per_class_iu(hist):
    return np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist))

def compute_mIoU(gt_dir, pred_dir):
    """Computes IoU given the predicted label images and the ground truth."""
    num_classes = 21  # number of classes, including the background class
    print('Num classes', num_classes)
    hist = np.zeros((num_classes, num_classes))
    label_path_list = os.listdir(gt_dir)
    img_path_list = os.listdir(pred_dir)
    imgs_list = list(set(label_path_list) & set(img_path_list))
    for img_name in imgs_list:
        pred = np.array(Image.open(join(pred_dir, img_name)))
        label = np.array(Image.open(join(gt_dir, img_name)))
        if len(label.flatten()) != len(pred.flatten()):
            print('Skipping %s' % img_name)
            continue
        hist += confusion_matrix(label.flatten(), pred.flatten(),
                                 labels=list(range(num_classes)))
    mIoUs = per_class_iu(hist)
    mIoUs_no_background = mIoUs[1:]  # drop the background class
    for ind_class in range(num_classes - 1):  # print the per-class IoU
        print('===>' + str(ind_class) + ':\t' + str(round(mIoUs_no_background[ind_class] * 100, 2)))
    # Mean IoU over all classes of the validation set, ignoring NaN values.
    print('===> mIoU: ' + str(round(np.nanmean(mIoUs) * 100, 2)))
    print('===> mIoU no background: ' + str(round(np.nanmean(mIoUs_no_background) * 100, 2)))
    return mIoUs, mIoUs_no_background

def main(args):
    compute_mIoU(args.gt_dir, args.pred_dir)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--gt_dir', default='datasets/pascal_voc_seg/VOCdevkit/VOC2012/SegmentationClassRaw/',
                        type=str, help='directory which stores gt images')
    parser.add_argument('--pred_dir', default='result/quanti/', type=str,
                        help='directory which stores pred images')
    args = parser.parse_args()
    main(args)
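For reference, I run it like this (assuming the script above is saved as compute_miou.py, which is just my local name; adjust the paths to your layout):
python compute_miou.py \
  --gt_dir=datasets/pascal_voc_seg/VOCdevkit/VOC2012/SegmentationClassRaw/ \
  --pred_dir=result/quanti/float_tflite/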
I am not sure whether the way I compute mIoU is right.
@YknZhu I replaced the resize operation with a pad operation, as in your 'input_processor.py', and now the frozen_inference_graph.pb result matches your result of 74.26 in https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md, but the tflite model result is still low at 64.99. Can you help me?
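For reference, this is roughly the resize-and-pad preprocessing I switched to. It is only a minimal sketch, not the exact input_processor.py code; crop_size=513 matches the exported model above, and padding with 127 (an approximation of the input mean) is my own choice:
import numpy as np
from PIL import Image

def resize_and_pad(image, crop_size=513, pad_value=127):
    """Resizes so the longer side equals crop_size, then pads to crop_size x crop_size."""
    w, h = image.size
    resize_ratio = 1.0 * crop_size / max(w, h)
    target_size = (int(resize_ratio * w), int(resize_ratio * h))
    resized = image.convert('RGB').resize(target_size, Image.ANTIALIAS)
    # Pad the right/bottom with a constant value so the input tensor has a fixed shape.
    padded = np.full((crop_size, crop_size, 3), pad_value, dtype=np.uint8)
    padded[:target_size[1], :target_size[0], :] = np.asarray(resized)
    return padded, target_size
After interpreter.invoke(), I crop the prediction back to target_size before resizing it to the original (w, h).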
Ahh yes, image preprocessing would definitely affect model performance. Note that running the frozen inference graph with fake quantization nodes won't produce exactly the same results as a quantized TFLite model (it only simulates the quantization process), so a 10% mIoU drop is still a bit surprising.
Another correction on the input ranges: DeepLab models actually take 127.5 / 127.5 as the mean / stddev for the input, but the tflite converter didn't support a float mean in quantization mode. Now this can be supported by using the following:
with tf.Graph().as_default() as graph:
  tf.import_graph_def(QUANTIZED_GRAPH_DEF, name='')
  sess = tf.Session()
  img = graph.get_tensor_by_name('MobilenetV2/MobilenetV2/input:0')
  out = graph.get_tensor_by_name('ArgMax:0')
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
  input_arrays = converter.get_input_arrays()
  converter.quantized_input_stats = {input_arrays[0]: (127.5, 127.5)}
  tflite_model = converter.convert()
This might further close the gaps between two inference modes.
Thanks for your reply @YknZhu. Following your suggestion, I converted from the QUANTIZED_GRAPH_DEF and re-tested, and got 74.01, which is very close to the pb model. I would like to know whether there is a way to directly convert the pb to tflite while keeping the tflite model's result close to the pb, because I want to deploy the model on a mobile device.
This is great to know! I would suggest taking the code block above and wrapping it into a python script for pb -> tflite conversion. Will update the quantize md file accordingly :)
Also, it would be great to add a consistency check at the end of the script (basically load an image, run both the pb and the tflite model, and expect their outputs to be close).
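A rough sketch of such a check (not the final script); the tensor names and the 513x513 input size come from the snippets above, and the file paths are placeholders:
import numpy as np
import tensorflow as tf
from PIL import Image

image = np.asarray(
    Image.open('test.jpg').convert('RGB').resize((513, 513)), dtype=np.uint8)
batch = np.expand_dims(image, axis=0)

# Run the frozen inference graph (with fake-quant nodes). The MobilenetV2 input
# tensor expects normalized floats, so apply the 127.5 mean / stddev here.
graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
  graph_def.ParseFromString(f.read())
with tf.Graph().as_default() as graph:
  tf.import_graph_def(graph_def, name='')
  with tf.Session(graph=graph) as sess:
    pb_out = sess.run(
        'ArgMax:0',
        feed_dict={'MobilenetV2/MobilenetV2/input:0':
                   (batch.astype(np.float32) - 127.5) / 127.5})

# Run the converted tflite model; quantized_input_stats already handles the
# normalization, so feed the raw uint8 batch.
interpreter = tf.lite.Interpreter(model_path='frozen_inference_graph.tflite')
interpreter.allocate_tensors()
interpreter.set_tensor(interpreter.get_input_details()[0]['index'], batch)
interpreter.invoke()
tflite_out = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])

# The two label maps should agree on the vast majority of pixels.
print('pixel agreement:', np.mean(np.squeeze(pb_out) == np.squeeze(tflite_out)))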
@YknZhu Thanks for your help; I resolved this problem. I have another question for you. Since the tflite converter doesn't support a float mean in quantization mode, if I use a mean/std value of 127 (or 128) during quantization-aware training and then run inference with the tflite model using mean/std = 127 (or 128), will this work? I think the mean/std value of 127.5 is used to normalize the input data to [-1, 1]; is there any other consideration? If so, will switching to 127 degrade the performance?
Sure, that would help. The reason to use 127.5 is indeed to normalize the input data to [-1, 1], but using 127/128 (potentially followed by clipping) would work too.
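For intuition, a quick sketch of the normalization implied by quantized_input_stats (mean, std): as I understand it, the converter maps a uint8 input x to (x - mean) / std, so 127.5/127.5 gives exactly [-1, 1], while 128/128 gives a slightly asymmetric range:
import numpy as np

x = np.array([0, 128, 255], dtype=np.float32)
print((x - 127.5) / 127.5)  # [-1.      0.0039  1.    ]
print((x - 128.0) / 128.0)  # [-1.      0.      0.9922]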
I have a similar issue when I use the pre-trained deeplab mobilenet v2 model on my own dataset. The model has excellent performance before quantization, but something strange happens when I quantize it.
The steps of training and quantization are as follows:
1. Fine-tune the deeplab mobilenet v2 model on my own dataset until I get a good model (no quantization);
2. Quantization-aware training with --quantize_delay_step=0;
3. Export the quantized model.
Step 1 yields the model model.ckpt-44639, and its evaluation result is:
eval/miou_1.0_class_1[0.985856116]
eval/miou_1.0_class_0[0.989553392]
eval/miou_1.0_overall[0.987704754]
So I fine-tuned the model.ckpt-44639 model with quantization-aware training in step 2, but the QaT training is not as good as step 1:
I0713 09:29:18.843101 140054631462720 evaluation.py:198] Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-0
...
I0713 09:29:20.165850 140054631462720 evaluation.py:450] Starting evaluation at 2020-07-13-09:29:20
eval/miou_1.0_class_0[0.989550352]
eval/miou_1.0_overall[0.987695694]
eval/miou_1.0_class_1[0.985841036]
INFO:tensorflow:Waiting for new checkpoint at /data/deeplab_mnv2_test2_quant/
I0713 09:34:18.944677 140054631462720 evaluation.py:189] Waiting for new checkpoint at /data/deeplab_mnv2_test2_quant/
INFO:tensorflow:Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-1807
I0713 09:48:54.085607 140054631462720 evaluation.py:198] Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-1807
...
I0713 09:48:55.554226 140054631462720 evaluation.py:450] Starting evaluation at 2020-07-13-09:48:55
eval/miou_1.0_class_0[0.854455292]
eval/miou_1.0_overall[0.812029839]
eval/miou_1.0_class_1[0.769604385]
...
INFO:tensorflow:Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-2000
I0713 09:53:54.195559 140054631462720 evaluation.py:198] Found new checkpoint at
I0713 09:53:55.498395 140054631462720 evaluation.py:450] Starting evaluation at 2020-07-13-09:53:55
eval/miou_1.0_class_0[0.863450944]
eval/miou_1.0_class_1[0.7861467]
eval/miou_1.0_overall[0.824798882]
As you can see, the mIoU of model.ckpt-0 is as high as that of model.ckpt-44639 (step 1), but it decreases as QaT training progresses. So I exported the quantized model from model.ckpt-0 in step 3.
But the segmentation result of the quantized model.ckpt-0 on my test video is very bad compared to the non-quantized model.ckpt-44639.
Could you give some suggestions please?
The training command lines are as follows:
python deeplab/train.py --logtostderr \
--save_summaries_images=true \
--min_scale_factor=0.9 --max_scale_factor=1.1 \
--scale_factor_step_size=0.1 \
--dataset="smartcar" \
--dataset_dir=/data/dataset/tfrecord/ \
--train_crop_size="481,641" \
--train_batch_size=16 \
--tf_initial_checkpoint=/data/pretrained/deeplabv3_mnv2_cityscapes_train/model.ckpt \
--train_logdir=/data/deeplab_mnv2_test2 \
--decoder_output_stride="" \
--aspp_convs_filters=256 \
--model_variant=mobilenet_v2 \
--initialize_last_layer=false \
--last_layers_contain_logits_only=false \
--train_split="train" \
--training_number_of_steps=50000
python deeplab/export_model.py \
--checkpoint_path=/data/deeplab_mnv2_test2/model.ckpt-44639 \
--export_path=/data/deeplab_mnv2_test2/frozen/frozen_inference_graph.pb \
--num_classes=2 \
--crop_size=481 \
--crop_size=641
python deeplab/train.py \
--save_summaries_images=true \
--min_scale_factor=0.9 --max_scale_factor=1.1 \
--scale_factor_step_size=0.1 \
--dataset="smartcar" \
--dataset_dir=/data/dataset/tfrecord/ \
--train_crop_size="481,641" \
--train_batch_size=16 \
--tf_initial_checkpoint=/data/deeplab_mnv2_test2/model.ckpt-44639 \
--train_logdir=/data/deeplab_mnv2_test2_quant/ \
--decoder_output_stride="" \
--aspp_convs_filters=256 \
--model_variant=mobilenet_v2 \
--initialize_last_layer=true \
--train_split="train" \
--training_number_of_steps=2000 \
--quantize_delay_step=0 \
--base_learning_rate=2e-5
python deeplab/export_model.py \
--checkpoint_path=/data/deeplab_mnv2_test2_quant/model.ckpt-0 \
--export_path=/data/deeplab_mnv2_test2_quant/frozen/frozen_inference_graph_quant.pb \
--num_classes=2 \
--crop_size=481 \
--crop_size=641 \
--quantize_delay_step=0
Hi @Aspirinkb, this issue seems unrelated to tflite / frozen graph differences. Could you open it as a new issue? Thanks!
BTW, from a quick look at the command line, it seems the QaT run uses --initialize_last_layer=true, so the final layers will be initialized to random values. This could explain why the exported model doesn't work well.
It is a bit unclear why the eval job reports that the step-0 checkpoint works best. Could you include the command line of your eval job (for QaT) in the new issue as well? Thanks!
@YknZhu I retrained the model using only the PASCAL train dataset, starting from a model pretrained on ImageNet, so the mIoU is about 60. I then ran controlled tests setting mean/std to 127.5 and 128 respectively. It seems that 128 may be more suitable for quantization, but it is weird that the tflite model corresponding to 128 drops a lot; below is my experiment.
@lsabrinax It is a bit interesting that 127.5 / 128 actually produce a large difference in the final performance. Could you check the mIoU curve during the quantization process? I would really expect only minor performance variation from using these two values in quantization-aware training.
BTW, https://github.com/tensorflow/models/pull/8955 updates the instructions for generating tflite models (it also includes a consistency check). Could you use that to produce the TFLite models? Thanks!
Prerequisites
Please answer the following questions for yourself before submitting an issue.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/research/deeplab
2. Describe the bug
The mIoU of the uint8 tflite model has a sharp decline.
3. Steps to reproduce
Using quantization-aware training on my own dataset, the IoU of the checkpoint is 87.98. After converting the checkpoint to a pb file, the IoU of the pb model is 87.84. After converting the pb to a float tflite model, the IoU is 87.89.
After converting the pb to a quantized uint8 tflite model, the IoU drops to 54.6.
The conversion command strictly follows https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md:
tflite_convert \
--graph_def_file=${OUTPUT_DIR}/frozen_inference_graph.pb \
--output_file=${OUTPUT_DIR}/frozen_inference_graph.tflite \
--output_format=TFLITE \
--input_shape=1,321,321,3 \
--input_arrays="MobilenetV2/MobilenetV2/input" \
--inference_type=QUANTIZED_UINT8 \
--inference_input_type=QUANTIZED_UINT8 \
--std_dev_values=128 \
--mean_values=128 \
--change_concat_input_ranges=true \
--output_arrays="ArgMax"
4. Expected behavior
I want to know how to quantize the mobilenetv2 deeplab model so that its IoU has only a small drop.
5. Additional context
None
6. System information