tensorflow / models

Models and examples built with TensorFlow

[Deeplab] The IoU of uint8 tflite has a sharp decline #8684

Open shipeng-cv opened 4 years ago

shipeng-cv commented 4 years ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/research/deeplab

2. Describe the bug

The mIoU of the uint8 tflite model shows a sharp decline.

3. Steps to reproduce

I used quantization-aware training on my own dataset. The IoU of the ckpt is 87.98. After converting the ckpt to a pb file, the IoU of the pb model is 87.84. After converting the pb to a float tflite model, the IoU is 87.89. After converting the pb to a quantized (uint8) tflite model, the IoU drops to 54.6. The conversion strictly follows https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md:

tflite_convert \
  --graph_def_file=${OUTPUT_DIR}/frozen_inference_graph.pb \
  --output_file=${OUTPUT_DIR}/frozen_inference_graph.tflite \
  --output_format=TFLITE \
  --input_shape=1,321,321,3 \
  --input_arrays="MobilenetV2/MobilenetV2/input" \
  --inference_type=QUANTIZED_UINT8 \
  --inference_input_type=QUANTIZED_UINT8 \
  --std_dev_values=128 \
  --mean_values=128 \
  --change_concat_input_ranges=true \
  --output_arrays="ArgMax"

4. Expected behavior

I would like to know how to quantize the MobileNetV2 DeepLab model so that its IoU drops only slightly.

5. Additional context

None

6. System information

shipeng-cv commented 4 years ago

Can anyone share the test code for the quantized tflite model?

lsabrinax commented 4 years ago

I have the same issue as @shipeng-cv; the difference is that I test the tflite model on the VOC2012 dataset. My test results are below:

on mobilenetv2_coco_voc_trainaug_8bit/frozen_inference_graph.pb, the mIoU is 70.4
on mobilenetv2_coco_voc_trainaug_8bit/frozen_inference_graph.tflite, the mIoU declines to 57.9

I suspect this is because the tflite model does not include the image preprocessing and postprocessing you mentioned in https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md. If image preprocessing is the key, can you share the code for how to preprocess the images?

I also have another question about the mIoU results you put in the image: were they measured on the pb model or the tflite model?

YknZhu commented 4 years ago

Hmm, this is a bit odd. Could you share the converted model and eval script? Ideally the converted tflite model should have similar results to its quantization-aware trained TF counterpart.

lsabrinax commented 4 years ago

The converted model I used is from http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_train_aug_8bit_2019_04_26.tar.gz. I first run inference with the tflite model and save the predictions as images; the code is below:

import os

import numpy as np
import tensorflow as tf
from PIL import Image

tflite_path = 'pretrained_models/deeplabv3_mnv2_pascal_train_aug_8bit/frozen_inference_graph.tflite'
test_dir = 'datasets/pascal_voc_seg/VOCdevkit/VOC2012/JPEGImages'
id_txt = 'datasets/pascal_voc_seg/VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt'
output = 'result/quanti/float_tflite'
if not os.path.exists(output):
    os.makedirs(output)

def create_pascal_label_colormap():
    """Creates a label colormap used in PASCAL VOC segmentation benchmark.

    Returns:
      A colormap for visualizing segmentation results.
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)

    for shift in reversed(list(range(8))):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3

    return colormap

colormap = create_pascal_label_colormap()
interpreter = tf.lite.Interpreter(tflite_path)
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)
all_jpgs = []
with open(id_txt, 'r') as f:
    all_lines = f.readlines()
    for line in all_lines:
        all_jpgs.append(line.strip('\n')+'.jpg')

for jpg_path in all_jpgs:
    print(jpg_path)
    img = Image.open(os.path.join(test_dir, jpg_path))
    w, h = img.size
    resize_img = img.convert('RGB').resize((513, 513), Image.ANTIALIAS)
    # Feed the resized image; the dtype must match input_details[0]['dtype'].
    input_data = np.expand_dims(np.asarray(resize_img, dtype=np.uint8), axis=0)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    result = np.squeeze(output_data).astype('uint8')
    img_viz = colormap[result].astype('uint8')
    img_viz = Image.fromarray(img_viz).convert('RGB').resize((w, h), Image.ANTIALIAS)
    # Use nearest-neighbor when resizing the label map so class ids are not blended.
    result = Image.fromarray(result).resize((w, h), Image.NEAREST)
    result.save(os.path.join(output, jpg_path[:-4] + '.png'), 'PNG')
    img_viz.save(os.path.join(output, jpg_path[:-4] + '_v.jpg'))

Then I compute mIoU from the saved predictions; the code is below:

import argparse
import os
from os.path import join

import numpy as np
from PIL import Image
from sklearn.metrics import confusion_matrix

def per_class_iu(hist):
    return np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist))

def compute_mIoU(gt_dir, pred_dir):
    """Compute IoU given the predicted label images and the ground truth."""
    num_classes = 21  # number of classes, including the background class
    print('Num classes', num_classes)

    hist = np.zeros((num_classes, num_classes))

    label_path_list = os.listdir(gt_dir)
    img_path_list = os.listdir(pred_dir)

    imgs_list = list(set(label_path_list) & set(img_path_list))
    for img_name in imgs_list:
        pred = np.array(Image.open(join(pred_dir, img_name)))
        label = np.array(Image.open(join(gt_dir, img_name)))

        if len(label.flatten()) != len(pred.flatten()):
            print('Skipping %s' % img_name)
            continue
        hist += confusion_matrix(label.flatten(), pred.flatten(),
                                 labels=list(range(num_classes)))
    mIoUs = per_class_iu(hist)
    mIoUs_no_background = mIoUs[1:]  # drop the background class

    for ind_class in range(num_classes - 1):  # print per-class IoU
        print('===>' + str(ind_class) + ':\t' + str(round(mIoUs_no_background[ind_class] * 100, 2)))

    # Mean IoU over all classes across the validation set, ignoring NaN values.
    print('===> mIoU: ' + str(round(np.nanmean(mIoUs) * 100, 2)))
    print('===> mIoU no background: ' + str(round(np.nanmean(mIoUs_no_background) * 100, 2)))

    return mIoUs, mIoUs_no_background

def main(args):
    compute_mIoU(args.gt_dir, args.pred_dir)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--gt_dir', default='datasets/pascal_voc_seg/VOCdevkit/VOC2012/SegmentationClassRaw/', type=str, help='directory which stores gt images')
    parser.add_argument('--pred_dir', default='result/quanti/', type=str, help='directory which stores pred images')
    args = parser.parse_args()
    main(args)

I am not sure whether the way I compute mIoU is correct.

lsabrinax commented 4 years ago

@YknZhu I replaced the resize operation with a pad operation like the one in your 'input_processor.py', and now the frozen_inference_graph.pb result matches the 74.26 reported in https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md, but the tflite model result is still low, at 64.99. Can you help me?
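
For reference, a minimal sketch of the resize-and-pad preprocessing I ended up using (the helper name and exact pad value are my own choices; it assumes a 513x513 crop and pads with a value close to the mean pixel):

import numpy as np
from PIL import Image

def resize_and_pad(img, crop_size=513, pad_value=128):
    """Resize so the longer side equals crop_size, then pad the bottom/right."""
    w, h = img.size
    ratio = 1.0 * crop_size / max(w, h)
    new_w, new_h = int(ratio * w), int(ratio * h)
    resized = np.asarray(img.convert('RGB').resize((new_w, new_h), Image.ANTIALIAS),
                         dtype=np.uint8)
    padded = np.full((crop_size, crop_size, 3), pad_value, dtype=np.uint8)
    padded[:new_h, :new_w, :] = resized
    # Keep (new_w, new_h) so the prediction can be cropped back before resizing
    # it to the original image size with nearest-neighbor interpolation.
    return padded, (new_w, new_h)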

YknZhu commented 4 years ago

Ah yes, image preprocessing would definitely affect model performance. Note that running the frozen inference graph with fake quantization nodes won't produce exactly the same results as a quantized TFLite model (it only simulates the quantization process). Still, a ~10% mIoU drop is a bit surprising.

Another correction on the input stats: DeepLab models actually take 127.5 / 127.5 as the mean / stddev for the input, but tflite_convert didn't support a float mean in quantized mode. This can now be handled by using the following:

with tf.Graph().as_default() as graph:
  tf.import_graph_def(QUANTIZED_GRAPH_DEF, name='')
  sess = tf.Session()

  img = graph.get_tensor_by_name('MobilenetV2/MobilenetV2/input:0')
  out = graph.get_tensor_by_name('ArgMax:0')
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
  input_arrays = converter.get_input_arrays()
  converter.quantized_input_stats = {input_arrays[0] : (127.5, 127.5)} 
  tflite_model = converter.convert()

This should further close the gap between the two inference modes.
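
For clarity, QUANTIZED_GRAPH_DEF above is the GraphDef parsed from the frozen graph that was exported with fake-quant nodes; a rough sketch of loading it and saving the converted model (paths are placeholders):

import tensorflow as tf

# Parse the frozen graph (with fake quantization nodes) into a GraphDef.
QUANTIZED_GRAPH_DEF = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
  QUANTIZED_GRAPH_DEF.ParseFromString(f.read())

# After convert() in the snippet above, write the result to disk, e.g.:
# with tf.gfile.GFile('frozen_inference_graph.tflite', 'wb') as f:
#   f.write(tflite_model)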

lsabrinax commented 4 years ago

Thanks for your reply @YknZhu. Following your suggestion, I converted from the QUANTIZED_GRAPH_DEF and re-tested; I got 74.01, which is very close to the pb result. I would like to know whether there is a way to convert the pb directly to tflite while keeping the tflite results close to the pb, because I want to deploy the model on a mobile device.

YknZhu commented 4 years ago

This is great to know! I would suggest using the code block above and wrapping it into a python script for pb -> tflite conversion. Will update the quantize.md file accordingly :)

Also, it would be great to add a consistency check at the end of the script (basically load an image, run both the pb and the tflite model, and check that their outputs are close).
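
A rough sketch of such a check (TF1.x API; the image path is a placeholder, and it assumes the MobilenetV2/MobilenetV2/input tensor expects values normalized to [-1, 1], matching the (127.5, 127.5) stats above; the same simply-resized input is fed to both models since we only compare them against each other):

import numpy as np
import tensorflow as tf
from PIL import Image

img = np.asarray(
    Image.open('test.jpg').convert('RGB').resize((513, 513)), dtype=np.uint8)

# 1) Frozen graph with fake-quant nodes: feed normalized float input.
graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
  graph_def.ParseFromString(f.read())
with tf.Graph().as_default():
  tf.import_graph_def(graph_def, name='')
  with tf.Session() as sess:
    float_input = (img.astype(np.float32) - 127.5) / 127.5
    pb_out = sess.run(
        'ArgMax:0',
        feed_dict={'MobilenetV2/MobilenetV2/input:0': np.expand_dims(float_input, 0)})

# 2) Quantized tflite model: feed the raw uint8 image.
interpreter = tf.lite.Interpreter(model_path='frozen_inference_graph.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]['index'], np.expand_dims(img, 0))
interpreter.invoke()
tflite_out = interpreter.get_tensor(output_details[0]['index'])

# The two label maps should agree on the vast majority of pixels.
print('pixel agreement:', np.mean(np.squeeze(pb_out) == np.squeeze(tflite_out)))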

lsabrinax commented 4 years ago

@YknZhu Thanks for your help; I have resolved this problem. I have another question for you. Since the tflite converter didn't support a float mean in quantized mode, if I use a mean/std of 127 (or 128) during quantization-aware training and then run tflite inference with the same mean/std of 127 (or 128), will that work? I think the mean/std value of 127.5 is used to normalize the input data to [-1, 1]; is there any other consideration? If so, will switching to 127 degrade the performance?

YknZhu commented 4 years ago

Sure, that would help. The reason to use 127.5 is indeed to normalize the input data to within [-1, 1], but using 127/128 (potentially followed by clipping) would work too.
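
In other words, the only thing that changes is the input normalization; a minimal sketch (the helper is just for illustration):

import numpy as np

def normalize(img_uint8, mean=127.5, std=127.5):
  # With mean = std = 127.5 this maps uint8 pixels exactly onto [-1, 1];
  # with 127 or 128 the range is slightly off, so an optional clip keeps it in [-1, 1].
  x = (np.asarray(img_uint8, dtype=np.float32) - mean) / std
  return np.clip(x, -1.0, 1.0)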

Aspirinkb commented 4 years ago

I have a similar issue when I use the pre-trained DeepLab MobileNet v2 model on my own dataset. The model performs very well before quantization, but something strange happens when I quantize it.

The steps of training and quantization are as follows:

  1. Fine-tune the pre-trained DeepLab MobileNet v2 model on my own dataset until I get a good model (no quantization);
  2. Quantization-aware train the fine-tuned model from step 1, with --quantize_delay_step=0;
  3. Export (freeze) the trained model from step 2;
  4. Convert the frozen model to a tflite model.

Step 1 yields the model model.ckpt-44639, and its evaluation result is:

eval/miou_1.0_class_1[0.985856116]
eval/miou_1.0_class_0[0.989553392]
eval/miou_1.0_overall[0.987704754]

So in step 2 I fine-tune the model.ckpt-44639 model with quantization-aware training, but the QAT results are not as good as step 1:

I0713 09:29:18.843101 140054631462720 evaluation.py:198] Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-0
...
I0713 09:29:20.165850 140054631462720 evaluation.py:450] Starting evaluation at 2020-07-13-09:29:20
eval/miou_1.0_class_0[0.989550352]
eval/miou_1.0_overall[0.987695694]
eval/miou_1.0_class_1[0.985841036]
INFO:tensorflow:Waiting for new checkpoint at /data/deeplab_mnv2_test2_quant/
I0713 09:34:18.944677 140054631462720 evaluation.py:189] Waiting for new checkpoint at /data/deeplab_mnv2_test2_quant/
INFO:tensorflow:Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-1807
I0713 09:48:54.085607 140054631462720 evaluation.py:198] Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-1807
...
I0713 09:48:55.554226 140054631462720 evaluation.py:450] Starting evaluation at 2020-07-13-09:48:55
eval/miou_1.0_class_0[0.854455292]
eval/miou_1.0_overall[0.812029839]
eval/miou_1.0_class_1[0.769604385]
...
INFO:tensorflow:Found new checkpoint at /data/deeplab_mnv2_test2_quant/model.ckpt-2000
I0713 09:53:54.195559 140054631462720 evaluation.py:198] Found new checkpoint at 
I0713 09:53:55.498395 140054631462720 evaluation.py:450] Starting evaluation at 2020-07-13-09:53:55
eval/miou_1.0_class_0[0.863450944]
eval/miou_1.0_class_1[0.7861467]
eval/miou_1.0_overall[0.824798882]

As you can see, the mIoU of model.ckpt-0 is as high as that of model.ckpt-44639 (step 1), but it decreases as QAT training proceeds. So in step 3 I export the quantized model model.ckpt-0.

But the segmentation results of the quantized model model.ckpt-0 on my test video are very bad compared to the non-quantized model model.ckpt-44639.

Could you give me some suggestions, please?

Training command line as follows:

YknZhu commented 4 years ago

Hi @Aspirinkb, this issue seems unrelated to tflite / frozen graph differences. Could you open it as a new issue? Thanks!

BTW, from a quick look at the command line, it seems the QAT run uses --initialize_last_layer=true; then the final layers will be initialized to random values. This could explain why the exported model doesn't work well.

It is a bit unclear why the eval job shows the step-0 checkpoint working best. Could you include the command line of your eval job (for QAT) in the new issue as well? Thanks!

lsabrinax commented 4 years ago

@YknZhu I retrained the model using only the PASCAL train dataset, starting from the model pretrained on ImageNet, so the mIoU is about 60. I ran controlled tests setting the mean/std to 127.5 and 128 respectively. It seems that 128 may be more suitable for quantization, but it is weird that the tflite model corresponding to 128 drops a lot. My experiment results are in the attached image.

YknZhu commented 4 years ago

@lsabrinax It is a bit interesting that 127.5 / 128 actually produce such a large difference in the final performance. Could you check the mIoU curve during the quantization process? I would expect only minor performance variation between these two values in quantization-aware training.

BTW, https://github.com/tensorflow/models/pull/8955 updates the instructions for generating tflite models (it also includes a consistency check). Could you use that to produce TFLite models? Thanks!