mlcommons / mobile_app_open

Mobile App Open
https://mlcommons.org/en/groups/inference-mobile/

Preparing a super-resolution dataset #558

Closed freedomtan closed 1 year ago

freedomtan commented 2 years ago

We are planning to add an image super-resolution benchmark, so we need a new dataset (in terms of our code, see the existing datasets here). Previously, SNU folks submitted a PR to use the ImagePairs dataset. Although we decided not to go with ImagePairs, that PR could serve as a starting point.

  1. The 25-picture dataset: psnr.tar
  2. Resolution of the dataset: 960x540 (scaled down) -> 1920x1080 (ground truth)
  3. The models to test with: edsr_zoo.tar

These models are subject to change, but it's OK to start with the current ones.
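
The benchmark's accuracy metric is PSNR between the upscaled output and the ground truth. As a minimal illustration of how that metric is computed with `tf.image.psnr` (the file names here are hypothetical; the real harness iterates over the whole dataset):

```python
import tensorflow as tf

# Hypothetical file names, for illustration only: a model output
# upscaled to 1920x1080 and the matching ground-truth image.
pred = tf.expand_dims(tf.image.decode_jpeg(tf.io.read_file('upscaled.jpg')), 0)
gt = tf.expand_dims(tf.image.decode_jpeg(tf.io.read_file('ground_truth.jpg')), 0)

# PSNR in dB for 8-bit images (max pixel value 255)
print(tf.image.psnr(gt, pred, max_val=255.0))
```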

freedomtan commented 2 years ago

Tested on an x86 Ubuntu 22.04 machine, a Pixel 6, and MTK devices:

* x86 Ubuntu 22.04
```shell
No warnings encountered during test.

No errors encountered during test.
2022-10-28 12:51:48.103809: I flutter/cpp/binary/main.cc:373] Accuracy: 33.44 dB
```


* pixel 6
```shell
$ adb -t 1 shell /data/local/tmp/sr/main_sr external snusr --mode=AccuracyOnly --output_dir=/data/local/tmp/sr_output --model_file=/data/local/tmp/edsr/tflite/pl_f32b5.tflite --images_directory=/data/local/tmp/sr/dataset/LR_jpg_bicubic  --ground_truth_directory=/data/local/tmp/sr/dataset/HR_jpg --lib_path=/data/local/tmp/sr/libtflitebackend.so 
can't determine number of CPU cores: assuming 4
native : cpp/binary/main.cc:133 Using External backend
native : cpp/backends/external.cc:194 Using default allocator
native : cpp/backend_tflite/tflite_c.cc:187 TFLite backend matches hardware
native : cpp/backends/external.cc:194 Using default allocator
INFO: Initialized TensorFlow Lite runtime.
native : cpp/binary/main.cc:316 Using SNU SR dataset

No warnings encountered during test.

No errors encountered during test.
native : cpp/binary/main.cc:373 Accuracy: 33.45 dB
```
freedomtan commented 2 years ago

The script I used to convert the fp32 tflite models:

```shell
for a in pl_f28b5 pl_f28b6 pl_f28b7 pl_f32b5 pl_f32b6 pl_f32b7 pl_f64b3 pl_f64b5 pl_f64b7 pl_f64b16
do
    python convert_edsr_with_fixed_sizes.py ${a}/ckpt/checkpoint tflite/${a}.tflite
done
```

The convert_edsr_with_fixed_sizes.py:

```python
import sys
import tensorflow as tf

# Load the saved_model and fix the input shape to 540x960x3
# (the serving signature's spatial dimensions are otherwise dynamic).
model = tf.saved_model.load(sys.argv[1])
concrete_func = model.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
concrete_func.inputs[0].set_shape([None, 540, 960, 3])

# Convert the fixed-shape concrete function to tflite.
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
tflite_model = converter.convert()

with open(sys.argv[2], 'wb') as f:
  f.write(tflite_model)
```
freedomtan commented 2 years ago

For preparing the jpeg images:

  1. Three of the images are rotated, i.e., 1080x1920 instead of 1920x1080:
    /tmp/dataset/HR$ file *.png
    03a53ed6ab408b9f.png:         PNG image data, 1080 x 1920, 8-bit/color RGB, non-interlaced
    04113e7d2f21171e.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    0a81990b49b6d7b6.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15042736837_7c559543c9_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15226222451_75d515f540_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15228944302_32a039bac3_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15228957152_2d1913b916_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15228958922_fcb304be22_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15638540417_3f5e8c4511_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15821842151_c5b2b43730_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    15825261882_74124816d1_o.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    1b5c620731c9bc7e.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    21f45f37199b845d.png:         PNG image data, 1080 x 1920, 8-bit/color RGB, non-interlaced
    454a2e427283ae9c.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    4e50705a3aecb1be.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    4f2354a774539f6f.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    50004b537dea9680.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    51441d385bf0f35d.png:         PNG image data, 1080 x 1920, 8-bit/color RGB, non-interlaced
    692b7cd4cca2ee3c.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    94d3972541dd8a9b.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    b55ff3bd0805c9de.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    bf8aaa19b3a17e0d.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    dc979cc4b88ae026.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    e23576bde0240d2c.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
    eab65173c670ac3d.png:         PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced

We can easily rotate them with some tools, e.g., with ImageMagick:

    convert 03a53ed6ab408b9f.png -rotate -90 03a53ed6ab408b9f.png
    convert 21f45f37199b845d.png -rotate -90 21f45f37199b845d.png
    convert 51441d385bf0f35d.png -rotate -90 51441d385bf0f35d.png

  2. Then we can convert them to jpeg files:

    for a in *.png; do convert ${a} -quality 100 ../HR_jpg/`basename ${a} .png`".jpg"; done

  3. And convert them to 960x540 lower-resolution jpeg files (a quick sanity check of the resulting pairs is sketched below):

    for a in *.png; do convert ${a} -quality 100 -filter Catrom -resize 960x540 ../LR_jpg/`basename ${a} .png`".jpg"; done
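
The sanity check of the resulting pairs: a sketch assuming Pillow is installed and the HR_jpg/LR_jpg layout above, not part of the benchmark code.

```python
import os
from PIL import Image

# Every LR jpg should be exactly half the HR resolution in each dimension.
for name in sorted(os.listdir('HR_jpg')):
    hr = Image.open(os.path.join('HR_jpg', name))
    lr = Image.open(os.path.join('LR_jpg', name))
    assert hr.size == (1920, 1080), f'{name}: HR is {hr.size}'
    assert lr.size == (960, 540), f'{name}: LR is {lr.size}'
print('all 25 pairs look good')
```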
freedomtan commented 2 years ago

With the models and images mentioned above and the dataset PR (#574):

| model | estimated operations (G OPs) | PSNR (dB) |
| --- | --- | --- |
| pl_f28b5 | 84.66 | 33.38 |
| pl_f28b6 | 99.33 | 33.43 |
| pl_f28b7 | 114.01 | 33.46 |
| pl_f32b5 | 109.89 | 33.44 |
| pl_f32b6 | 129.05 | 33.49 |
| pl_f32b7 | 148.21 | 33.52 |
| pl_f64b3 | 276.91 | 33.50 |
| pl_f64b5 | 429.99 | 33.64 |
| pl_f64b7 | 583.07 | 33.73 |
| pl_f64b16 | 1271.94 | 33.85 |
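
As a sanity check on the estimated-operations column: assuming the pl_f{F}b{B} names encode an EDSR-style network with F filters and B residual blocks (two 3x3 convolutions each, plus a head convolution, one convolution after the residual sum, and a tail convolution feeding an x2 pixel shuffle — these are my assumptions about the architecture, not something stated in this issue), a rough count lands close to the table:

```python
def edsr_gops(f, b, h=540, w=960):
    """Back-of-the-envelope giga-ops for an assumed EDSR-style model."""
    def conv_ops(cin, cout):
        # 3x3 convolution over an h x w feature map, 2 ops per multiply-add
        return 2 * 9 * cin * cout * h * w
    ops = conv_ops(3, f)           # head convolution, RGB -> f channels
    ops += 2 * b * conv_ops(f, f)  # b residual blocks, 2 convolutions each
    ops += conv_ops(f, f)          # convolution after the residual sum
    ops += conv_ops(f, 3 * 2**2)   # tail convolution for the x2 pixel shuffle
    return ops / 1e9

# e.g. pl_f32b5 -> ~109.6 G OPs vs. 109.89 in the table
for f, b in [(28, 5), (32, 5), (64, 16)]:
    print(f'pl_f{f}b{b}: ~{edsr_gops(f, b):.1f} G OPs')
```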
freedomtan commented 1 year ago

Converting models to Core ML .mlmodel

Conversion

Converting the saved_model models to Core ML .mlmodel files is quite straightforward, with the following:

```shell
for a in pl_f28b5 pl_f28b6 pl_f28b7 pl_f32b5 pl_f32b6 pl_f32b7 pl_f64b3 pl_f64b5 pl_f64b7 pl_f64b16
do
    python convert_edsr_mlmodel.py ${a}/ckpt/checkpoint mlmodel/${a}.mlmodel
done
```

We get the .mlmodel files in the mlmodel directory. The convert_edsr_mlmodel.py is:

```python
import sys
import tensorflow as tf
import coremltools as ct
from coremltools.proto.FeatureTypes_pb2 import ArrayFeatureType

# Convert the saved_model with a fixed 1x540x960x3 input.
inputs = [ct.TensorType(shape=(1, 540, 960, 3))]
mlmodel = ct.convert(sys.argv[1], inputs=inputs, source="tensorflow")

# Patch the output description: declare a FLOAT32 multi-array with
# the fixed 1x1080x1920x3 output shape.
spec = mlmodel.get_spec()
spec.description.output[0].type.multiArrayType.dataType = ArrayFeatureType.FLOAT32
for dim in (1, 1080, 1920, 3):
    spec.description.output[0].type.multiArrayType.shape.append(dim)

model = ct.models.MLModel(spec)
model.save(sys.argv[2])
```

Verifying converted mlmodel files

On Mac machines, we can verify that the converted models work by comparing the output tensors from running inference with the TF saved_model and the Core ML .mlmodel in Python:

```python
import tensorflow as tf
import coremltools as ct

# load the saved_model
saved_model = tf.saved_model.load('pl_f32b5/ckpt/checkpoint')
# load the mlmodel
mlmodel = ct.models.MLModel('mlmodel/pl_f32b5.mlmodel')

# load a 540x960x3 jpg file, cast it to tf.float32, expand it to 4-d
image = tf.image.decode_image(tf.io.read_file('dataset/LR_jpg/03a53ed6ab408b9f.jpg'))
img = tf.expand_dims(tf.cast(image, tf.float32), 0)

# inference with the saved_model, then round the results to tf.uint8
saved_model_results = tf.cast(tf.round(saved_model(img)), tf.uint8)

# inference with the Core ML mlmodel
data = {'input_1': img.numpy()}
mlmodel_results = tf.cast(tf.round(mlmodel.predict(data)['Identity']), tf.uint8)

# element-wise difference between the two backends' outputs
print(tf.subtract(mlmodel_results, saved_model_results))
```

The results seem good for f32b5 (a 255 in the uint8 difference tensor is wraparound for -1, i.e., an off-by-one difference):

```shell
$ python verifying_coreml_mlmodel.py
tf.Tensor(
[[[[  0   0   0]
   [  0   0   0]
   [  0   0   0]
   ...
   [  0   0   0]
   [  0   0   0]
   [  0   0   0]]

  [[  0   0   0]
   [  0   0   0]
   [  0   0   0]
   ...
   [  0   0   1]
   [  0   1   0]
   [  0   0   0]]

  [[  0   0   0]
   [  0   0   0]
   [  0   0   0]
   ...
   [  0   0   0]
   [  0   0   0]
   [  0   0   0]]

  ...

  [[  0   0   0]
   [  0   0   0]
   [  0   0   0]
   ...
   [  0   0   0]
   [255   0   0]
   [  0   0   0]]

  [[  0   0   0]
   [  0   0   0]
   [  0   0   0]
   ...
   [  0   0   0]
   [  0   0   0]
   [  1   0   0]]

  [[  0   0   0]
   [  0   0   0]
   [  0   0   0]
   ...
   [  0   0   0]
   [  0   0   0]
   [  0   0   0]]]], shape=(1, 1080, 1920, 3), dtype=uint8)
```

Running with //flutter/cpp/binary:main

Unfortunately, we didn't get the expected results (i.e., the same as or very close to the tflite backend):

```shell
$ ./bazel-bin/flutter/cpp/binary/main external snusr --mode=AccuracyOnly --output_dir=/tmp/sr_output --model_file=/tmp/mlmodel/pl_f32b5.mlmodel  --images_directory=/tmp/dataset/LR_jpg --ground_truth_directory=/tmp/dataset/HR_jpg   --lib_path=bazel-bin/mobile_back_apple/cpp/backend_coreml/libcoremlbackend.so
2022-10-31 14:53:07.195356: I flutter/cpp/binary/main.cc:133] Using External backend
2022-10-31 14:53:07.198615: I flutter/cpp/backends/external.cc:194] Using default allocator
2022-10-31 14:53:07.200442: I flutter/cpp/backends/external.cc:194] Using default allocator
2022-10-31 14:53:09.141 main[91491:3413089] inputNames: (
    "input_1"
)
2022-10-31 14:53:09.141 main[91491:3413089] outputNames: (
    Identity
)
2022-10-31 14:53:09.141 main[91491:3413089] batchSize: 0
2022-10-31 14:53:09.141 main[91491:3413089] [CoreMLExecutor init]
2022-10-31 14:53:09.141547: I flutter/cpp/binary/main.cc:316] Using SNU SR dataset

No warnings encountered during test.

No errors encountered during test.
2022-10-31 14:53:13.368060: I flutter/cpp/binary/main.cc:373] Accuracy: inf dB
```

Something is wrong in the Core ML backend, @anhappdev

freedomtan commented 1 year ago

Notes on the discussion with @Mostelk during the engineering meeting of the week of Nov 1st.


I saw a message like

    For model outputs containing unsupported operations which cannot be quantized, the `inference_output_type` attribute will default to the original type.

when trying to convert the QAT checkpoint without using PTQ (post-training quantization).

It seems the QAT converter could not handle the range-clipping part well.
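
For context, "using PTQ" would mean the standard TF2 post-training quantization flow, roughly like the sketch below (a generic example, not the script used in this issue; the random representative dataset is a stand-in for real LR calibration images):

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Stand-in calibration data; a real run would yield LR images.
    for _ in range(100):
        yield [np.random.uniform(0, 255, (1, 540, 960, 3)).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model('pl_f32b5/ckpt_qat/checkpoint')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```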

freedomtan commented 1 year ago

It turns out it's possible to convert QAT saved_model checkpoints to tflite with the TensorFlow 1.x converter:

```python
import sys
import tensorflow as tf

converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(
    sys.argv[1], input_shapes={"serving_default_input_1": (1, 540, 960, 3)})
converter.inference_type = tf.int8
# (mean, std dev) stats for the input tensor
converter.quantized_input_stats = {"serving_default_input_1": (0, 255)}
# default (min, max) range for tensors without recorded ranges
converter.default_ranges_stats = (0, 255)
tflite_model = converter.convert()

with open(sys.argv[2], 'wb') as f:
  f.write(tflite_model)
```

And we can compare the quantized int8 tflite model against the saved_model with the following script:

```python
import tensorflow as tf
import coremltools as ct
import numpy as np

# load saved_model
saved_model = tf.saved_model.load('pl_f32b5/ckpt_qat/checkpoint')

# load a 540x960x3 jpg file, cast it to tf.float32, expand it to 4-d
image = tf.image.decode_image(tf.io.read_file('dataset/LR_jpg/03a53ed6ab408b9f.jpg'))
img = tf.expand_dims(tf.cast(image, tf.float32), 0)

# inference with saved_model and then round the results to tf.uint8
saved_model_results = tf.cast(tf.round(saved_model(img)), tf.uint8)

TFLITE_MODEL = 'tflite/pl_f32b5_fint8.tflite'

# Load TFLite model
interpreter = tf.lite.Interpreter(TFLITE_MODEL)

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Allocate tensors
interpreter.allocate_tensors()

# get np array
image_np = image.numpy()

# If the expected input type is int8 (quantized model), rescale data
input_type = input_details[0]['dtype']
if input_type == np.int8:
    input_scale, input_zero_point = input_details[0]['quantization']
    print("(input_scale, input_zero_point): ", input_scale, input_zero_point)
    image_np = (image_np / input_scale) + input_zero_point
    image_np = np.around(image_np)
    image_np = np.int8(image_np)

# expand dims into 4-d
image_np = np.expand_dims(image_np, axis=0)

# set image_np as input tensor
interpreter.set_tensor(input_details[0]['index'], image_np)

# Run inference
interpreter.invoke()

output = interpreter.get_tensor(output_details[0]['index'])

# If the output type is int8 (quantized model), rescale data
output_type = output_details[0]['dtype']
if output_type == np.int8:
    output_scale, output_zero_point = output_details[0]['quantization']
    print("(output_scale, output_zero_point): ", output_scale, output_zero_point)
    output = output_scale * (output.astype(np.float32) - output_zero_point)

print(np.uint8(output) - saved_model_results)

# calculate psnr
gt_image = tf.expand_dims(tf.image.decode_image(tf.io.read_file('dataset/HR_jpg/03a53ed6ab408b9f.jpg')), 0)

print("psnr (saved_model):", tf.image.psnr(gt_image, saved_model_results, 255.0))
print("psnr (quant tflite):", tf.image.psnr(gt_image, np.uint8(output), 255.0))
(input_scale, input_zero_point):  1.0 -128
(output_scale, output_zero_point):  1.0 -128
tf.Tensor(
[[[[  0   0   0]
   [  0   0   1]
   [  0 255   1]
   ...
   [  0   0   0]
   [  1   1 255]
   [  0   1   0]]

  [[255   0   0]
   [  0   0   1]
   [  0   0   0]
   ...
   [  0 255 255]
   [  2   0   0]
   [  0   0   0]]

  [[  0   0   2]
   [  0   0   0]
   [  0   0   0]
   ...
   [  1   0   0]
   [  0   0   1]
   [  1   0   2]]

  ...

  [[  0 254   0]
   [255 255 255]
   [254 255   0]
   ...
   [  0   1   1]
   [  0   0 255]
   [255   0 254]]

  [[  0   0   2]
   [  0 255   0]
   [254 255   0]
   ...
   [  0   0   0]
   [  0   1   0]
   [  0   0   0]]

  [[  0   0   0]
   [  0 255   0]
   [255 255   0]
   ...
   [  0 255   2]
   [  0   0   2]
   [255   0   0]]]], shape=(1, 1080, 1920, 3), dtype=uint8)

psnr (saved_model): tf.Tensor([31.469837], shape=(1,), dtype=float32)
psnr (quant tflite): tf.Tensor([31.468544], shape=(1,), dtype=float32)
```

Yes, the difference between the QAT saved_model and the quantized int8 tflite model is larger than for the Core ML model (which runs in floating point), but the PSNR of the quantized tflite model looks good enough.

freedomtan commented 1 year ago

Does the command-line //flutter/cpp/binary:main work for quantized int8 tflite models converted from QAT checkpoints? Yes, after changing two lines of code (https://github.com/mlcommons/mobile_app_open/pull/574/commits/888b0bc585762c7a7e9f00d538fea491a23c8d3b):

| model | estimated operations (G OPs) | fp32 PSNR (dB) | quant int8 PSNR (dB) |
| --- | --- | --- | --- |
| pl_f28b5 | 84.66 | 33.38 | 33.25 |
| pl_f28b6 | 99.33 | 33.43 | 33.33 |
| pl_f28b7 | 114.01 | 33.46 | 33.32 |
| pl_f32b5 | 109.89 | 33.44 | 33.31 |
| pl_f32b6 | 129.05 | 33.49 | 33.36 |
| pl_f32b7 | 148.21 | 33.52 | 33.41 |
| pl_f64b3 | 276.91 | 33.50 | 33.43 |
| pl_f64b5 | 429.99 | 33.64 | 33.55 |
| pl_f64b7 | 583.07 | 33.73 | 33.62 |
| pl_f64b16 | 1271.94 | 33.85 | 33.74 |
freedomtan commented 1 year ago

With https://github.com/mlcommons/mobile_app_open/pull/574/commits/46028cb3c34203adf9d45d8bbc37933f9f136b72, I made running //flutter/cpp/binary:main with libcoremlbackend.so work as expected. The PSNR numbers are either the same as or better than the tflite interpreter x86_64 numbers.

| model | estimated operations (G OPs) | fp32 PSNR (dB) | CoreML PSNR on M1 (dB) | quant int8 PSNR (dB) |
| --- | --- | --- | --- | --- |
| pl_f28b5 | 84.66 | 33.38 | 33.39 | 33.25 |
| pl_f28b6 | 99.33 | 33.43 | 33.44 | 33.33 |
| pl_f28b7 | 114.01 | 33.46 | 33.47 | 33.32 |
| pl_f32b5 | 109.89 | 33.44 | 33.45 | 33.31 |
| pl_f32b6 | 129.05 | 33.49 | 33.49 | 33.36 |
| pl_f32b7 | 148.21 | 33.52 | 33.53 | 33.41 |
| pl_f64b3 | 276.91 | 33.50 | 33.51 | 33.43 |
| pl_f64b5 | 429.99 | 33.64 | 33.64 | 33.55 |
| pl_f64b7 | 583.07 | 33.73 | 33.74 | 33.62 |
| pl_f64b16 | 1271.94 | 33.85 | 33.86 | 33.74 |