google / automl

Google Brain AutoML
Apache License 2.0

Unable to run evaluation on TFLite - Shapes of all inputs must match #1171

Closed oliviawindsir closed 2 years ago

oliviawindsir commented 2 years ago

I wanted to quantize efficientdet-lite2 using this AutoML repo. In one of the issues, I saw people recommending running the notebook at efficientdet/tf2/tutorial.ipynb. I could not run it from that directory, so I copied the notebook up one level, into efficientdet/. Most of the steps ran fine until the step in section 2.2 that evaluates the quantized TFLite model.

The cell I was trying to run is shown below.

# Evaluate on validation set (takes about 10 mins for efficientdet-d0)
!python -m tf2.eval_tflite  \
    --model_name={MODEL}  --tflite_path={saved_model_dir}/int8.tflite \
    --val_file_pattern=tfrecord/val* \
    --val_json_file=annotations/instances_val2017.json --eval_samples=10

Running this evaluation cell in the notebook produced the following error:

2022-09-07 13:11:04.784592: I tensorflow/core/util/] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2022-09-07 13:11:06.602153: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-09-07 13:11:06.611957: W tensorflow/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory
2022-09-07 13:11:06.611984: W tensorflow/core/common_runtime/gpu/] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2022-09-07 13:11:06.615605: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Traceback (most recent call last):
  File "/usr/lib/python3.8/", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/", line 87, in _run_code
    exec(code, run_globals)
  File "/home/local/github_repo/automl/efficientdet/tf2/", line 203, in <module>
  File "/home/local/github_repo/automl/venv/lib/python3.8/site-packages/absl/", line 308, in run
    _run_main(main, args)
  File "/home/local/github_repo/automl/venv/lib/python3.8/site-packages/absl/", line 254, in _run_main
  File "/home/local/github_repo/automl/efficientdet/tf2/", line 170, in main
    detections = postprocess.generate_detections_from_nms_output(
  File "/home/local/github_repo/automl/efficientdet/tf2/", line 527, in generate_detections_from_nms_output
    return tf.stack(detections_bs, axis=-1, name='detections')
  File "/home/local/github_repo/automl/venv/lib/python3.8/site-packages/tensorflow/python/util/", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/local/github_repo/automl/venv/lib/python3.8/site-packages/tensorflow/python/framework/", line 7164, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes of all inputs must match: values[0].shape = [1,25] != values[1].shape = [1,1] [Op:Pack] name: detections

oliviawindsir commented 2 years ago

Upon investigating further, I found a mismatch in the values returned from the TFLite interpreter. The code that calls this lite_runner reads 3 of the 4 returned values and post-processes them:

nms_boxes_bs, nms_classes_bs, nms_scores_bs, _ =

However, the run() function of the LiteRunner class actually returns its outputs in the following order:

# TFLite model with post-processing.
# Four Outputs:
#   num_boxes: a float32 tensor of size 1 containing the number of
#     detected boxes
#   detection_scores: a float32 tensor of shape [1, num_boxes]
#     with class scores
#   detection_classes: a float32 tensor of shape [1, num_boxes]
#     with class indices
#   detection_boxes: a float32 tensor of shape [1, num_boxes, 4] with box
#     locations

I printed the returned values to confirm this structure, and they matched it:


[[1.90625   1.7070312 1.6367188 1.5195312 1.5       1.421875  1.421875
  1.34375   1.328125  1.328125  1.2617188 1.2617188 1.2460938 1.2304688
  1.2304688 1.2148438 1.2148438 1.203125  1.203125  1.203125  1.1914062
  1.1914062 1.1914062 1.1796875 1.1796875]]

[[ 0. 50. 50.  0. 78. 50. 66. 78. 46. 63. 50. 46. 49. 79. 46. 49. 49. 49.
  50. 49. 49. 49. 50. 49. 49.]]

[[[ 1.0870631e-01  6.0856193e-01  5.3415084e-01  7.8300840e-01]
  [ 5.3346699e-01  5.0022423e-02  6.0073441e-01  1.5560755e-01]
  [ 4.5116577e-01  9.1615178e-02  5.1579547e-01  2.0761868e-01]
  [ 4.0700871e-01 -7.6876581e-04  4.8332697e-01  9.9255890e-02]
  [ 2.7915275e-01 -3.0338019e-03  4.5737642e-01  3.0552655e-01]
  [ 4.7963738e-01 -1.8516928e-04  5.6779981e-01  1.3329524e-01]
  [ 3.6036891e-01 -4.5481920e-03  6.5900868e-01  5.2189851e-01]
  [ 2.7603590e-01  3.6531299e-01  4.9422097e-01  6.1735934e-01]
  [ 4.1552711e-01  2.1859461e-01  4.7106594e-01  2.6934478e-01]
  [ 6.7321286e-03 -5.0023943e-04  2.3459271e-01  9.4630346e-02]
  [ 4.0802893e-01  3.9650986e-01  4.4082054e-01  4.5643839e-01]
  [ 4.2508441e-01  1.9213863e-01  4.7403681e-01  2.3016869e-01]
  [ 2.7977206e-02  4.4822890e-01  1.2727207e-01  4.7068286e-01]
  [ 2.5478679e-01  5.0784653e-01  3.2476139e-01  5.7413596e-01]
  [ 4.0729308e-01  3.4307864e-01  4.6384257e-01  4.0065721e-01]
  [ 8.4463120e-02  5.3181601e-01  1.7519653e-01  5.6020224e-01]
  [ 5.0127631e-01  1.9216484e-01  5.4273134e-01  2.6792631e-01]
  [ 7.4091956e-02  7.5158542e-01  1.4985339e-01  7.7173668e-01]
  [ 9.6471682e-02  9.7586125e-02  2.0205681e-01  1.9064963e-01]
  [ 6.7943789e-02  7.4670768e-01  2.1726370e-01  7.9601979e-01]
  [ 6.2376749e-02  4.6040615e-01  1.3951683e-01  4.8055735e-01]
  [ 7.5439975e-02  8.2979095e-01  2.1995910e-01  8.6491621e-01]
  [ 5.0147271e-01  1.2149759e-02  5.9189928e-01  1.5265068e-01]
  [ 6.7584962e-02  4.8720428e-01  1.1498937e-01  5.0176984e-01]
  [ 6.9879174e-02  8.0105811e-01  1.6226369e-01  8.2120937e-01]]]
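
One possible workaround, given only as a sketch, is to reorder the four outputs before they are unpacked as `nms_boxes_bs, nms_classes_bs, nms_scores_bs, _`. The code below assumes the runner really returns `(num_boxes, scores, classes, boxes)`, as described in the LiteRunner comment above. The helper names `shape_of` and `reorder_postprocessed_outputs` are hypothetical and are not part of the repo. The shape checks are there so that a different output order fails loudly instead of reaching `tf.stack`.

```python
def shape_of(value):
    """Return the shape of an array-like object or a rectangular nested list."""
    if hasattr(value, "shape"):  # NumPy arrays and TF tensors expose .shape
        return tuple(value.shape)
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def reorder_postprocessed_outputs(outputs):
    """Map (num_boxes, scores, classes, boxes) to (boxes, classes, scores, num_boxes).

    Hypothetical helper: the input order is an assumption taken from the
    LiteRunner comment quoted above, not something verified against the model.
    """
    num_boxes, scores, classes, boxes = outputs

    # Boxes are the only output expected to have shape [1, N, 4].
    boxes_shape = shape_of(boxes)
    if len(boxes_shape) != 3 or boxes_shape[-1] != 4:
        raise ValueError(f"expected boxes of shape [1, N, 4], got {boxes_shape}")

    # Scores and classes must both have shape [1, N] to be stackable later.
    expected = boxes_shape[:2]
    for name, value in (("scores", scores), ("classes", classes)):
        if shape_of(value) != expected:
            raise ValueError(
                f"expected {name} of shape {expected}, got {shape_of(value)}")

    return boxes, classes, scores, num_boxes


# Tiny stand-in for the interpreter outputs, with two detections.
outputs = (
    [2.0],                                           # num_boxes
    [[0.9, 0.8]],                                    # detection_scores
    [[0.0, 50.0]],                                   # detection_classes
    [[[0.1, 0.6, 0.5, 0.8], [0.5, 0.05, 0.6, 0.15]]],  # detection_boxes
)
nms_boxes_bs, nms_classes_bs, nms_scores_bs, _ = reorder_postprocessed_outputs(outputs)
```

If the output order differs between converter versions, looking the tensors up by name from the interpreter's output details may be more robust than relying on position. That has not been checked here.
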

fsx950223 commented 2 years ago
