google / automl

Google Brain AutoML
Apache License 2.0

Cannot export to TFLite (from graph or saved_model), even with TF 2.2.0-rc4 or dev2.2.506 (missing stem conv2d in Bx), or: operand #0 for input image is tf.quint8, not the right tensor type #366

Closed: lessw2020 closed this issue 4 years ago

lessw2020 commented 4 years ago

Was happy to see the latest check-in addressing the TFLite issue. However, I tested today under the new requirements of TF 2.2.0-rc4 and still hit the exact same issue as before (issue #341). I'd really appreciate revisiting this for a permanent TFLite fix, as I need it for mobile as soon as possible.

Same error as before - I tested both D0 and D3 for reference:

Load and save D3, then run:

    !python model_inspect.py --runmode=saved_model --model_name=efficientdet-d3 \
      --ckpt_path=efficientdet-d3 --saved_model_dir=savedmodeldir \
      --tflite_path=efficientdet-d3.tflite
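For reference, the step that fails (per the traceback in the log below) is the SavedModel-to-TFLite conversion inside inference.py's to_tflite(). A minimal sketch of the roughly equivalent call, assuming TF 2.2.x (the real to_tflite() also passes explicit output_arrays and may set other converter options):

    import tensorflow as tf

    # Directory written by the model_inspect.py command above.
    saved_model_dir = "savedmodeldir"

    # inference.py goes through the TF1-style converter (see lite.py in the
    # traceback), reproduced here via the compat.v1 API.
    converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()  # this is the call that raises ConverterError

    with open("efficientdet-d3.tflite", "wb") as f:
        f.write(tflite_model)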

Error log (note: pages and pages of optimizer output that "did nothing" are clipped here except for the end; the error is the same, regarding quint8):

    2020-05-06 19:20:56.833535: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:797] Optimization results for grappler item: cond_1128_true_29962
    2020-05-06 19:20:56.833545: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:799]   function_optimizer: function_optimizer did nothing. time = 0.001ms.
    2020-05-06 19:20:56.833556: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:799]   function_optimizer: function_optimizer did nothing. time = 0ms.
    [identical "function_optimizer did nothing" entries for grappler items cond_657_true_22426, cond_120_true_13834, cond_665_true_22554, cond_80_true_13194, cond_966_true_27370, cond_463_true_19322, cond_624_true_21898, cond_353_true_17562, cond_658_true_22442, and cond_1038_true_28522 clipped]
    2020-05-06 19:20:56.833768: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:797] Optimization results for grappler item: while_body_23
    2020-05-06 19:20:56.833779: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:799]   function_optimizer: Graph size after: 38 nodes (0), 41 edges (0), time = 1.002ms.
    2020-05-06 19:20:56.833790: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:799]   function_optimizer: Graph size after: 38 nodes (0), 41 edges (0), time = 0.813ms.
    2020-05-06 19:20:59.979140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    [this NUMA warning repeats many more times below; the repeats are clipped]
    2020-05-06 19:20:59.979589: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
    2020-05-06 19:20:59.979740: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
    2020-05-06 19:20:59.980637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
    pciBusID: 0000:00:04.0 name: Tesla P4 computeCapability: 6.1
    coreClock: 1.1135GHz coreCount: 20 deviceMemorySize: 7.43GiB deviceMemoryBandwidth: 178.99GiB/s
    2020-05-06 19:20:59.980688: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    [matching dso_loader lines for libcublas.so.10, libcufft.so.10, libcurand.so.10, libcusolver.so.10, libcusparse.so.10, and libcudnn.so.7 clipped]
    2020-05-06 19:20:59.981583: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
    2020-05-06 19:20:59.981630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
    2020-05-06 19:20:59.981645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0
    2020-05-06 19:20:59.981657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N
    2020-05-06 19:20:59.982414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6966 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:00:04.0, compute capability: 6.1)
    2020-05-06 19:21:00.221950: E tensorflow/core/grappler/grappler_item_builder.cc:668] Init node efficientnet-b3/stem/conv2d/kernel/Assign doesn't exist in graph
    Traceback (most recent call last):
      File "model_inspect.py", line 485, in <module>
        tf.app.run(main)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 40, in run
        _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
      File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
        _run_main(main, args)
      File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
        sys.exit(main(argv))
      File "model_inspect.py", line 479, in main
        trace_filename=FLAGS.trace_filename)
      File "model_inspect.py", line 432, in run_model
        self.export_saved_model(config_dict)
      File "model_inspect.py", line 150, in export_saved_model
        driver.export(self.saved_model_dir, tflite_path=self.tflite_path)
      File "/content/automl/efficientdet/inference.py", line 736, in export
        tflite_model = self.to_tflite(output_dir)
      File "/content/automl/efficientdet/inference.py", line 700, in to_tflite
        return converter.convert()
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/lite.py", line 1084, in convert
        **converter_kwargs)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/convert.py", line 496, in toco_convert_impl
        enable_mlir_converter=enable_mlir_converter)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/convert.py", line 227, in toco_convert_protos
        raise ConverterError("See console for info.\n%s\n%s\n" % (stdout, stderr))
    tensorflow.lite.python.convert.ConverterError: See console for info.
    2020-05-06 19:21:02.747044: W tensorflow/compiler/mlir/lite/python/graphdef_to_tfl_flatbuffer.cc:89] Ignored output_format.
    2020-05-06 19:21:02.747101: W tensorflow/compiler/mlir/lite/python/graphdef_to_tfl_flatbuffer.cc:95] Ignored drop_control_dependency.
    2020-05-06 19:21:03.119060: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F
    2020-05-06 19:21:03.123495: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2000160000 Hz
    2020-05-06 19:21:03.123708: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56094aa29640 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2020-05-06 19:21:03.123733: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
    2020-05-06 19:21:03.125628: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
    2020-05-06 19:21:03.198473: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56094aa284c0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
    2020-05-06 19:21:03.198502: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla P4, Compute Capability 6.1
    [a second round of GPU discovery follows (Found device 0, dso_loader, Adding visible gpu devices, interconnect matrix); it matches the one above except for timestamps, so it is clipped]
    2020-05-06 19:21:03.210246: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
    2020-05-06 19:21:03.210283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6697 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:00:04.0, compute capability: 6.1)
    loc(callsite("strided_slice_1"("/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py":299:0) at callsite("/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py":324:0 at callsite("/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/convert_saved_model.py":198:0 at callsite("/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/lite.py":837:0 at callsite("/content/automl/efficientdet/inference.py":696:0 at callsite("/content/automl/efficientdet/inference.py":736:0 at callsite("model_inspect.py":150:0 at callsite("model_inspect.py":432:0 at callsite("model_inspect.py":479:0 at "/usr/local/lib/python3.6/dist-packages/absl/app.py":250:0)))))))))): error: 'tfl.strided_slice' op operand #0 must be tensor of 32-bit float or 32-bit integer or 64-bit integer or 8-bit integer or QI8 type or QUI8 type or 1-bit integer values, but got 'tensor<1x896x896x3x!tf.quint8>'
    Traceback (most recent call last):
      File "/usr/local/bin/toco_from_protos", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/lite/toco/python/toco_from_protos.py", line 93, in main
        app.run(main=execute, argv=[sys.argv[0]] + unparsed)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
        _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
      File "/usr/local/lib/python2.7/dist-packages/absl/app.py", line 300, in run
        _run_main(main, args)
      File "/usr/local/lib/python2.7/dist-packages/absl/app.py", line 251, in _run_main
        sys.exit(main(argv))
      File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/lite/toco/python/toco_from_protos.py", line 56, in execute
        enable_mlir_converter)
    Exception: /usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py:299:3: error: 'tfl.strided_slice' op operand #0 must be tensor of 32-bit float or 32-bit integer or 64-bit integer or 8-bit integer or QI8 type or QUI8 type or 1-bit integer values, but got 'tensor<1x896x896x3x!tf.quint8>'
      return loader.load(sess, tags, import_scope, **saver_kwargs)
      ^
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:324:7: note: called from
      return func(*args, **kwargs)
      ^
    /usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/convert_saved_model.py:198:5: note: called from
      loader.load(sess, meta_graph.meta_info_def.tags, saved_model_dir)
      ^
    /usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/lite.py:837:34: note: called from
      output_arrays, tag_set, signature_key)
      ^
    /content/automl/efficientdet/inference.py:696:9: note: called from
      output_arrays=[self.signitures['prediction'].op.name])
      ^
    /content/automl/efficientdet/inference.py:736:7: note: called from
      tflite_model = self.to_tflite(output_dir)
      ^
    model_inspect.py:150:5: note: called from
      driver.export(self.saved_model_dir, tflite_path=self.tflite_path)
      ^
    model_inspect.py:432:9: note: called from
      self.export_saved_model(config_dict)
      ^
    model_inspect.py:479:7: note: called from
      trace_filename=FLAGS.trace_filename)
      ^
    /usr/local/lib/python3.6/dist-packages/absl/app.py:250:5: note: called from
      sys.exit(main(argv))
      ^

I'm running on Colab, if that matters. Thanks!
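In case it helps triage: the converter complains that operand #0 (the image input) is tf.quint8. Here is a minimal sketch for double-checking the input dtypes recorded in the exported SavedModel, assuming the standard "serve" tag set (the export here may use a different one):

    import tensorflow as tf
    from tensorflow.python.tools import saved_model_utils

    # Prints each signature's input tensors and their dtypes; if the image
    # input really is quint8, it should show up here.
    meta_graph = saved_model_utils.get_meta_graph_def("savedmodeldir", "serve")
    for sig_name, sig in meta_graph.signature_def.items():
        for input_name, tensor_info in sig.inputs.items():
            print(sig_name, input_name, tf.dtypes.as_dtype(tensor_info.dtype))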

lessw2020 commented 4 years ago

Note: on a hunch, I upgraded to 2.2.0-dev506. For D1 to TFLite, I get this:

    2020-05-06 20:58:36.688419: E tensorflow/core/grappler/grappler_item_builder.cc:668] Init node efficientnet-b1/stem/conv2d/kernel/Assign doesn't exist in graph

lessw2020 commented 4 years ago

Same for D0:

    2020-05-06 21:15:17.695161: E tensorflow/core/grappler/grappler_item_builder.cc:668] Init node efficientnet-b0/stem/conv2d/kernel/Assign doesn't exist in graph

mingxingtan commented 4 years ago

Could you try pip install tf-nightly? @lessw2020
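(For anyone trying this, a quick sanity check that the nightly build is the one actually being imported after the install:)

    import tensorflow as tf
    print(tf.__version__)  # nightly builds report a "-dev"-suffixed version string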

mingxingtan commented 4 years ago

Let's use the previous issue: https://github.com/google/automl/issues/364