tensorflow / models

Models and examples built with TensorFlow

No OpKernel was registered to support Op 'TPUOrdinalSelector' used by node TPUOrdinalSelector #7501

Open Jconn opened 5 years ago

Jconn commented 5 years ago

System information

I have trained a model on my GPU and want to export the model to a TPU for inference.

I am running the script found at

models/research/object_detection/tpu_exporters/export_saved_model_tpu.py

When I run the following command:

python3 object_detection/tpu_exporters/export_saved_model_tpu.py \
  --pipeline_config_file=object_detection/output/models/model/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config \
  --ckpt_path=object_detection/output/models/model/model.ckpt-39688 \
  --export_dir=/tmp/out \
  --input_type="image_tensor" \
  --input_placeholder_name=image_tensor:0

I get the following exception:
Traceback (most recent call last):
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1348, in _run_fn
    self._extend_graph()
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1388, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'TPUOrdinalSelector' used by {{node TPUOrdinalSelector}} with these attrs: []
Registered devices: [CPU, GPU]
Registered kernels:
  <no registered kernels>

     [[TPUOrdinalSelector]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "object_detection/tpu_exporters/export_saved_model_tpu.py", line 54, in <module>
    tf.app.run()
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/johnconn/.local/lib/python3.6/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/home/johnconn/.local/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "object_detection/tpu_exporters/export_saved_model_tpu.py", line 47, in main
    FLAGS.input_type, FLAGS.use_bfloat16)
  File "/home/johnconn/utils/models/research/object_detection/tpu_exporters/export_saved_model_tpu_lib.py", line 80, in export
    sess.run(init_op)
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'TPUOrdinalSelector' used by node TPUOrdinalSelector (defined at /home/johnconn/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) with these attrs: []
Registered devices: [CPU, GPU]
Registered kernels:
  <no registered kernels>

     [[TPUOrdinalSelector]]
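
The failure happens because the session tries to place a TPU-only op on a host that only registers CPU and GPU kernels. If you want to check whether other TPU-only ops are baked into an exported graph, a rough way to spot them is to grep the op names out of a GraphDef pbtxt dump (the sample pbtxt snippet and file contents below are hypothetical, for illustration only):

```python
import re

# Minimal stand-in for a GraphDef pbtxt dump; a real dump could come from
# e.g. tf.train.write_graph(..., as_text=True).
PBTXT = '''
node { name: "image_tensor" op: "Placeholder" }
node { name: "TPUOrdinalSelector" op: "TPUOrdinalSelector" }
node { name: "call" op: "TPUPartitionedCall" }
'''

def tpu_only_ops(pbtxt):
    """Return op types in a GraphDef pbtxt dump whose names look TPU-specific."""
    ops = re.findall(r'op:\s*"([^"]+)"', pbtxt)
    return sorted({op for op in ops if op.startswith("TPU")})

print(tpu_only_ops(PBTXT))  # ['TPUOrdinalSelector', 'TPUPartitionedCall']
```

Any op this flags will fail to place on a CPU/GPU-only host, just like the traceback above.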

It looks like @shreyaaggarwal encountered the same issue, posting it here https://github.com/tensorflow/models/issues/4283

saberkun commented 5 years ago

What's the TF version you use? This feature is available in 1.14 and later versions.

Jconn commented 5 years ago

I am using 1.14.0

I've been doing transfer learning with ssd_resnet_50_fpn_coco as the base, using the object detection api model_main.py script.

Edit:

I tried this on my local computer and on a Google Colab notebook with the TPU runtime enabled. No success in either environment.

bourdakos1 commented 4 years ago

I'm getting the same error with TF 1.14 and 1.15.

nkise commented 4 years ago

Same issue. TF 1.14, 1.15, tf-nightly

bourdakos1 commented 4 years ago

I was never able to get export_saved_model_tpu.py working. However, I was able to get the model running on a Coral Edge TPU by exporting a TFLite model with the following:

python object_detection/export_tflite_ssd_graph.py \
  --pipeline_config_path=$PIPELINE_CONFIG_PATH \
  --trained_checkpoint_prefix=$TRAINED_CHECKPOINT_PREFIX \
  --output_directory=$OUTPUT_DIRECTORY \
  --add_postprocessing_op=true

tflite_convert \
  --output_file=$OUTPUT_FILE \
  --graph_def_file=$GRAPH_DEF_FILE \
  --inference_type=QUANTIZED_UINT8 \
  --input_arrays="normalized_input_image_tensor" \
  --output_arrays="TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3" \
  --mean_values=128 \
  --std_dev_values=128 \
  --input_shapes=1,300,300,3 \
  --change_concat_input_ranges=false \
  --allow_nudging_weights_to_use_fast_gemm_kernel=true \
  --allow_custom_ops

then compiling the model with the Edge TPU compiler: https://coral.ai/docs/edgetpu/compiler/
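
For reference, the `--inference_type=QUANTIZED_UINT8 --mean_values=128 --std_dev_values=128` flags above correspond to the TFLite dequantization convention real = (quantized - mean) / std. A quick sanity check (plain Python, no TF needed) of the input range those values imply:

```python
def dequantize(q, mean=128.0, std=128.0):
    """Map a uint8 input value back to the real-valued range,
    per the TFLite convention real = (quantized - mean) / std."""
    return (q - mean) / std

# With mean=128 and std=128, the uint8 range [0, 255] maps to roughly [-1, 1):
print(dequantize(0))    # -1.0
print(dequantize(128))  # 0.0
print(dequantize(255))  # 0.9921875
```

So images fed to the quantized model are expected to be normalized to approximately [-1, 1], which matches the SSD preprocessing used by the Object Detection API.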

nkise commented 4 years ago

> I was never able to get the saved_model_tpu.py working. However, I was able to get the model working on a coral edge tpu by exporting a tflite model using the following: [...]

@bourdakos1 What model were you exporting?

bourdakos1 commented 4 years ago

@nkise ssd mobilenet v1

nkise commented 4 years ago

@bourdakos1 Thanks! What TF version did you use?

bourdakos1 commented 4 years ago

I haven’t thoroughly tested both, but 1.14 and 1.15 should both work