tensorflow / models

Models and examples built with TensorFlow

TFLite does not support SquaredDifference/TFLite_Detection_PostProcess? #5020

Closed (aifollower closed this issue 4 years ago)

aifollower commented 6 years ago

Describe the problem

2018-08-07 13:21:07.345473: I tensorflow/contrib/lite/toco/import_tensorflow.cc:1053] Converting unsupported operation: SquaredDifference
2018-08-07 13:21:07.361241: I tensorflow/contrib/lite/toco/import_tensorflow.cc:1053] Converting unsupported operation: TFLite_Detection_PostProcess

tensorflowbutler commented 6 years ago

Thank you for your post. We noticed you have not filled out the following fields in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.

- What is the top-level directory of the model you are using
- Have I written custom code
- OS Platform and Distribution
- TensorFlow installed from
- TensorFlow version
- Bazel version
- CUDA/cuDNN version
- GPU model and memory
- Exact command to reproduce

achowdhery commented 6 years ago

Please provide the sample frozen graph for the model you are trying to convert and instructions to reproduce. The custom op TFLite_Detection_PostProcess does exist in the latest TensorFlow code: tensorflow/tensorflow/contrib/lite/kernels/detection_postprocess.cc

kate-kate commented 6 years ago

@achowdhery Hi, I have the same problem with TFLite_Detection_PostProcess when running this command from here:

bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3'  \
--inference_type=FLOAT \
--allow_custom_ops

I used the ssd_mobilenet_v1_coco pretrained model and created the frozen graph with export_tflite_ssd_graph.py. At first I tried running the toco command for the quantized model, as described in the article "Training and serving a realtime mobile object detector", and faced the same error. Thank you for your help. I can also provide my frozen graph.

achowdhery commented 6 years ago

@kate-kate The steps in the second part of the tutorial do not require SquaredDifference. The other custom op, TFLite_Detection_PostProcess, is already present in the TensorFlow code base, so it will give a warning but will still work properly and produce the detect.tflite file.

What version of TensorFlow are you using?
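
For anyone verifying the result later: the custom op is registered in the TFLite interpreter's default op resolver, so a converted detect.tflite can be smoke-tested from Python. A minimal sketch, assuming a float model at detect.tflite (the path is an assumption; on TF 1.x before 1.13 the interpreter lives under tf.contrib.lite instead of tf.lite):

import numpy as np
import tensorflow as tf

# allocate_tensors() fails if any op in the model, including the custom
# TFLite_Detection_PostProcess, is missing from the op resolver.
interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()

# Run one dummy inference with the expected input shape and dtype.
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=inp["dtype"]))
interpreter.invoke()

for out in interpreter.get_output_details():
    print(out["name"], out["shape"])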

kate-kate commented 6 years ago

@achowdhery I use 1.9.0 built from source. I also checked the file responsible for this op, and it is indeed present. But as I also noticed in #5019, even if the command creates the tflite file, Android then throws the error "Cannot create interpreter: Didn't find custom op for name 'TFLite_Detection_PostProcess'". That's why I decided the problem occurred during conversion. The model from the ML Kit tutorial works fine, though.

achowdhery commented 6 years ago

@kate-kate Please try TensorFlow 1.10 or build TensorFlow from head; the code was not released in TensorFlow 1.9.0.

kate-kate commented 6 years ago

@achowdhery Tried with TensorFlow 1.10 built from source. Command output in console:

INFO: Build completed successfully, 316 total actions
INFO: Running command line: bazel-bin/tensorflow/contrib/lite/toco/toco '--input_file=/Users/kate/python/codes2/tflite_res/tflite_graph.pb' '--output_file=/Users/kate/python/codes2/tflite_res/detect4.tflite' '--input_shapes=1,300,300,3' '--input_arrays=normalized_input_image_tensor' '--output_arrays=TFLite_Detectio
2018-08-12 11:57:39.906797: I tensorflow/contrib/lite/toco/import_tensorflow.cc:1055] Converting unsupported operation: TFLite_Detection_PostProcess
...

Error while running on Android (via Firebase ML Kit)

08-12 12:10:05.891 10381-10381/com.google.firebase.codelab.mlkit_custommodel W/System.err: Caused by: com.google.firebase.ml.common.FirebaseMLException: Internal error has occurred when executing Firebase ML tasks
08-12 12:10:05.896 10381-10381/com.google.firebase.codelab.mlkit_custommodel W/System.err:     at com.google.android.gms.internal.firebase_ml.zzgp.zza(Unknown Source:15)
        ... 5 more
    Caused by: java.lang.IllegalArgumentException: Cannot create interpreter: Didn't find custom op for name 'TFLite_Detection_PostProcess'
    Registration failed.
        at org.tensorflow.lite.NativeInterpreterWrapper.createInterpreter(Native Method)
achowdhery commented 6 years ago

@kate-kate Thanks for updating us that there is some incompatibility with ML Kit at the moment. Are you able to try with a bazel build and the Android SDK/NDK? We have made a Dockerfile available to make installation easier: https://github.com/tensorflow/models/blob/master/research/object_detection/dockerfiles/android/Dockerfile

kate-kate commented 6 years ago

@achowdhery I've tried with a bazel build and the Android SDK/NDK. First, I tried it without Docker and faced an error like C++ compilation of rule '@double_conversion//:double-conversion' failed (Exit 1). Then I decided to use your Dockerfile, and after executing the bazel build -c opt --local_resources 4096,4.0,1.0 -j 1 //tensorflow/examples/android:tensorflow_demo command that I found in this article, it threw this error:

ERROR: /tensorflow/tensorflow/compiler/xla/service/cpu/BUILD:472:1: C++ compilation of rule '//tensorflow/compiler/xla/service/cpu:runtime_conv2d_mkl' failed (Exit 1)
In file included from tensorflow/compiler/xla/service/cpu/runtime_conv2d_mkl.cc:17:
In file included from ./tensorflow/compiler/xla/executable_run_options.h:20:
In file included from ./tensorflow/compiler/xla/types.h:22:
In file included from ./tensorflow/core/framework/numeric_types.h:20:
In file included from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1:
In file included from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:129:
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorBroadcasting.h:108:15: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
  bool nByOne = false, oneByN = false;

Then I tried to run it again with the --verbose_failures option, and the error description was:

ERROR: /root/.cache/bazel/_bazel_root/68a62076e91007a7908bc42a32e4cff9/external/highwayhash/BUILD.bazel:23:1: C++ compilation of rule '@highwayhash//:arch_specific' failed (Exit 1): clang failed: error executing command
  (cd /root/.cache/bazel/_bazel_root/68a62076e91007a7908bc42a32e4cff9/execroot/org_tensorflow && \
  exec env - \
    ANDROID_BUILD_TOOLS_VERSION=27.0.3 \
    ANDROID_NDK_API_LEVEL=14 \
    ANDROID_NDK_HOME=/opt/android-ndk-r14b \
    ANDROID_SDK_API_LEVEL=27 \
    ANDROID_SDK_HOME=/opt/android-sdk-linux \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/android-sdk-linux/tools:/opt/android-sdk-linux/tools/bin:/opt/android-sdk-linux/platform-tools \
    PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/usr/bin/python \
    PYTHON_LIB_PATH=/usr/lib/python2.7/dist-packages \
    TF_DOWNLOAD_CLANG=0 \
    TF_NEED_CUDA=0 \
    TF_NEED_OPENCL_SYCL=0 \
  external/androidndk/ndk/toolchains/llvm/prebuilt/linux-x86_64/bin/clang -gcc-toolchain external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64 -fpic -ffunction-sections -funwind-tables -fstack-protector-strong -Wno-invalid-command-line-argument -Wno-unused-command-line-argument -no-canonical-prefixes -fno-integrated-as -target armv7-none-linux-androideabi '-march=armv7-a' '-mfloat-abi=softfp' '-mfpu=vfpv3-d16' -mthumb -Os -g -DNDEBUG -MD -MF bazel-out/android-armeabi-v7a-opt/bin/external/highwayhash/_objs/arch_specific/external/highwayhash/highwayhash/arch_specific.d '-frandom-seed=bazel-out/android-armeabi-v7a-opt/bin/external/highwayhash/_objs/arch_specific/external/highwayhash/highwayhash/arch_specific.o' -iquote external/highwayhash -iquote bazel-out/android-armeabi-v7a-opt/genfiles/external/highwayhash -iquote external/bazel_tools -iquote bazel-out/android-armeabi-v7a-opt/genfiles/external/bazel_tools '--sysroot=external/androidndk/ndk/platforms/android-14/arch-arm' -isystem external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include -isystem external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/libs/armeabi-v7a/include -isystem external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/backward -c external/highwayhash/highwayhash/arch_specific.cc -o bazel-out/android-armeabi-v7a-opt/bin/external/highwayhash/_objs/arch_specific/external/highwayhash/highwayhash/arch_specific.o)
In file included from external/highwayhash/highwayhash/arch_specific.cc:15:
external/highwayhash/highwayhash/arch_specific.h:115:20: warning: alias declarations are a C++11 extension [-Wc++11-extensions]
using TargetBits = unsigned;
                   ^
external/highwayhash/highwayhash/arch_specific.cc:37:14: error: use of undeclared identifier 'nullptr'
      return nullptr;  // zero, multiple, or unknown bits
             ^
1 warning and 1 error generated.
Target //tensorflow/examples/android:tensorflow_demo failed to build
jdduke commented 6 years ago

Hi @kate-kate, with your custom TFLite build, are you using these instructions for pointing MLKit to that build? The 1.10 TFLite release should be out shortly, and MLKit will likely migrate shortly after that.

jefby commented 6 years ago

@kate-kate You can git checkout r1.10; I build with this command:

bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3'  \
--inference_type=FLOAT \
--allow_custom_ops

It works.

kate-kate commented 6 years ago

@jefby Can you please tell me which pretrained model configuration you used (or maybe it was a custom model), and how you retrieved your frozen graph? Also your OS; I will just try to do everything exactly as you did.

kate-kate commented 6 years ago

@jdduke Thank you for the advice, but when I use the command from your link, this happens:

ERROR: /Users/kate/tensorflow/tensorflow/contrib/lite/kernels/internal/BUILD:248:1: C++ compilation of rule '//tensorflow/contrib/lite/kernels/internal:quantization_util' failed (Exit 1)
In file included from tensorflow/contrib/lite/kernels/internal/quantization_util.cc:17:
In file included from external/androidndk/ndk/sources/cxx-stl/llvm-libc++/include/cmath:305:
In file included from external/androidndk/ndk/sources/android/support/include/math.h:32:
external/androidndk/ndk/sources/cxx-stl/llvm-libc++/include/math.h:1302:93: error: no member named 'log2f' in the global namespace
inline _LIBCPP_INLINE_VISIBILITY float       log2(float __lcpp_x) _NOEXCEPT       {return ::log2f(__lcpp_x);}
                                                                                          ~~^
external/androidndk/ndk/sources/cxx-stl/llvm-libc++/include/math.h:1303:93: error: no member named 'log2l' in the global namespace
inline _LIBCPP_INLINE_VISIBILITY long double log2(long double __lcpp_x) _NOEXCEPT {return ::log2l(__lcpp_x);}
                                                                                          ~~^
external/androidndk/ndk/sources/cxx-stl/llvm-libc++/include/math.h:1308:38: error: call to 'log2' is ambiguous
log2(_A1 __lcpp_x) _NOEXCEPT {return ::log2((double)__lcpp_x);}
                                     ^~~~~~
external/androidndk/ndk/sources/cxx-stl/llvm-libc++/include/math.h:1302:46: note: candidate function
inline _LIBCPP_INLINE_VISIBILITY float       log2(float __lcpp_x) _NOEXCEPT       {return ::log2f(__lcpp_x);}
                                             ^
external/androidndk/ndk/sources/cxx-stl/llvm-libc++/include/math.h:1303:46: note: candidate function
inline _LIBCPP_INLINE_VISIBILITY long double log2(long double __lcpp_x) _NOEXCEPT {return ::log2l(__lcpp_x);}
                                             ^
3 errors generated.
Target //tensorflow/contrib/lite/java:tensorflow-lite failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 48.455s, Critical Path: 14.15s
INFO: 21 processes: 21 local.
FAILED: Build did NOT complete successfully

😔

kate-kate commented 6 years ago

Am I missing something that needs to be updated?

kate-kate commented 6 years ago

@jdduke After downgrading my NDK to r16b (there was a warning), I get:

ERROR: /Users/kate/tensorflow/tensorflow/contrib/lite/kernels/internal/BUILD:368:1: C++ compilation of rule '//tensorflow/contrib/lite/kernels/internal:neon_tensor_utils' failed (Exit 1)
In file included from tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.cc:17:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/algorithm:62:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/stl_algo.h:66:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/random:38:
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cmath:1118:11: error: no member named 'log2' in the global namespace
  using ::log2;
        ~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cmath:1119:11: error: no member named 'log2f' in the global namespace
  using ::log2f;
        ~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cmath:1120:11: error: no member named 'log2l' in the global namespace
  using ::log2l;
        ~~^
In file included from tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.cc:17:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/algorithm:62:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/stl_algo.h:66:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/random:40:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/string:40:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/char_traits.h:40:
In file included from external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/postypes.h:40:
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:164:11: error: no member named 'vfwscanf' in the global namespace
  using ::vfwscanf;
        ~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:170:11: error: no member named 'vswscanf' in the global namespace
  using ::vswscanf;
        ~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:174:11: error: no member named 'vwscanf' in the global namespace
  using ::vwscanf;
        ~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:191:11: error: no member named 'wcstof' in the global namespace
  using ::wcstof;
        ~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:280:14: error: no member named 'wcstof' in namespace 'std'
  using std::wcstof;
        ~~~~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:283:14: error: no member named 'vfwscanf' in namespace 'std'; did you mean 'fwscanf'?
  using std::vfwscanf;
        ~~~~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:148:11: note: 'fwscanf' declared here
  using ::fwscanf;
          ^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:286:14: error: no member named 'vswscanf' in namespace 'std'; did you mean 'swscanf'?
  using std::vswscanf;
        ~~~~~^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:160:11: note: 'swscanf' declared here
  using ::swscanf;
          ^
external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cwchar:289:14: error: no member named 'vwscanf' in namespace 'std'
  using std::vwscanf;
        ~~~~~^
In file included from tensorflow/contrib/lite/kernels/internal/reference/portable_tensor_utils.cc:19:
./tensorflow/contrib/lite/builtin_op_data.h:131:9: warning: empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
typedef struct {
        ^
./tensorflow/contrib/lite/builtin_op_data.h:134:9: warning: empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
typedef struct {
        ^
./tensorflow/contrib/lite/builtin_op_data.h:180:9: warning: empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
typedef struct {
        ^
./tensorflow/contrib/lite/builtin_op_data.h:183:9: warning: empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
typedef struct {
        ^
./tensorflow/contrib/lite/builtin_op_data.h:222:9: warning: empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
typedef struct {
        ^
5 warnings and 11 errors generated.
Target //tensorflow/contrib/lite/java:tensorflow-lite failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 24.643s, Critical Path: 16.34s
INFO: 6 processes: 6 local.
FAILED: Build did NOT complete successfully
jefby commented 6 years ago

@kate-kate I used the ssd_mobilenet_v1_coco pretrained model, and used the command

    python object_detection/export_tflite_ssd_graph.py --pipeline_config_path $CONFIG_FILE  --trained_checkpoint_prefix $CHECKPOINT_PATH --output_directory /tmp/tflite/ --add_postprocessing_op=true

to freeze the graph and get a model compatible with TFLite. Then

    bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
    --input_file=$OUTPUT_DIR/tflite_graph.pb \
    --output_file=$OUTPUT_DIR/detect.tflite \
    --input_shapes=1,300,300,3 \
    --input_arrays=normalized_input_image_tensor \
    --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3'  \
    --inference_type=FLOAT \
    --allow_custom_ops

gets me the float-version tflite. But when I use the command

    bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
    --input_file=$OUTPUT_DIR/tflite_graph.pb \
    --output_file=$OUTPUT_DIR/detect.tflite \
    --input_shapes=1,300,300,3 \
    --input_arrays=normalized_input_image_tensor \
    --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
    --inference_type=QUANTIZED_UINT8 \
    --mean_values=128 \
    --std_values=128 \
    --change_concat_input_ranges=false \
    --allow_custom_ops

I get an incorrect model: its size is zero. I don't know why.
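
A likely cause of the zero-size quantized model: QUANTIZED_UINT8 conversion needs quantization ranges, which only exist if the graph was trained with quantization-aware training (fake-quant nodes). As a sketch of a workaround, assuming a TF 1.x build and the usual file names; the (0, 6) dummy range is an assumption and will hurt accuracy, so quantization-aware training is the real fix:

import tensorflow as tf

# TF 1.x converter API; equivalent to the toco/tflite_convert commands above.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="tflite_graph.pb",
    input_arrays=["normalized_input_image_tensor"],
    output_arrays=[
        "TFLite_Detection_PostProcess",
        "TFLite_Detection_PostProcess:1",
        "TFLite_Detection_PostProcess:2",
        "TFLite_Detection_PostProcess:3",
    ],
    input_shapes={"normalized_input_image_tensor": [1, 300, 300, 3]},
)
converter.allow_custom_ops = True
converter.inference_type = tf.uint8  # QUANTIZED_UINT8
# (mean, std): 128/128 maps uint8 [0, 255] input to roughly [-1, 1].
converter.quantized_input_stats = {"normalized_input_image_tensor": (128, 128)}
# Dummy ranges for tensors without fake-quant min/max (assumed values).
converter.default_ranges_stats = (0, 6)

with open("detect_quant.tflite", "wb") as f:
    f.write(converter.convert())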

jefby commented 6 years ago

@kate-kate

You can set your NDK api_level to 21 in the WORKSPACE file:

android_sdk_repository(
    name = "androidsdk",
    api_level = 23,
    build_tools_version = "27.0.2",
    path = "/Users/xxx/Library/Android/sdk",
)

# Android NDK r12b is recommended (higher may cause issues with Bazel)
android_ndk_repository(
    name = "androidndk",
    path = "/Users/xxx/Library/Android/sdk/android-ndk-r16b",
    api_level = 21,
)
mmkkyy commented 6 years ago

Hi all, I've succeeded in exporting a .tflite model, but I can't run it because interpreter.allocate_tensors() fails.

It only runs if I set add_postprocessing_op=false when running export_tflite_ssd_graph.py; otherwise I get:

tensorflow/contrib/lite/kernels/detection_postprocess.cc:146 NumOutputs(node) != 4 (1 != 4)Node number 115 (CUSTOM) failed to prepare.

Then I checked whether TFLite_Detection_PostProcess is defined in ~/tensorflow/tensorflow/contrib/lite/kernels/register.cc, and it shows:

  AddCustom("Mfcc", tflite::ops::custom::Register_MFCC());
  AddCustom("AudioSpectrogram",
            tflite::ops::custom::Register_AUDIO_SPECTROGRAM());
  AddCustom("TFLite_Detection_PostProcess",
            tflite::ops::custom::Register_DETECTION_POSTPROCESS());

Any suggestion?

achowdhery commented 6 years ago

@mmkkyy Regarding the interpreter.allocate_tensors() failure, please check that the input values are a normalized image tensor. I don't understand what the error is in the second case: we expect 4 output nodes in the .tflite file (line 146: TF_LITE_ENSURE_EQ(context, NumOutputs(node), 4)). If the export script was run with add_postprocessing_op=false, why does your code reach detection_postprocess.cc? The Java app in the latest version works with the postprocessing op.

mmkkyy commented 6 years ago

@achowdhery Sorry for not being clear. I am just trying to run inference in Python and C++ code with the .tflite model. It works with add_postprocessing_op=false but fails with add_postprocessing_op=true. So I suspected that TFLite_Detection_PostProcess was not defined, but it seems it is (to me).

mmkkyy commented 6 years ago

Solved it; it was just my mistake. I had set --output_arrays='TFLite_Detection_PostProcess' instead of --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' :(
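
For reference, those four outputs are what the detection demos consume; with all four listed in --output_arrays they can be read back in Python. A minimal sketch, assuming a float detect.tflite and the conventional output order (boxes, classes, scores, count), matching the order of output_arrays at conversion time:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=inp["dtype"]))
interpreter.invoke()

# Assumed output order, following output_arrays:
#   :0 boxes   [1, N, 4] as (ymin, xmin, ymax, xmax), normalized coordinates
#   :1 classes [1, N]
#   :2 scores  [1, N]
#   :3 count   [1]
boxes, classes, scores, count = (
    interpreter.get_tensor(o["index"]) for o in interpreter.get_output_details()
)
for i in range(int(count[0])):
    if scores[0, i] > 0.5:
        print(int(classes[0, i]), float(scores[0, i]), boxes[0, i])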

zhyj3038 commented 6 years ago

Same problem; I don't know why, but I suppose I made a mistake when building TensorFlow 1.10 from source. The conversion command and the *.pb are attached.

toco \
--graph_def_file=./data_voc/log/tflite_graph.pb \
--input_arrays=normalized_input_image_tensor \
--output_file=./data_voc/log/optimized_graph.lite \
--input_shapes=1,300,300,3 \
--output_format=TFLITE \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_dev_values=128 \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--allow_custom_ops

ssdlite_v2.zip

achowdhery commented 6 years ago

@zhyj3038 Did the TOCO command work for you?

zhyj3038 commented 6 years ago

@achowdhery TOCO works for me in the Android image classification demo, but not in the SSDLite object detection demo.

achowdhery commented 6 years ago

@zhyj3038 Did you get a TFLite file when you ran the command above on the frozen graph?

zhyj3038 commented 6 years ago

@aifollower I have just got the .tflite. The solution is that we cannot just use the "toco" command; we have to use "bazel run --config=opt tensorflow/contrib/lite/toco:toco" to produce the .tflite. Thanks very much!

achowdhery commented 6 years ago

@zhyj3038 Thanks for the update. Please let us know if this is still an open issue.

kate-kate commented 6 years ago

@achowdhery Hi again! I decided to try following this article step by step, using the latest version of TensorFlow, rebuilt with all prerequisites (especially the Android NDK/SDK), and now even the things that worked well before don't work, and I just don't know why.

For example, I had successfully used the object_detection_tutorial.ipynb script before. Today I decided to check my model (retrained with TPU and quantization) with it, and it returned NotFoundError: Op type not registered 'TFLite_Detection_PostProcess' (screenshot attached). I ran python object_detection/builders/model_builder_test.py to check what is wrong, and it passed. Also, I can't build the Android tflite demo app, because it says that BUILD is not a directory. Any help would be appreciated.

kate-kate commented 6 years ago

Some additional info: the script still works with the non-quantized model's frozen graph, but fails with the quantized one. I also pulled the recent changes for the tensorflow/models project; the error hasn't gone away. Both frozen graphs were obtained with this command:

python object_detection/export_tflite_ssd_graph.py \
--pipeline_config_path=$CONFIG_FILE \
--trained_checkpoint_prefix=$CHECKPOINT_PATH \
--output_directory=$OUTPUT_DIR \
--add_postprocessing_op=true

The only difference is that the working graph is for the ssd_mobilenet_v1_coco model and the failing one is for the ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03 model.

achowdhery commented 6 years ago

@kate-kate Please check your TensorFlow version and also download the latest checkpoints. The checkpoint you are using, ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03, is older; there is a new checkpoint, ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_18, in the detection model zoo.

kate-kate commented 6 years ago

@achowdhery I updated everything; same problem. If you just use the ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_18 tflite_graph.pb with object_detection_tutorial.ipynb, you should see the same error.
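
The NotFoundError is expected here: tflite_graph.pb contains a TFLite_Detection_PostProcess node, which is a TFLite custom op and is not registered as a TensorFlow op, so tf.import_graph_def (which the notebook uses) cannot load it. A quick TF 1.x check for the node, assuming the graph path; the notebook itself needs a graph exported without the op (e.g. via export_inference_graph.py):

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile("tflite_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Any hits here mean plain TensorFlow cannot import this graph.
custom_nodes = [n.name for n in graph_def.node
                if n.op == "TFLite_Detection_PostProcess"]
print("custom post-process nodes:", custom_nodes)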

zishanahmed08 commented 6 years ago

following

CreateChance commented 6 years ago

same problem, following.

bhatia-manish commented 5 years ago

@jdduke After downgrading my NDK to r16b (there was a warning)

[duplicate build error log omitted; identical to kate-kate's comment above]

How did you solve this? I am getting the same error.

achowdhery commented 5 years ago

@bhatia-manish Can you please provide the exact TensorFlow versions used for conversion and deployment, including the Java NDK/SDK and Android version values? The SquaredDifference op should not be in the graph.

CharlesCCC commented 5 years ago

Ran into the same problem with the ssd_mobilenet_v1_coco_2018_01_28 model, but there doesn't seem to be a problem with the model provided in this tutorial:

https://coral.withgoogle.com/tutorials/edgetpu-retrain-detection/

So it seems like something is wrong with ssd_mobilenet_v1_coco_2018_01_28?

inakaaay commented 5 years ago

@kate-kate I'm having the same issue as you; did you solve it already?

kate-kate commented 5 years ago

@inakaaay I'm just now training a model on the latest ssd_mobilenet_v1_0.75_depth_quantized_300x300; that seems to me to be the only solution.

inakaaay commented 5 years ago

I'm getting an error when converting it to tflite:

tensorflow.lite.python.convert.ConverterError: TOCO failed. See console for info.
2019-03-26 20:43:11.249097: I tensorflow/lite/toco/import_tensorflow.cc:1324] Converting unsupported operation: TFLite_Detection_PostProcess
2019-03-26 20:43:11.249781: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "CPU"') for unknown op: WrapDatasetVariant
2019-03-26 20:43:11.250150: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: WrapDatasetVariant
2019-03-26 20:43:11.250667: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "CPU"') for unknown op: UnwrapDatasetVariant
2019-03-26 20:43:11.251029: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: UnwrapDatasetVariant
2019-03-26 20:43:11.305885: F tensorflow/lite/toco/tooling_util.cc:905] Check failed: GetOpWithOutput(model, output_array) Specified output array "'TFLite_Detection_PostProcess'" is not produced by any op in this graph. Is it a typo? To silence this message, pass this flag:  allow_nonexistent_arrays

How did you figure it out, @kate-kate?

kate-kate commented 5 years ago

@inakaaay I haven't faced this exact error, but my guess is that TensorFlow doesn't see the detection_postprocess.cc operation from /lite.

apivovarov commented 5 years ago

Solved it; it was just my mistake. I had set --output_arrays='TFLite_Detection_PostProcess' instead of --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' :(

@mmkkyy How did you find these 3 additional output names?

I tried to run summarize_graph and it shows only one output

ubuntu:~/tensorflow$ bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph="/home/ubuntu/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18/tflite_graph.pb"
Found 1 possible inputs: (name=normalized_input_image_tensor, type=float(1), shape=[1,300,300,3]) 
No variables spotted.
Found 1 possible outputs: (name=TFLite_Detection_PostProcess, op=TFLite_Detection_PostProcess) 
Found 6853145 (6.85M) const parameters, 0 (0) variable parameters, and 0 control_edges
Op types used: 451 Const, 389 Identity, 105 Mul, 94 FakeQuantWithMinMaxVars, 70 Add, 35 Sub, 35 Relu6, 35 Rsqrt, 34 Conv2D, 25 Reshape, 13 DepthwiseConv2dNative, 12 BiasAdd, 2 ConcatV2, 1 TFLite_Detection_PostProcess, 1 Squeeze, 1 Sigmoid, 1 RealDiv, 1 Placeholder
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/home/ubuntu/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18/tflite_graph.pb --show_flops --input_layer=normalized_input_image_tensor --input_layer_type=float --input_layer_shape=1,300,300,3 --output_layer=TFLite_Detection_PostProcess

Another question: why does the first output name not have :0, like TFLite_Detection_PostProcess:0?

apivovarov commented 5 years ago

I just found one issue with the --target_ops parameter (added in version 1.13.1). If it is set to TFLITE_BUILTINS,SELECT_TF_OPS (i.e. if it includes SELECT_TF_OPS), the resulting tflite model will have incorrect output tensor shapes: instead of (1,10,4) you get (1,0,4).

It is better to skip this parameter or set it to the default, TFLITE_BUILTINS.

If you try to run the bad tflite file, the error will be:

{'name': 'normalized_input_image_tensor', 'quantization': (0.007874015718698502, 128), 'index': 260, 'shape': array([  1, 300, 300,   3], dtype=int32), 'dtype': <class 'numpy.uint8'>}
{'name': 'TFLite_Detection_PostProcess', 'quantization': (0.0, 0), 'index': 252, 'shape': array([1, 0, 4], dtype=int32), 'dtype': <class 'numpy.float32'>}
{'name': 'TFLite_Detection_PostProcess:1', 'quantization': (0.0, 0), 'index': 253, 'shape': array([1, 0], dtype=int32), 'dtype': <class 'numpy.float32'>}
{'name': 'TFLite_Detection_PostProcess:2', 'quantization': (0.0, 0), 'index': 254, 'shape': array([1, 0], dtype=int32), 'dtype': <class 'numpy.float32'>}
{'name': 'TFLite_Detection_PostProcess:3', 'quantization': (0.0, 0), 'index': 255, 'shape': array([1], dtype=int32), 'dtype': <class 'numpy.float32'>}
Traceback (most recent call last):
  File "./run-tflite.py", line 82, in <module>
    boxes = ip.get_tensor(out_id0)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/lite/python/interpreter.py", line 216, in get_tensor
    return self._interpreter.GetTensor(tensor_index)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 139, in GetTensor
    return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_GetTensor(self, i)
ValueError: Invalid tensor size.
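
A cheap guard against shipping such a model is to check the output shapes right after conversion. A sketch, assuming the converted file is at detect.tflite; any zero dimension (the (1, 0, 4) symptom above) trips the assert:

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
for out in interpreter.get_output_details():
    # A zero in any dimension means the converter produced a broken output.
    assert all(int(d) > 0 for d in out["shape"]), (out["name"], out["shape"])
    print(out["name"], out["shape"])
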
apivovarov commented 5 years ago

Here is an example of a working command (tested on 1.11.0, 1.12.2 and 1.13.1):

tflite_convert \
--graph_def_file=/home/ubuntu/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18/tflite_graph.pb \
--output_file=/tmp/foo.tflite \
--output_format=TFLITE \
--input_arrays=normalized_input_image_tensor \
--input_shapes=1,300,300,3 \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_dev_values=128 \
--output_arrays="TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3" \
--allow_custom_ops
kira-yarmish commented 5 years ago

@mmkkyy How did you find these 3 additional output names? I tried to run summarize_graph and it shows only one output.

[duplicate quote and summarize_graph output omitted; see apivovarov's comment above]

Did you try Netron to get all the output names? https://github.com/lutzroeder/netron

For training and converting to tflite I've used the MakeML tool; it works entirely through a GUI, without code. There is an example here: https://github.com/makeml-app/MakeML-Nails

metalwhale commented 5 years ago

@mmkkyy How did you find these 3 additional output names?

I tried to run summarize_graph and it shows only one output

Please refer to this for more information.
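
Background on the naming: in TensorFlow, an op with k outputs exposes tensors op_name:0 through op_name:k-1, and a bare op_name is shorthand for op_name:0; summarize_graph lists ops rather than individual output tensors, which is why only one output appears. A small illustration with a stock two-output op (tf.nn.top_k; graph mode works on both TF 1.x and 2.x):

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    x = tf.constant([[1.0, 5.0], [3.0, 2.0]])
    top = tf.nn.top_k(x, k=2, name="topk")  # a single op with two outputs

print(top.values.name)   # topk:0 -- also reachable as plain "topk"
print(top.indices.name)  # topk:1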

zychen2016 commented 4 years ago

The TFLite GPU delegate could not use the TFLite_Detection_PostProcess op!!

jdduke commented 4 years ago

The TFLite GPU delegate could not use the TFLite_Detection_PostProcess op!!

This kind of unsupported GPU op is fine, as it's a processing op at the end of the graph. The bulk of the model backbone should run fine on the GPU.
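
In Python the same partitioning can be observed by attaching the delegate; unsupported ops simply stay on the CPU. A sketch, assuming TF >= 1.14 and a locally built GPU delegate library (the .so name and path are platform-specific and an assumption here):

import tensorflow as tf

# Ops the delegate rejects (e.g. the custom post-process op at the end of
# the graph) fall back to the CPU; the convolutional backbone runs on GPU.
gpu_delegate = tf.lite.experimental.load_delegate("libtensorflowlite_gpu_delegate.so")
interpreter = tf.lite.Interpreter(
    model_path="detect.tflite",
    experimental_delegates=[gpu_delegate],
)
interpreter.allocate_tensors()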

google-ml-butler[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 4 years ago

Closing as stale. Please reopen if you'd like to work on this further.