tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone
https://tensorflow.org
Apache License 2.0

Failed to get registration from op code CUSTOM #57710

Closed ChicchiPhD closed 2 years ago

ChicchiPhD commented 2 years ago

Hi developers, hi all!

I'm running some experiments with LSTM networks using Keras/TensorFlow in a Colab environment, with TF 2.11.0-dev20220915 installed via pip.

On the model authoring side, I'm able to successfully build the model, apply compression techniques, and obtain the equivalent Lite model and the .cc file. Specifically, starting from the base model I apply pruning first, and then full integer quantization. I do not get any errors during the conversion to the Lite model.
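For reference, the conversion step looks roughly like the sketch below (assuming a representative_dataset generator over my training data; the exact flags in my notebook may differ slightly):

    import tensorflow as tf

    # Minimal sketch of the full-integer quantization conversion applied to the
    # pruned Keras model (pruned_model and representative_dataset are defined
    # elsewhere in the notebook).
    converter = tf.lite.TFLiteConverter.from_keras_model(pruned_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    tflite_model = converter.convert()

    with open('model.tflite', 'wb') as f:
        f.write(tflite_model)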

However, when it's time to flash the model onto the device (in this case, an ESP32), I always get the following error:

Failed to get registration from op code CUSTOM
AllocateTensors() failed

Is this error generated because something in the model is not yet supported? Is there a way to discover which op, or ops, are not supported? I would really like to understand the reason for this error and how I could solve it.

Attached you can find a zip folder with two .tflite files (one for the model not yet quantized, the other for the quantized one) and a third file, the .cc file of the quantized model: material_pruning-fullyintquant.zip

mohantym commented 2 years ago

Hi @ChicchiPhD! You can use the Netron app to see any operators that are not in either the select-ops or built-in ops list. I found an SO thread with a similar stack trace that suggests porting lite ops to micro ops (perhaps you are using some Flex ops).
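If it helps, you can also list the operators in the converted .tflite programmatically; a rough sketch (assuming your TF version ships the experimental analyzer, roughly 2.7+):

    import tensorflow as tf

    # Prints the operator signatures in the model, so any CUSTOM/Flex ops stand out.
    tf.lite.experimental.Analyzer.analyze(model_path='model.tflite')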

You can also check the fusion codelab documentation on converting LSTM/RNN models, which sometimes does not require the select-ops syntax.

Could you also attach the C++ inference code used on the ESP32?

Thank you!

ChicchiPhD commented 2 years ago

Hi @mohantym,

thank you so much for your reply and for pointing out all the links. I've realized that one source of the problem was that the model did not have a fixed input size. I was able to repeat all the tests without needing to use tf.lite.OpsSet.SELECT_TF_OPS during conversion on the model authoring side.
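Concretely, the change amounts to giving the model a fully fixed input shape before conversion; a rough sketch of what I mean (layer sizes are illustrative, not my actual architecture):

    import tensorflow as tf

    # Sketch only: an LSTM model with fixed batch size and time steps, which in my
    # case removed the need for tf.lite.OpsSet.SELECT_TF_OPS during conversion.
    inputs = tf.keras.Input(shape=(10, 1), batch_size=1)  # fixed, not None
    x = tf.keras.layers.LSTM(32)(inputs)
    outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
    model = tf.keras.Model(inputs, outputs)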

Everything now works fine: I'm able to apply pruning first, then full integer quantization, and obtain the .tflite and .cc files. But when I flash the model onto the device, I get a new error:

VAR_HANDLE requires resource variables. Please create ResourceVariables and pass it to the interpreter. Node VAR_HANDLE (number 1f) failed to prepare with status 1

Attached you can find a zip folder containing the two .tflite models (one of the pruned but not yet quantized model, the other of the pruned and quantized model), and the two .cc files (one for the pruned but not yet quantized model, the other for the pruned and quantized model). In both cases, I get the same error.

I've found this ticket related to the same problem: https://github.com/tensorflow/tensorflow/issues/56932. However, the proposed solution did not work for me. Not only did it not work, but the size of both .cc files increased a lot (to around 7 MB), so I was not even able to run them on the device. 1_pruning_fullyquant.zip

mohantym commented 2 years ago

Ok @ChicchiPhD! Thanks for the update. Could you convert the model to TFLite again with the flag below and let us know the inference result?

converter.experimental_enable_resource_variables = True # Ref
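For context, the flag would sit in your conversion code roughly like this (a sketch, assuming model is your pruned Keras model):

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.experimental_enable_resource_variables = True  # enable resource variables
    tflite_model = converter.convert()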

Thank you!

ChicchiPhD commented 2 years ago

Hi @mohantym,

if I try to convert the model to TFLite with that flag, I still get the same error:

VAR_HANDLE requires resource variables. Please create ResourceVariables and pass it to the interpreter. Node VAR_HANDLE (number 1f) failed to prepare with status 1

However, I've made several attempts and noticed the following behaviors:

  1. whether I use the flag or not, if I apply pruning followed by TFLite conversion, or pruning followed by full integer quantization and then TFLite conversion, I always get the VAR_HANDLE error on the device, in both cases;
  2. whether I use the flag or not, if I do not apply pruning but only quantization, or neither of the two techniques, and then apply TFLite conversion, I do not get any errors on the device.

So, it seems to be related to the fact that I apply the pruning technique.
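For completeness, my pruning path follows the usual tfmot flow, with the pruning wrappers stripped before conversion; roughly like the sketch below (assuming tfmot is installed and model / train_data refer to my base model and training data; the exact parameters in my notebook differ):

    import tensorflow_model_optimization as tfmot

    # Wrap the base model for pruning, fine-tune, then strip the wrappers before
    # handing the model to the TFLite converter.
    pruned = tfmot.sparsity.keras.prune_low_magnitude(model)
    pruned.compile(optimizer='adam', loss='binary_crossentropy')
    pruned.fit(train_data, epochs=2,
               callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
    model_for_export = tfmot.sparsity.keras.strip_pruning(pruned)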

Thanks for your support!

mohantym commented 2 years ago

Ok @ChicchiPhD! Thanks for the update. Can we consider this resolved then, as it seems you were able to work around the issue by skipping the pruning technique or using other methods? Thank you!

ChicchiPhD commented 2 years ago

Hi @mohantym,

I would say that skipping pruning is more a workaround than a solution, but for the moment it's fine for me. It would be interesting to investigate why the error happens when pruning is applied, but we can consider this solved.

google-ml-butler[bot] commented 2 years ago

Are you satisfied with the resolution of your issue?

mohantym commented 2 years ago

@ChicchiPhD! Please share your pruning code snippets so we can check for any issue. Thank you.

fathiabdelmalek commented 1 year ago

I have the same problem. I have a Keras model converted to tflite and then to a cc array, and the conversion to the tflite model is simple, without any additional flags. I'm getting the same error without finding a solution:

Failed to get registration from op code CUSTOM

I have tried both tflite::AllOpsResolver and tflite::MicroMutableOpResolver. Is there any solution to this problem? I really need one to complete my thesis project.

hkayann commented 7 months ago

@fathiabdelmalek is that an LSTM model? Can you share the output from netron app?

AdityaB-01 commented 3 months ago

I have the same problem. I have added the code I am working on and the SVG from the Netron app. There are two models that I need to run on my ESP32-C3 (https://drive.google.com/file/d/15as7adLykxOzsaAxV6l9UTe_EHS_WqXw/view?usp=sharing): state_model.tflite and temp_model.tflite.

error:

ESP-ROM:esp32c3-api1-20210207
Build:Feb 7 2021
rst:0x1 (POWERON),boot:0xc (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fcd5820,len:0x458
load:0x403cc710,len:0x814
load:0x403ce710,len:0x2880
entry 0x403cc710
Failed to get registration from op code CUSTOM
AllocateTensors() failed

models:

  1. def create_state_model(input_dim):
         model = Sequential()
         model.add(Reshape((input_dim, 1), input_shape=(input_dim,)))
         model.add(LSTM(64, return_sequences=False))
         model.add(Dropout(0.3))
         model.add(Dense(32, activation='relu'))
         model.add(Dropout(0.3))
         model.add(Dense(1, activation='sigmoid'))
         model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
         return model

  2. def create_temp_model(input_dim, output_dim):
         model = Sequential()
         model.add(Reshape((input_dim, 1), input_shape=(input_dim,)))
         model.add(LSTM(64, return_sequences=False))
         model.add(Dropout(0.3))
         model.add(Dense(32, activation='relu'))
         model.add(Dropout(0.3))
         model.add(Dense(output_dim, activation='softmax'))
         model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
         return model

This is the code that I have used for conversion:

converter = tf.lite.TFLiteConverter.from_keras_model(state_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS     # enable TensorFlow ops.
]

state_model_tflite = converter.convert()
with open('state_model.tflite', 'wb') as f:
    f.write(state_model_tflite)

converter = tf.lite.TFLiteConverter.from_keras_model(temp_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS     # enable TensorFlow ops.
]

temp_model_tflite = converter.convert()
with open('temp_model.tflite', 'wb') as f:
    f.write(temp_model_tflite)