google-ai-edge / ai-edge-torch

Supporting PyTorch models with the Google AI Edge TFLite runtime.
Apache License 2.0

Not Able to run converted gemma model in tflite because of custom op odml.update_kv_cache #79

Open akshatshah17 opened 2 days ago

akshatshah17 commented 2 days ago

Description of the bug:

I was able to convert the Gemma model from safetensors to TFLite through the converter, but I wasn't able to run the result because of this error:

```
INFO: Loaded model tiny_llama_seq512_kv1024_float32.tflite
INFO: Initialized TensorFlow Lite runtime.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR: Encountered unresolved custom op: odml.update_kv_cache. See instructions: https://www.tensorflow.org/lite/guide/ops_custom
ERROR: Node number 44 (odml.update_kv_cache) failed to prepare.
ERROR: Encountered unresolved custom op: odml.update_kv_cache. See instructions: https://www.tensorflow.org/lite/guide/ops_custom
ERROR: Node number 44 (odml.update_kv_cache) failed to prepare.
ERROR: Failed to allocate tensors!
ERROR: Benchmarking failed.
```

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

haozha111 commented 2 days ago

Quick question: are you running the model via MediaPipe or the example C++ inference? We documented how to run the pipeline here: https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative#end-to-end-inference-pipeline

The issue you encountered is due to missing registration for the custom op, which the above pipeline should automatically include. @hheydary.
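As a side note, before wiring up custom-op registration it can be useful to confirm that the converted flatbuffer really does reference the op. A minimal sketch (not part of the ai-edge-torch API; `tflite_references_custom_op` is a hypothetical helper): custom op names are stored as plain UTF-8 strings inside the .tflite flatbuffer, so a simple byte search over the file is enough for a quick check.

```python
def tflite_references_custom_op(
    model_bytes: bytes,
    op_name: str = "odml.update_kv_cache",
) -> bool:
    """Heuristic check: .tflite files are flatbuffers, and custom op
    names appear in them as plain UTF-8 strings, so a byte search
    finds any reference to the op."""
    return op_name.encode("utf-8") in model_bytes


# Usage: scan a converted model on disk.
# with open("tiny_llama_seq512_kv1024_float32.tflite", "rb") as f:
#     print(tflite_references_custom_op(f.read()))
```

If this returns True, the runtime you use must have a resolver that registers `odml.update_kv_cache`; the stock benchmark_model binary does not include it.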

akshatshah17 commented 2 days ago

I was trying to run it with TFLite's (version 2.16.1) benchmark_model tool, and that is where I got this error.