akshatshah17 opened 2 days ago
Quick question: are you running the model via MediaPipe or the example C++ inference? We documented how to run the pipeline here: https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative#end-to-end-inference-pipeline
The issue you encountered is due to missing registration for the custom op, which the pipeline above should include automatically. @hheydary.
I was trying to run it with TFLite's (version 2.16.1) benchmark_model tool, and that is where I was getting this error.
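For what it's worth, before wiring up a runtime that registers the GenAI ops, one quick way to confirm a converted model actually references odml.update_kv_cache is to scan the serialized flatbuffer for the op name, since custom op names are stored as plain strings in the .tflite file. This is a rough sketch, not part of the documented pipeline; the file path is illustrative.

```python
def references_custom_op(model_bytes: bytes,
                         op_name: str = "odml.update_kv_cache") -> bool:
    """Crude smoke test: return True if the serialized .tflite model
    mentions the given custom op name anywhere in its bytes."""
    return op_name.encode("utf-8") in model_bytes

# Illustrative usage (path taken from the log above):
# with open("tiny_llama_seq512_kv1024_float32.tflite", "rb") as f:
#     if references_custom_op(f.read()):
#         print("Model needs odml.update_kv_cache; a stock benchmark_model "
#               "build will fail unless the op is registered.")
```

If the check is positive, the stock benchmark_model binary will fail as shown in the log, because it only links the builtin op resolver; the documented end-to-end pipeline is what registers the custom op.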
Description of the bug:
I was able to convert the Gemma model from safetensors to TFLite through the converter, but I wasn't able to run it because of this error:
INFO: Loaded model tiny_llama_seq512_kv1024_float32.tflite
INFO: Initialized TensorFlow Lite runtime.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR: Encountered unresolved custom op: odml.update_kv_cache. See instructions: https://www.tensorflow.org/lite/guide/ops_custom
ERROR: Node number 44 (odml.update_kv_cache) failed to prepare.
ERROR: Encountered unresolved custom op: odml.update_kv_cache. See instructions: https://www.tensorflow.org/lite/guide/ops_custom
ERROR: Node number 44 (odml.update_kv_cache) failed to prepare.
ERROR: Failed to allocate tensors!
ERROR: Benchmarking failed.
Actual vs expected behavior:
No response
Any other information you'd like to share?
No response