nigelzzz opened this issue 3 weeks ago (status: Open)
Hi @nigelzzz, I'm getting this issue when I run your command:
ERROR: Didn't find op for builtin opcode 'STABLEHLO_COMPOSITE' version '1'. An older version of this builtin might be supported. Are you using an old TFLite binary with a newer model?
ERROR: Registration failed.
Error at ai_edge_torch/generative/examples/c++/text_generator_main.cc:93
Can you either share your .tflite file or provide reproduction steps for your model conversion, i.e. the branch/version you were on when you did the conversion and whether you were using CUDA when converting? Generally, the more information I have, the faster we can help you. Is there any special reason your model name is ttiny_llama_seq512_kv1024, or is that just a typo? Thanks.
Hi @pkgoogle,
ttiny_llama_seq512_kv1024 is a typo; I only use the name to tell the quantized and non-quantized models apart.
By the way, I think the root cause is that the TensorFlow version pinned in WORKSPACE is not correct.
/user/: CC=/usr/bin/clang-18 bazel run -c opt //ai_edge_torch/generative/examples/c++:text_generator_main -- --tflite_model=/mnt/data/nigel_wang/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/ttiny_llama_seq512_kv1024.tflite --sentencepiece_model=/mnt/data/nigel_wang/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/TinyLlama-1.1B-Chat-v1.0/tokenizer.model --prompt="<|user|> \n Write and email:\n <|assistant|>" --start_token="<s>" --stop_token="</s>" --num_threads=1
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
DEBUG: /mnt/data/nigel_wang/tensorflow_cache/153a550227f3ff2fa4e4811633058a05/external/org_tensorflow/third_party/repo.bzl:132:14:
Warning: skipping import of repository 'com_google_absl' because it already exists.
DEBUG: /mnt/data/nigel_wang/tensorflow_cache/153a550227f3ff2fa4e4811633058a05/external/org_tensorflow/third_party/repo.bzl:132:14:
Warning: skipping import of repository 'XNNPACK' because it already exists.
INFO: Analyzed target //ai_edge_torch/generative/examples/c++:text_generator_main (147 packages loaded, 3826 targets configured).
INFO: From Compiling src/google/protobuf/generated_message_tctable_lite.cc [for tool]:
external/protobuf~/src/google/protobuf/generated_message_tctable_lite.cc:347:14: warning: unused function 'Offset' [-Wunused-function]
347 | inline void* Offset(void* base, uint32_t offset) {
| ^~~~~~
1 warning generated.
INFO: From Compiling src/google/protobuf/compiler/cpp/helpers.cc [for tool]:
external/protobuf~/src/google/protobuf/compiler/cpp/helpers.cc:197:25: warning: unused function 'VerifyInt32TypeToVerifyCustom' [-Wunused-function]
197 | inline VerifySimpleType VerifyInt32TypeToVerifyCustom(VerifyInt32Type t) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
INFO: From Executing genrule @@org_tensorflow//tensorflow/lite/acceleration/configuration:configuration_schema:
When you use --proto, that you should check for conformity yourself, using the existing --conform
INFO: Found 1 target...
Target //ai_edge_torch/generative/examples/c++:text_generator_main up-to-date:
bazel-bin/ai_edge_torch/generative/examples/c++/text_generator_main
INFO: Elapsed time: 276.290s, Critical Path: 109.56s
INFO: 1493 processes: 601 internal, 892 linux-sandbox.
INFO: Build completed successfully, 1493 total actions
INFO: Running command line: bazel-bin/ai_edge_torch/generative/examples/c++/text_generator_main '--tflite_model=/mnt/data/nigel_wang/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/ttiny_llama_seq512_kv1024.tflite' '--sentencepiece_model=/mnt/data/nigel_wang/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/TinyLlama-1.1B-Chat-v1.0/tokenizer.model' '--prompt=<|user|> \n Write and email:\n <|assistant|>' '--start_token=<s>' '--stop_token=</s>' '--num_threads=1'
ERROR: Didn't find op for builtin opcode 'STABLEHLO_COMPOSITE' version '1'. An older version of this builtin might be supported. Are you using an old TFLite binary with a newer model?
ERROR: Registration failed.
Error at ai_edge_torch/generative/examples/c++/text_generator_main.cc:93
- Regarding the error above: I see that a newer TensorFlow version adds `stablehlo_composite`:
https://github.com/tensorflow/tensorflow/commit/f4f2393888af78879dc9b299786023fe87fbbcfc
- The version pinned in WORKSPACE does not add it:
- _TENSORFLOW_GIT_COMMIT = "26d4ea90364daa14bbb2bc5c2aa68f5b70c4641f"
- https://github.com/tensorflow/tensorflow/blob/26d4ea90364daa14bbb2bc5c2aa68f5b70c4641f/tensorflow/lite/core/kernels/register.cc#L385
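The mismatch described above can be checked mechanically. The following is a minimal Python sketch, not part of ai-edge-torch or TensorFlow: it reads the pinned commit out of WORKSPACE text and tests whether a `register.cc` source mentions a given builtin operator. The helper names `pinned_tf_commit` and `mentions_builtin` are hypothetical, the sample strings are illustrative stand-ins rather than real file contents, and it assumes builtin registrations appear in `register.cc` as `BuiltinOperator_<NAME>`.

```python
import re
from typing import Optional


def pinned_tf_commit(workspace_text: str) -> Optional[str]:
    """Return the hash assigned to _TENSORFLOW_GIT_COMMIT, or None if absent."""
    match = re.search(
        r'^_TENSORFLOW_GIT_COMMIT\s*=\s*"([0-9a-f]+)"',
        workspace_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def mentions_builtin(register_cc_text: str, op_name: str) -> bool:
    """Check whether the source text references BuiltinOperator_<op_name>.

    Assumption: registrations are spelled BuiltinOperator_<NAME>.
    """
    token = re.escape(f"BuiltinOperator_{op_name}")
    return re.search(rf"\b{token}\b", register_cc_text) is not None


# Illustrative stand-ins, not copied from the real files.
sample_workspace = '_TENSORFLOW_GIT_COMMIT = "26d4ea90364daa14bbb2bc5c2aa68f5b70c4641f"\n'
old_register_cc = "AddBuiltin(BuiltinOperator_ABS, Register_ABS());\n"
new_register_cc = old_register_cc + (
    "AddBuiltin(BuiltinOperator_STABLEHLO_COMPOSITE, "
    "Register_STABLEHLO_COMPOSITE());\n"
)

print("pinned commit:", pinned_tf_commit(sample_workspace))
print("old registers op:", mentions_builtin(old_register_cc, "STABLEHLO_COMPOSITE"))
print("new registers op:", mentions_builtin(new_register_cc, "STABLEHLO_COMPOSITE"))
```

In practice the two inputs would be the repository's WORKSPACE file and the `register.cc` at the pinned commit, read from disk; a text match like this only hints at whether the op is registered and does not replace building against that commit.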
I also have another question in this ticket: https://github.com/google-ai-edge/ai-edge-torch/issues/109
Alternatively, I can help by opening a PR to fix it.
Hi @nigelzzz, thanks for the info -- I'm having trouble building against that particular commit/version of TF -- did you modify your BUILD file? https://github.com/google-ai-edge/ai-edge-torch/blob/release/0.2.0/ai_edge_torch/generative/examples/c%2B%2B/BUILD
You can open a PR if you feel it is actually fixing the root cause. And this might just be a new issue as HEAD/nightly should work as well :).
Marking this issue as stale since it has been open for 7 days with no activity. This issue will be closed if no further activity occurs.
Hi @pkgoogle, I only applied some patches to my local TensorFlow library. I think we just need to update the TensorFlow version by changing the version in the WORKSPACE file (https://github.com/google-ai-edge/ai-edge-torch/blob/main/WORKSPACE).
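As a rough illustration of the WORKSPACE change suggested here, the Python sketch below rewrites the `_TENSORFLOW_GIT_COMMIT` assignment in WORKSPACE text. The function name `repin_tf_commit` is hypothetical, the sample text is a stand-in for the real file, and the new hash is only a sample value taken from the commit linked earlier in this thread, not a tested recommendation. A real WORKSPACE may also pin a checksum for the archive, which this sketch does not handle.

```python
import re


def repin_tf_commit(workspace_text: str, new_commit: str) -> str:
    """Return workspace_text with _TENSORFLOW_GIT_COMMIT set to new_commit."""
    if not re.fullmatch(r"[0-9a-f]{40}", new_commit):
        raise ValueError("expected a 40-character lowercase hex commit hash")
    pattern = r'^(_TENSORFLOW_GIT_COMMIT\s*=\s*")[0-9a-f]+(")'
    updated, count = re.subn(
        pattern,
        lambda m: m.group(1) + new_commit + m.group(2),
        workspace_text,
        flags=re.MULTILINE,
    )
    if count != 1:
        # Refuse to guess when the pin is missing or appears more than once.
        raise ValueError(f"expected exactly one pin, found {count}")
    return updated


# Stand-in for the WORKSPACE contents; only the pin line matters here.
sample_workspace = (
    "# TensorFlow pin\n"
    '_TENSORFLOW_GIT_COMMIT = "26d4ea90364daa14bbb2bc5c2aa68f5b70c4641f"\n'
)
# Sample value only: the hash of the commit linked earlier in the thread.
sample_new_commit = "f4f2393888af78879dc9b299786023fe87fbbcfc"

print(repin_tf_commit(sample_workspace, sample_new_commit))
```

Raising when the pin is not found exactly once keeps the script from silently producing an unchanged or doubly edited file.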
Hi @nigelzzz, for better reproducibility, can you produce a diff between your local files and the GitHub repo (maybe pull the latest changes first)?
Something like this:
# navigate to tf root
git diff origin/master > diff.txt
Then share/upload that diff.txt file... that will help me a lot, thanks.
Marking this issue as stale since it has been open for 7 days with no activity. This issue will be closed if no further activity occurs.
Description of the bug:
ai-edge-torch version: 2.0
command
output
reference
I found the following:
If I set quantize to false, inference runs correctly.
quantize: bool = True: decoding succeeds.
quantize: bool = False: decoding fails, e.g. in the log above the output is all ??
def convert_tiny_llama_to_tflite(
    checkpoint_path: str,
    prefill_seq_len: int = 512,
    kv_cache_max_len: int = 1024,
    quantize: bool = True,
):
Actual vs expected behavior:
No response
Any other information you'd like to share?
No response