nigelzzzzzzz opened 1 month ago
Hi @nigelzzzzzzz, what version of the code did you use to produce the tflite_model, and what version did you use when running the actual command?
Hi @pkgoogle, I am using the main branch.
My command is:
bazel run -c opt //ai_edge_torch/generative/examples/cpp:text_generator_main -- --tflite_model=PATH/gemma_it.tflite --sentencepiece_model=PATH/tokenizer.model --start_token="<bos>" --stop_token="<eos>" --num_threads=16 --prompt="Write an email:" --weight_cache_path=PATH/gemma.xnnpack_cache
@nigelzzzzzzz can you please help me convert the TinyLlama model to TFLite? I have tried several nightly builds but was not able to convert it. Can you tell me which nightly build you used? And in the convert_to_tflite.py file, only the file name needs to change, right?
bazel run -c opt //ai_edge_torch/generative/examples/cpp:text_generator_main -- --tflite_model=/home/nigel/opensource/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/tiny_llama_q8_seq512_ekv1024.tflite --sentencepiece_model=/home/nigel/opensource/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/TinyLlama_v1.1/tokenizer.model --start_token="<bos>" --stop_token="<eos>" --num_threads=16 --prompt="Write an email:"
@pkgoogle, shouldn't we change the start_token and stop_token as below for tiny_llama?
--start_token="<s>" --stop_token="</s>"
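Not part of the example itself, but one way to double-check which start/stop token strings a given tokenizer.model actually defines is to query it with the sentencepiece C++ API. This is only a sketch; the include path, link setup, and tokenizer path are placeholders for whatever your build uses:

```cpp
#include <iostream>

#include "sentencepiece_processor.h"  // from the sentencepiece library

int main() {
  sentencepiece::SentencePieceProcessor sp;
  if (!sp.Load("tokenizer.model").ok()) {
    std::cerr << "failed to load tokenizer.model\n";
    return 1;
  }
  // PieceToId returns the unknown-token id for pieces that are not in the
  // vocabulary, so comparing against unk_id() shows whether a candidate
  // start/stop token string is valid for this model.
  for (const char* piece : {"<bos>", "<eos>", "<s>", "</s>"}) {
    int id = sp.PieceToId(piece);
    std::cout << piece << " -> id " << id
              << (id == sp.unk_id() ? " (unknown)" : "") << "\n";
  }
  return 0;
}
```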
Hi @akshatshah17, I used the main branch; you can build and install it yourself:
git clone https://github.com/google-ai-edge/ai-edge-torch
cd ai-edge-torch
# Install necessary dependencies
pip install setuptools wheel
python setup.py sdist bdist_wheel
- then you can see `./dist/ai_edge_torch-0.3.0-py3-none-any.whl`
- finally do `pip install ./dist/ai_edge_torch-0.3.0-py3-none-any.whl`
I was able to replicate this with the main branch and similar but slightly different steps:
bazel build -c opt //ai_edge_torch/generative/examples/cpp:text_generator_main
cd bazel-bin/ai_edge_torch/generative/examples/cpp
# copy converted model and tokenizer model here
./text_generator_main --tflite_model=tinyllama_q8_seq1024_ekv1280.tflite --sentencepiece_model=tokenizer.model --start_token="<bos>" --stop_token="<eos>" --num_threads=16 --prompt="Write an email:"
We'll take a deeper look. Thanks.
Hi @pkgoogle,
I found a solution: adding the kTfLiteCustomAllocationFlagsSkipAlignCheck flag bypasses the error:
@@ -154,6 +154,8 @@ tflite::SignatureRunner* GetSignatureRunner(
std::map<std::string, std::vector<float>>& kv_cache) {
tflite::SignatureRunner* runner =
interpreter->GetSignatureRunner(signature_name.c_str());
+ int64_t f = 0;
+ f |= kTfLiteCustomAllocationFlagsSkipAlignCheck;
for (auto& [name, cache] : kv_cache) {
TfLiteCustomAllocation allocation = {
.data = static_cast<void*>(cache.data()),
@@ -162,9 +164,9 @@ tflite::SignatureRunner* GetSignatureRunner(
// delegates support this in-place update. For those cases, we need to do
// a ping-pong buffer and update the pointers between inference calls.
TFLITE_MINIMAL_CHECK(runner->SetCustomAllocationForInputTensor(
- name.c_str(), allocation) == kTfLiteOk);
+ name.c_str(), allocation,f) == kTfLiteOk);
TFLITE_MINIMAL_CHECK(runner->SetCustomAllocationForOutputTensor(
- name.c_str(), allocation) == kTfLiteOk);
+ name.c_str(), allocation,f) == kTfLiteOk);
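For context (my own reading, so treat it as an assumption rather than something from this thread): TFLite expects custom tensor allocations to meet a minimum alignment by default, and a plain std::vector<float> buffer does not guarantee that, which would explain why the check fires on some targets and not others. A rough sketch of the two options, the skip-align-check flag from the patch above or an explicitly aligned buffer with the default check left on, could look like this; `runner`, `name`, and `cache` mirror the snippet above, and the 64-byte alignment constant is an assumption to verify against your TFLite version:

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

#include "tensorflow/lite/c/common.h"     // TfLiteCustomAllocation, allocation flags
#include "tensorflow/lite/interpreter.h"  // brings in tflite::SignatureRunner

// Option 1: keep the std::vector-backed cache and skip the alignment check,
// as in the patch above.
void SetCacheSkippingAlignCheck(tflite::SignatureRunner* runner,
                                const std::string& name,
                                std::vector<float>& cache) {
  TfLiteCustomAllocation allocation = {
      .data = static_cast<void*>(cache.data()),
      .bytes = cache.size() * sizeof(float)};
  runner->SetCustomAllocationForInputTensor(
      name.c_str(), allocation, kTfLiteCustomAllocationFlagsSkipAlignCheck);
  runner->SetCustomAllocationForOutputTensor(
      name.c_str(), allocation, kTfLiteCustomAllocationFlagsSkipAlignCheck);
}

// Option 2: leave the default check on and back the cache with an explicitly
// aligned buffer instead of std::vector<float>::data(), which is only
// guaranteed to be aligned to alignof(float). The caller owns the returned
// buffer and must release it with std::free() after inference is done.
constexpr size_t kAssumedTensorAlignment = 64;  // assumption: check your TFLite build

float* SetAlignedCache(tflite::SignatureRunner* runner,
                       const std::string& name, size_t num_floats) {
  size_t bytes = num_floats * sizeof(float);
  // std::aligned_alloc requires the allocation size to be a multiple of the alignment.
  size_t rounded = (bytes + kAssumedTensorAlignment - 1) /
                   kAssumedTensorAlignment * kAssumedTensorAlignment;
  float* data = static_cast<float*>(
      std::aligned_alloc(kAssumedTensorAlignment, rounded));
  TfLiteCustomAllocation allocation = {.data = data, .bytes = bytes};
  runner->SetCustomAllocationForInputTensor(name.c_str(), allocation);
  runner->SetCustomAllocationForOutputTensor(name.c_str(), allocation);
  return data;
}
```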
Hi @nigelzzzzzzz, that alignment check is probably there for a reason -- but if you make a PR, we can review it.
Hi @pkgoogle, thanks for your response. I already opened a pull request.
Thank you again.
I also faced this issue when running on x86, but not with android_arm64.
Description of the bug:
Hi @pkgoogle, I used the example C++ code to run inference with a model I converted, and it fails with an error.