google-ai-edge / ai-edge-torch

Supporting PyTorch models with the Google AI Edge TFLite runtime.
Apache License 2.0

data_ptr_value % kDefaultTensorAlignment == 0 was not true. #237

Open nigelzzzzzzz opened 2 months ago

nigelzzzzzzz commented 2 months ago

### Description of the bug:

Hi @pkgoogle, I used the example C++ code to run inference with a model I converted, and it fails with the alignment error in the title.
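For context, the failing check asserts that the base address of each user-supplied KV cache buffer is a multiple of TFLite's tensor alignment. A minimal sketch of the condition, assuming a 64-byte value for `kDefaultTensorAlignment` (the constant lives in the TFLite sources; verify against your build):

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Assumed value, mirroring kDefaultTensorAlignment in the TFLite sources.
constexpr std::uintptr_t kAssumedTensorAlignment = 64;

int main() {
  // The example runner backs the KV cache with plain std::vector<float>,
  // whose data pointer is only guaranteed to be float-aligned (4 bytes).
  std::vector<float> cache(1024);
  auto addr = reinterpret_cast<std::uintptr_t>(cache.data());
  // This is the condition the runtime asserts for custom allocations.
  std::cout << (addr % kAssumedTensorAlignment == 0 ? "aligned" : "misaligned")
            << std::endl;
}
```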



### Actual vs expected behavior:

_No response_

### Any other information you'd like to share?

_No response_
pkgoogle commented 2 months ago

Hi @nigelzzzzzzz, what version of the code did you use to produce the tflite_model, and what version did you use when running the command?

nigelzzzzzzz commented 2 months ago

Hi @pkgoogle, I am using the main branch.

My command looks like this:

bazel run -c opt //ai_edge_torch/generative/examples/cpp:text_generator_main -- --tflite_model=PATH/gemma_it.tflite  --sentencepiece_model=PATH/tokenizer.model --start_token="<bos>" --stop_token="<eos>" --num_threads=16 --prompt="Write an email:" --weight_cache_path=PATH/gemma.xnnpack_cache
akshatshah17 commented 2 months ago

@nigelzzzzzzz can you please help me convert the TinyLlama model to tflite? I have tried several nightly builds but have not been able to convert it. Which nightly build did you use? Also, in the convert_to_tflite.py file, only the file name needs to change, right?

Ramees025 commented 2 months ago

bazel run -c opt //ai_edge_torch/generative/examples/cpp:text_generator_main -- --tflite_model=/home/nigel/opensource/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/tiny_llama_q8_seq512_ekv1024.tflite --sentencepiece_model=/home/nigel/opensource/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/TinyLlama_v1.1/tokenizer.model --start_token="<bos>" --stop_token="<eos>" --num_threads=16 --prompt="Write an email:"

@pkgoogle, shouldn't we change the start_token and stop_token for tiny_llama as follows: --start_token="<s>" --stop_token="</s>"?
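One way to settle this is to ask the tokenizer itself. A small sketch against the SentencePiece C++ API (the model path is a placeholder): if the model defines BOS/EOS control tokens, `IdToPiece` reports the exact pieces to pass as --start_token/--stop_token.

```cpp
#include <iostream>
#include "sentencepiece_processor.h"

int main() {
  sentencepiece::SentencePieceProcessor sp;
  if (!sp.Load("tokenizer.model").ok()) return 1;  // placeholder path
  // A negative id means the model does not define that control token.
  if (sp.bos_id() >= 0)
    std::cout << "start token: " << sp.IdToPiece(sp.bos_id()) << "\n";
  if (sp.eos_id() >= 0)
    std::cout << "stop token: " << sp.IdToPiece(sp.eos_id()) << "\n";
  return 0;
}
```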

nigelzzzzzzz commented 2 months ago

Hi @akshatshah17, I used the main branch; you can build and install it yourself:

- Create the .whl file using setup.py:

python setup.py sdist bdist_wheel



- Then you will see `./dist/ai_edge_torch-0.3.0-py3-none-any.whl`.
- Finally, install it with `pip install ./dist/ai_edge_torch-0.3.0-py3-none-any.whl`.
pkgoogle commented 2 months ago

I was able to replicate this with the main branch and similar but slightly different steps:

bazel build -c opt //ai_edge_torch/generative/examples/cpp:text_generator_main
cd bazel-bin/ai_edge_torch/generative/examples/cpp
# copy converted model and tokenizer model here
./text_generator_main --tflite_model=tinyllama_q8_seq1024_ekv1280.tflite --sentencepiece_model=tokenizer.model --start_token="<bos>" --stop_token="<eos>" --num_threads=16 --prompt="Write an email:"

We'll take a deeper look. Thanks.

nigelzzzzzzz commented 2 months ago

Hi @pkgoogle, I found a solution: adding kTfLiteCustomAllocationFlagsSkipAlignCheck to the flags bypasses the error.

 @@ -154,6 +154,8 @@ tflite::SignatureRunner* GetSignatureRunner(
     std::map<std::string, std::vector<float>>& kv_cache) {
   tflite::SignatureRunner* runner =
       interpreter->GetSignatureRunner(signature_name.c_str());
+  int64_t f = 0;
+  f |= kTfLiteCustomAllocationFlagsSkipAlignCheck;
   for (auto& [name, cache] : kv_cache) {
     TfLiteCustomAllocation allocation = {
         .data = static_cast<void*>(cache.data()),
@@ -162,9 +164,9 @@ tflite::SignatureRunner* GetSignatureRunner(
     // delegates support this in-place update. For those cases, we need to do
     // a ping-pong buffer and update the pointers between inference calls.
     TFLITE_MINIMAL_CHECK(runner->SetCustomAllocationForInputTensor(
-                             name.c_str(), allocation) == kTfLiteOk);
+                             name.c_str(), allocation, f) == kTfLiteOk);
     TFLITE_MINIMAL_CHECK(runner->SetCustomAllocationForOutputTensor(
-                             name.c_str(), allocation) == kTfLiteOk);
+                             name.c_str(), allocation, f) == kTfLiteOk);
pkgoogle commented 2 months ago

Hi @nigelzzzzzzz, that alignment check is probably there for a reason -- but if you make a PR, we can review it.
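An alternative that satisfies the check instead of skipping it, sketched under the assumption of a 64-byte alignment requirement: back the KV cache vectors with over-aligned storage via a custom allocator, so the existing SetCustomAllocation calls pass unchanged. This is an illustration, not the patch under review.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Assumed to match kDefaultTensorAlignment; verify against the TFLite build.
constexpr std::size_t kAssumedAlignment = 64;

template <typename T>
struct AlignedAllocator {
  using value_type = T;
  AlignedAllocator() = default;
  template <typename U>
  AlignedAllocator(const AlignedAllocator<U>&) {}

  T* allocate(std::size_t n) {
    // std::aligned_alloc requires the size to be a multiple of the alignment.
    std::size_t bytes = (n * sizeof(T) + kAssumedAlignment - 1) /
                        kAssumedAlignment * kAssumedAlignment;
    void* p = std::aligned_alloc(kAssumedAlignment, bytes);
    if (!p) throw std::bad_alloc();
    return static_cast<T*>(p);
  }
  void deallocate(T* p, std::size_t) { std::free(p); }
};

template <typename T, typename U>
bool operator==(const AlignedAllocator<T>&, const AlignedAllocator<U>&) {
  return true;
}
template <typename T, typename U>
bool operator!=(const AlignedAllocator<T>&, const AlignedAllocator<U>&) {
  return false;
}

// Using this vector type for the kv_cache map values keeps data() aligned,
// so no kTfLiteCustomAllocationFlagsSkipAlignCheck flag is needed.
using AlignedFloatVector = std::vector<float, AlignedAllocator<float>>;
```

With the cache stored this way, cache.data() passes the modulo check; the trade-off is changing the cache type rather than the runner call sites.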

nigelzzzzzzz commented 1 month ago

Hi @pkgoogle, thanks for your response. I have already opened a pull request.

Thank you again.

Ramees025 commented 1 month ago

I also faced this issue when running on x86, but not with android_arm64.