microsoft / onnxruntime-inference-examples

Examples for using ONNX Runtime for machine learning inferencing.

Android app token-to-string issue #484

Open j0h0k0i0m opened 1 month ago

j0h0k0i0m commented 1 month ago

Hello. I am trying to run Phi-3.5 ONNX on Android. I'm reaching out because I'm not sure how to resolve an issue with token-to-string conversion. It occurs when I instruct the model to answer in a different language, and the logs below show that part of the output is not being produced.

2024-10-23 17:06:06.369 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 31734: 안
2024-10-23 17:06:06.369 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Time taken to generate token: 1.32 seconds
2024-10-23 17:06:06.433 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 238: 
2024-10-23 17:06:06.433 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Time taken to generate token: 0.064 seconds
2024-10-23 17:06:06.495 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 136: 
2024-10-23 17:06:06.495 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Time taken to generate token: 0.062 seconds
2024-10-23 17:06:06.556 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 152: 
2024-10-23 17:06:06.556 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Time taken to generate token: 0.061 seconds
2024-10-23 17:06:06.618 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 30944: 하
2024-10-23 17:06:06.618 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Time taken to generate token: 0.062 seconds
2024-10-23 17:06:06.679 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 31578: 세
2024-10-23 17:06:06.679 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Time taken to generate token: 0.061 seconds
2024-10-23 17:06:06.682 29366-29366 time.genai.demo         ai.onnxruntime.genai.demo            I  Waiting for a blocking GC NativeAlloc
2024-10-23 17:06:06.692 29366-29366 time.genai.demo         ai.onnxruntime.genai.demo            I  WaitForGcToComplete blocked NativeAlloc on NativeAlloc for 10.871ms
2024-10-23 17:06:06.695 29366-29378 InputTransport          ai.onnxruntime.genai.demo            D  Input channel destroyed: 'ClientS', fd=135
2024-10-23 17:06:06.759 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 31527: 요
2024-10-23 17:06:06.759 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Time taken to generate token: 0.08 seconds
2024-10-23 17:06:06.820 29366-29512 genai.demo.MainActivity ai.onnxruntime.genai.demo            I  Generated token: 29991: !

When I execute the Phi-3.5 tokenizer in Python, the output is 안녕하세요!, but the Android output is 안하세요!. I want to decode the three tokens ([238, 136, 152]) so that I get the correct result. I would appreciate any guidance on how to achieve this. Thank you.
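
For context, those three IDs look like SentencePiece byte-fallback tokens: each one represents a single UTF-8 byte, and only the complete three-byte sequence decodes to the missing character 녕. The sketch below illustrates this; the ID-to-byte offset of 3 is an assumption about the Llama-style vocabulary used by Phi-3.5, not something taken from this repo.

```java
import java.nio.charset.StandardCharsets;

public class ByteFallbackDemo {
    public static void main(String[] args) {
        int[] tokenIds = {238, 136, 152};          // the three "empty" tokens from the log
        byte[] utf8 = new byte[tokenIds.length];
        for (int i = 0; i < tokenIds.length; i++) {
            // Assumed mapping: byte-fallback token ID = raw byte value + 3
            utf8[i] = (byte) (tokenIds[i] - 3);    // 0xEB, 0x85, 0x95
        }
        // Only the complete 3-byte UTF-8 sequence yields a printable character.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));  // prints 녕
    }
}
```

Decoding each of these tokens on its own produces no printable string, which matches the empty log lines above.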

vraspar commented 2 days ago

Hi! Just to clarify: is the Android app generating different tokens compared to the Python example, or is it decoding the tokens to strings incorrectly?

j0h0k0i0m commented 1 day ago

Dear @vraspar

Hello.

It's an issue with token decoding. Unlike English, Korean is covered by only a few tokens in the Phi vocabulary, so generating a single Korean character often requires a combination of multiple tokens.

I worked around the issue of a token whose decoded string isn't printable by storing that token and combining it with the next token(s) before decoding and outputting. However, this is only a temporary workaround.

In MainActivity.java, I check the decoded String tok with isEmpty() to decide whether to buffer it or output it. I hope this is helpful as a reference.
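
For anyone hitting the same problem, here is a minimal sketch of that buffering idea (the TokenDecoder interface is a hypothetical stand-in for whatever the app uses to turn token IDs into strings; this is not the actual MainActivity.java code): tokens whose individual decode comes back empty are held and re-decoded together with the following tokens until a printable string appears.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenAccumulator {
    // Hypothetical decode hook: stands in for however the app turns token IDs into text.
    public interface TokenDecoder {
        String decode(List<Integer> tokenIds);
    }

    private final List<Integer> pending = new ArrayList<>();

    // Returns text that is safe to display, or null while a character is still incomplete.
    public String onToken(int tokenId, TokenDecoder decoder) {
        pending.add(tokenId);
        String text = decoder.decode(pending);  // decode everything buffered so far
        if (text == null || text.isEmpty()) {
            return null;                        // partial multi-byte character: keep buffering
        }
        pending.clear();                        // complete text produced: flush the buffer
        return text;
    }
}
```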

baijumeswani commented 20 hours ago

@j0h0k0i0m Could you share the prompt you're using with the Phi-3.5 model, and what you expect the returned decoded string to be? cc @wenbingl