Bug: Unreadable output from Android example project

xunuohope1107 commented 3 days ago

What happened?

I built and ran the Android example project under examples/llama.android, but the output shown in the Android UI is very hard to understand. I tried many prompts, such as "hello" and "why sky is blue?", on several real devices as well as virtual devices. The output is not a coherent sentence but a random combination of words or programming code.

Name and Version

b3785, android arm64-v8a

What operating system are you seeing the problem on?

Android arm64-v8a

Relevant log output

Here is the log from my console (using "hello" as the prompt):
2024-09-20 09:28:35.402 15885-5371  llama-android.cpp       com.example.llama                    I  n_len = 64, n_ctx = 2048, n_kv_req = 64
2024-09-20 09:28:35.402 15885-5371  llama-android.cpp       com.example.llama                    I  hello
2024-09-20 09:28:35.402 15885-5371  llama-android.cpp       com.example.llama                    I   
2024-09-20 09:28:35.402 15885-15885 ViewRootImplExtImpl     com.example.llama                    D  MotionEvent MotionEvent { action=ACTION_UP, actionButton=0, id[0]=0, x[0]=121.26953, y[0]=1821.7549, toolType[0]=TOOL_TYPE_FINGER, buttonState=0, classification=NONE, metaState=0, flags=0x0, edgeFlags=0x0, pointerCount=1, historySize=0, eventTime=101595713, downTime=101595640, deviceId=6, source=0x1002, displayId=0, eventId=224023115 } handled by client, just return
2024-09-20 09:28:35.441 15885-15885 Quality                 com.example.llama                    I  Skipped: false 3 cost 30.282692 refreshRate 8290371 bit true processName com.example.llama
2024-09-20 09:28:35.834 15885-5371  llama-android.cpp       com.example.llama                    I  cached: llo, new_token_chars: `llo`, id: 18798
2024-09-20 09:28:36.017 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  world, new_token_chars: ` world`, id: 995
2024-09-20 09:28:36.172 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ", new_token_chars: `"`, id: 1
2024-09-20 09:28:36.296 15885-5371  llama-android.cpp       com.example.llama                    I  cached: 
                                                                                                    , new_token_chars: `
                                                                                                    `, id: 198
2024-09-20 09:28:36.432 15885-5371  llama-android.cpp       com.example.llama                    I  cached: 
                                                                                                    , new_token_chars: `
                                                                                                    `, id: 198
2024-09-20 09:28:36.573 15885-5371  llama-android.cpp       com.example.llama                    I  cached: def, new_token_chars: `def`, id: 4299
2024-09-20 09:28:36.718 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  remove, new_token_chars: ` remove`, id: 4781
2024-09-20 09:28:36.849 15885-5371  llama-android.cpp       com.example.llama                    I  cached: _, new_token_chars: `_`, id: 62
2024-09-20 09:28:36.982 15885-5371  llama-android.cpp       com.example.llama                    I  cached: v, new_token_chars: `v`, id: 85
2024-09-20 09:28:37.103 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ow, new_token_chars: `ow`, id: 322
2024-09-20 09:28:37.229 15885-5371  llama-android.cpp       com.example.llama                    I  cached: els, new_token_chars: `els`, id: 1424
2024-09-20 09:28:37.348 15885-5371  llama-android.cpp       com.example.llama                    I  cached: (, new_token_chars: `(`, id: 7
2024-09-20 09:28:37.477 15885-5371  llama-android.cpp       com.example.llama                    I  cached: s, new_token_chars: `s`, id: 82
2024-09-20 09:28:37.598 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ):, new_token_chars: `):`, id: 2599
2024-09-20 09:28:37.726 15885-5371  llama-android.cpp       com.example.llama                    I  cached: 
                                                                                                    , new_token_chars: `
                                                                                                    `, id: 198
2024-09-20 09:28:37.870 15885-5371  llama-android.cpp       com.example.llama                    I  cached:     , new_token_chars: `    `, id: 50284
2024-09-20 09:28:37.991 15885-5371  llama-android.cpp       com.example.llama                    I  cached: v, new_token_chars: `v`, id: 85
2024-09-20 09:28:38.120 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ow, new_token_chars: `ow`, id: 322
2024-09-20 09:28:38.248 15885-5371  llama-android.cpp       com.example.llama                    I  cached: els, new_token_chars: `els`, id: 1424
2024-09-20 09:28:38.381 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  =, new_token_chars: ` =`, id: 796
2024-09-20 09:28:38.512 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  ", new_token_chars: ` "`, id: 366
2024-09-20 09:28:38.636 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ae, new_token_chars: `ae`, id: 3609
2024-09-20 09:28:38.892 15885-5371  llama-android.cpp       com.example.llama                    I  cached: i, new_token_chars: `i`, id: 72
2024-09-20 09:28:39.029 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ou, new_token_chars: `ou`, id: 280
2024-09-20 09:28:39.141 15885-5371  llama-android.cpp       com.example.llama                    I  cached: AE, new_token_chars: `AE`, id: 14242
2024-09-20 09:28:39.276 15885-5371  llama-android.cpp       com.example.llama                    I  cached: I, new_token_chars: `I`, id: 40
2024-09-20 09:28:39.393 15885-5371  llama-android.cpp       com.example.llama                    I  cached: OU, new_token_chars: `OU`, id: 2606
2024-09-20 09:28:39.535 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ", new_token_chars: `"`, id: 1
2024-09-20 09:28:39.685 15885-5371  llama-android.cpp       com.example.llama                    I  cached: 
                                                                                                    , new_token_chars: `
                                                                                                    `, id: 198
2024-09-20 09:28:39.815 15885-5371  llama-android.cpp       com.example.llama                    I  cached:     , new_token_chars: `    `, id: 50284
2024-09-20 09:28:39.950 15885-5371  llama-android.cpp       com.example.llama                    I  cached: return, new_token_chars: `return`, id: 7783
2024-09-20 09:28:40.083 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  ', new_token_chars: ` '`, id: 705
2024-09-20 09:28:40.227 15885-5371  llama-android.cpp       com.example.llama                    I  cached: '., new_token_chars: `'.`, id: 4458
2024-09-20 09:28:40.357 15885-5371  llama-android.cpp       com.example.llama                    I  cached: join, new_token_chars: `join`, id: 22179
2024-09-20 09:28:40.496 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ([, new_token_chars: `([`, id: 26933
2024-09-20 09:28:40.611 15885-5371  llama-android.cpp       com.example.llama                    I  cached: c, new_token_chars: `c`, id: 66
2024-09-20 09:28:40.734 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  for, new_token_chars: ` for`, id: 329
2024-09-20 09:28:40.853 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  c, new_token_chars: ` c`, id: 269
2024-09-20 09:28:40.984 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  in, new_token_chars: ` in`, id: 287
2024-09-20 09:28:41.271 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  s, new_token_chars: ` s`, id: 264
2024-09-20 09:28:41.465 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  if, new_token_chars: ` if`, id: 611
2024-09-20 09:28:41.604 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  c, new_token_chars: ` c`, id: 269
2024-09-20 09:28:41.788 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  not, new_token_chars: ` not`, id: 407
2024-09-20 09:28:41.913 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  in, new_token_chars: ` in`, id: 287
2024-09-20 09:28:42.050 15885-5371  llama-android.cpp       com.example.llama                    I  cached:  vow, new_token_chars: ` vow`, id: 23268
2024-09-20 09:28:42.178 15885-5371  llama-android.cpp       com.example.llama                    I  cached: els, new_token_chars: `els`, id: 1424
2024-09-20 09:28:42.308 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ]), new_token_chars: `])`, id: 12962
2024-09-20 09:28:42.435 15885-5371  llama-android.cpp       com.example.llama                    I  cached: 
                                                                                                    , new_token_chars: `
                                                                                                    `, id: 198
2024-09-20 09:28:42.563 15885-5371  llama-android.cpp       com.example.llama                    I  cached: 
                                                                                                    , new_token_chars: `
                                                                                                    `, id: 198
2024-09-20 09:28:42.695 15885-5371  llama-android.cpp       com.example.llama                    I  cached: print, new_token_chars: `print`, id: 4798
2024-09-20 09:28:42.828 15885-5371  llama-android.cpp       com.example.llama                    I  cached: (, new_token_chars: `(`, id: 7
2024-09-20 09:28:42.964 15885-5371  llama-android.cpp       com.example.llama                    I  cached: remove, new_token_chars: `remove`, id: 28956
2024-09-20 09:28:43.116 15885-5371  llama-android.cpp       com.example.llama                    I  cached: _, new_token_chars: `_`, id: 62
2024-09-20 09:28:43.249 15885-5371  llama-android.cpp       com.example.llama                    I  cached: v, new_token_chars: `v`, id: 85
2024-09-20 09:28:43.390 15885-5371  llama-android.cpp       com.example.llama                    I  cached: ow, new_token_chars: `ow`, id: 322
2024-09-20 09:28:43.511 15885-5371  llama-android.cpp       com.example.llama                    I  cached: els, new_token_chars: `els`, id: 1424
2024-09-20 09:28:43.642 15885-5371  llama-android.cpp       com.example.llama                    I  cached: (, new_token_chars: `(`, id: 7
2024-09-20 09:28:43.776 15885-5371  llama-android.cpp       com.example.llama                    I  cached: s, new_token_chars: `s`, id: 82
2024-09-20 09:28:43.922 15885-5371  llama-android.cpp       com.example.llama                    I  cached: )), new_token_chars: `))`, id: 4008
2024-09-20 09:28:44.098 15885-5371  llama-android.cpp       com.example.llama                    I  cached: 
                                                                                                    , new_token_chars: `
                                                                                                    `, id: 198
2024-09-20 09:28:44.282 15885-5371  llama

Flyfish233 commented 2 days ago

Try loading a different model. This is not happening for me on b3787 with qwen2-1_5b-instruct-q5_0.gguf.

xunuohope1107 commented 2 days ago

> Try loading a different model. This is not happening for me on b3787 with qwen2-1_5b-instruct-q5_0.gguf.

I tried qwen2-1_5b-instruct-q5_0.gguf on b3788 (llama.android), but still got nonsensical output. I tested on a Xiaomi 14 (16 GB RAM), a OnePlus 12R (16 GB RAM), and a Pixel 4a. Here is the log from the console:

com.example.llama D Opened libSchedAssistExtImpl.so
com.example.llama I QUALCOMM build : 1a285a84ae, I2991b7e11e
com.example.llama I Build Config : S P 14.1.4 AArch64
com.example.llama I Driver Path : /vendor/lib64/egl/libGLESv2_adreno.so
com.example.llama I Driver Version : 0676.32
com.example.llama I PFP: 0x01740158, ME: 0x00000000
com.example.llama I : Reading chip ID through GSL
com.example.llama D VRI[MainActivity]#0(BLAST Consumer)0 connect: api=1 producerControlledByApp=true
com.example.llama E Unable to match the desired swap behavior.
com.example.llama I setExtendedRangeBrightness sc=Surface(name=com.example.llama/com.example.llama.MainActivity)/@0x764580f,currentBufferRatio=1.0,desiredRatio=1.0
com.example.llama D VRI[MainActivity]#0 acquireNextBufferLocked size=1080x2376 mFrameNumber=1 applyTransaction=true mTimestamp=132046035059522(auto) mPendingTransactions.size=0 graphicBufferId=56805237456900 transform=0
com.example.llama D Received frameCommittedCallback lastAttemptedDrawFrameNum=1 didProduceBuffer=true syncBuffer=false
com.example.llama D draw finished.
com.example.llama D reportDrawFinished
com.example.llama D setMaxDequeuedBufferCount: 2
com.example.llama D onFocusEvent true
com.example.llama I Skipped 30 frames! The application may be doing too much work on its main thread.
com.example.llama I Skipped: false 30 cost 250.18631 refreshRate 8289370 bit true processName com.example.llama
com.example.llama I Skipped: true 1 cost 8.331621 refreshRate 8289194 bit true processName com.example.llama
com.example.llama I Skipped: false 1 cost 10.590919 refreshRate 8289053 bit true processName com.example.llama
com.example.llama D com.example.llama/com.example.llama.MainActivity,This DecorView@e676581[MainActivity] change focus to true
com.example.llama V requestHideFillUi(null): anchor = null
com.example.llama D get inputMethodManager extension: com.android.internal.view.IInputMethodManager$Stub$Proxy@89407a1
com.example.llama D Dedicated thread for native code: Llm-RunLoop
com.example.llama D MotionEvent MotionEvent { action=ACTION_UP, actionButton=0, id[0]=0, x[0]=491.54004, y[0]=2261.8496, toolType[0]=TOOL_TYPE_FINGER, buttonState=0, classification=NONE, metaState=0, flags=0x0, edgeFlags=0x0, pointerCount=1, historySize=0, eventTime=132049051, downTime=132048977, deviceId=6, source=0x1002, displayId=0, eventId=640003685 } handled by client, just return
com.example.llama D AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 0 | NEON = 1 | SVE = 0 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
com.example.llama I Loading model from /storage/emulated/0/Android/data/com.example.llama/files/qwen2-1_5b-instruct-q5_0.gguf
com.example.llama I Skipped: true 1 cost 12.096126 refreshRate 8333333 bit true processName com.example.llama
com.example.llama I llama_model_loader: loaded meta data with 26 key-value pairs and 338 tensors from /storage/emulated/0/Android/data/com.example.llama/files/qwen2-1_5b-instruct-q5_0.gguf (version GGUF V3 (latest))
com.example.llama I llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
com.example.llama I llama_model_loader: - kv 0: general.architecture str = qwen2
com.example.llama I llama_model_loader: - kv 1: general.name str = qwen2-1_5b-instruct
com.example.llama I llama_model_loader: - kv 2: qwen2.block_count u32 = 28
com.example.llama I llama_model_loader: - kv 3: qwen2.context_length u32 = 32768
com.example.llama I llama_model_loader: - kv 4: qwen2.embedding_length u32 = 1536
com.example.llama I llama_model_loader: - kv 5: qwen2.feed_forward_length u32 = 8960
com.example.llama I llama_model_loader: - kv 6: qwen2.attention.head_count u32 = 12
com.example.llama I llama_model_loader: - kv 7: qwen2.attention.head_count_kv u32 = 2
com.example.llama I llama_model_loader: - kv 8: qwen2.rope.freq_base f32 = 1000000.000000
com.example.llama I llama_model_loader: - kv 9: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
com.example.llama I llama_model_loader: - kv 10: general.file_type u32 = 8
com.example.llama I llama_model_loader: - kv 11: tokenizer.ggml.model str = gpt2
com.example.llama I llama_model_loader: - kv 12: tokenizer.ggml.pre str = qwen2
com.example.llama I llama_model_loader: - kv 13: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "", "&", "'", ...
com.example.llama I llama_model_loader: - kv 14: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
com.example.llama I llama_model_loader: - kv 15: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
com.example.llama I llama_model_loader: - kv 16: tokenizer.ggml.eos_token_id u32 = 151645
com.example.llama I llama_model_loader: - kv 17: tokenizer.ggml.padding_token_id u32 = 151643
com.example.llama I llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 = 151643
com.example.llama I llama_model_loader: - kv 19: tokenizer.chat_template str = { 45358000342027376826582226793079494978536804014395455830443385929404386720687094409997545952899419186858395500209268036995739998909139943970409346798798085432955252350389385345239973876717475932972674074899806483906560.000000or message in messages }{ 0f lo...
com.example.llama I llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = false
com.example.llama I llama_model_loader: - kv 21: general.quantization_version u32 = 2
com.example.llama I llama_model_loader: - kv 22: quantize.imatrix.file str = ../Qwen2/gguf/qwen2-1_5b-imatrix/imat...
com.example.llama I llama_model_loader: - kv 23: quantize.imatrix.dataset str = ../sft_2406.txt
com.example.llama I llama_model_loader: - kv 24: quantize.imatrix.entries_count i32 = 196
com.example.llama I llama_model_loader: - kv 25: quantize.imatrix.chunks_count i32 = 1937
com.example.llama I llama_model_loader: - type f32: 141 tensors
com.example.llama I llama_model_loader: - type q5_0: 193 tensors
com.example.llama I llama_model_loader: - type q5_1: 3 tensors
com.example.llama I llama_model_loader: - type q6_K: 1 tensors
com.example.llama I llm_load_vocab: special tokens cache size = 293
com.example.llama I llm_load_vocab: token to piece cache size = 0.9338 MB
com.example.llama I llm_load_print_meta: format = GGUF V3 (latest)
com.example.llama I llm_load_print_meta: arch = qwen2
com.example.llama I llm_load_print_meta: vocab type = BPE
com.example.llama I llm_load_print_meta: n_vocab = 151936
com.example.llama I llm_load_print_meta: n_merges = 151387
com.example.llama I llm_load_print_meta: vocab_only = 0
com.example.llama I llm_load_print_meta: n_ctx_train = 32768
com.example.llama I llm_load_print_meta: n_embd = 1536
com.example.llama I llm_load_print_meta: n_layer = 28
com.example.llama I llm_load_print_meta: n_head = 12
com.example.llama I llm_load_print_meta: n_head_kv = 2
com.example.llama I llm_load_print_meta: n_rot = 128
com.example.llama I llm_load_print_meta: n_swa = 0
com.example.llama I llm_load_print_meta: n_embd_head_k = 128
com.example.llama I llm_load_print_meta: n_embd_head_v = 128
com.example.llama I llm_load_print_meta: n_gqa = 6
com.example.llama I llm_load_print_meta: n_embd_k_gqa = 256
com.example.llama I llm_load_print_meta: n_embd_v_gqa = 256
com.example.llama I llm_load_print_meta: f_norm_eps = 0.0e+00
com.example.llama I llm_load_print_meta: f_norm_rms_eps = 1.0e-06
com.example.llama I llm_load_print_meta: f_clamp_kqv = 0.0e+00
com.example.llama I llm_load_print_meta: f_max_alibi_bias = 0.0e+00
com.example.llama I llm_load_print_meta: f_logit_scale = 0.0e+00
com.example.llama I llm_load_print_meta: n_ff = 8960
com.example.llama I llm_load_print_meta: n_expert = 0
com.example.llama I llm_load_print_meta: n_expert_used = 0
com.example.llama I llm_load_print_meta: causal attn = 1
com.example.llama I llm_load_print_meta: pooling type = 0
com.example.llama I llm_load_print_meta: rope type = 2
com.example.llama I llm_load_print_meta: rope scaling = linear
com.example.llama I llm_load_print_meta: freq_base_train = 1000000.0
com.example.llama I llm_load_print_meta: freq_scale_train = 1
com.example.llama I llm_load_print_meta: n_ctx_orig_yarn = 32768
com.example.llama I llm_load_print_meta: rope_finetuned = unknown
com.example.llama I llm_load_print_meta: ssm_d_conv = 0
com.example.llama I llm_load_print_meta: ssm_d_inner = 0
com.example.llama I llm_load_print_meta: ssm_d_state = 0
com.example.llama I llm_load_print_meta: ssm_dt_rank = 0
com.example.llama I llm_load_print_meta: ssm_dt_b_c_rms = 0
com.example.llama I llm_load_print_meta: model type = ?B
com.example.llama I llm_load_print_meta: model ftype = Q5_0
com.example.llama I llm_load_print_meta: model params = 1.54 B
com.example.llama I llm_load_print_meta: model size = 1.02 GiB (5.68 BPW)
com.example.llama I llm_load_print_meta: general.name = qwen2-1_5b-instruct
com.example.llama I llm_load_print_meta: BOS token = 151643 '<|endoftext|>'
com.example.llama I llm_load_print_meta: EOS token = 151645 '<|im_end|>'
com.example.llama I llm_load_print_meta: PAD token = 151643 '<|endoftext|>'
com.example.llama I llm_load_print_meta: LF token = 148848 'ÄĬ'
com.example.llama I llm_load_print_meta: EOT token = 151645 '<|im_end|>'
com.example.llama I llm_load_print_meta: max token length = 256
com.example.llama I llm_load_tensors: ggml ctx size = 0.15 MiB
com.example.llama I llm_load_tensors: CPU buffer size = 1044.62 MiB
com.example.llama I Using 6 threads
com.example.llama I llama_new_context_with_model: n_ctx = 2048
com.example.llama I llama_new_context_with_model: n_batch = 2048
com.example.llama I llama_new_context_with_model: n_ubatch = 512
com.example.llama I llama_new_context_with_model: flash_attn = 0
com.example.llama I llama_new_context_with_model: freq_base = 1000000.0
com.example.llama I llama_new_context_with_model: freq_scale = 1
com.example.llama I llama_kv_cache_init: CPU KV buffer size = 56.00 MiB
com.example.llama I llama_new_context_with_model: KV self size = 56.00 MiB, K (f16): 28.00 MiB, V (f16): 28.00 MiB
com.example.llama I llama_new_context_with_model: CPU output buffer size = 0.58 MiB
com.example.llama I llama_new_context_with_model: CPU compute buffer size = 299.75 MiB
com.example.llama I llama_new_context_with_model: graph nodes = 986
com.example.llama I llama_new_context_with_model: graph splits = 1
com.example.llama I Loaded model /storage/emulated/0/Android/data/com.example.llama/files/qwen2-1_5b-instruct-q5_0.gguf
com.example.llama I Skipped: false 1 cost 11.889272 refreshRate 8288722 bit true processName com.example.llama
com.example.llama V requestHideFillUi(null): anchor = null
com.example.llama D Owner FocusChanged(true)
com.example.llama D MotionEvent MotionEvent { action=ACTION_UP, actionButton=0, id[0]=0, x[0]=528.70605, y[0]=1516.375, toolType[0]=TOOL_TYPE_FINGER, buttonState=0, classification=NONE, metaState=0, flags=0x0, edgeFlags=0x0, pointerCount=1, historySize=0, eventTime=132050570, downTime=132050480, deviceId=6, source=0x1002, displayId=0, eventId=574615136 } handled by client, just return
com.example.llama I Skipped: false 2 cost 21.315496 refreshRate 8287714 bit true processName com.example.llama
com.example.llama I com.example.llama:24cce9e6: onRequestShow at ORIGIN_CLIENT_SHOW_SOFT_INPUT reason SHOW_SOFT_INPUT_BY_INSETS_API
com.example.llama D show(ime(), fromIme=false)
com.example.llama D showSoftInput() view=androidx.compose.ui.platform.AndroidComposeView{af1f4d6 VFED.....
.F....ID 0,0-1080,2208 aid=1073741824 viewInfo = } flags=0 reason=SHOW_SOFT_INPUT_BY_INSETS_API com.example.llama I Skipped: true 3 cost 26.484884 refreshRate 8287714 bit true processName com.example.llama com.example.llama I Skipped: false 1 cost 16.270636 refreshRate 8287714 bit true processName com.example.llama com.example.llama W type=1400 audit(0.0:2290): avc: denied { getopt } for path="/dev/socket/usap_pool_primary" scontext=u:r:untrusted_app:s0:c88,c257,c512,c768 tcontext=u:r:zygote:s0 tclass=unix_stream_socket permissive=0 app=com.example.llama com.example.llama D StrictMode policy violation: android.os.strictmode.LeakedClosableViolation: A resource was acquired at attached stack trace but never released. See java.io.Closeable for information on avoiding resource leaks. Callsite: InsetsSourceControl port(StrictMode.java:2097) a:338) l.java:1576) ns.java:339) (Daemons.java:324) ons.java:300) com.example.llama W requestCursorUpdates is not supported com.example.llama I Skipped: false 1 cost 14.975071 refreshRate 8287506 bit true processName com.example.llama com.example.llama I Skipped: false 1 cost 14.3007345 refreshRate 8287526 bit true processName com.example.llama com.example.llama W handleResized abandoned! com.example.llama W handleResized abandoned! com.example.llama I Skipped: false 1 cost 15.811308 refreshRate 8287533 bit true processName com.example.llama com.example.llama D show(ime(), fromIme=true) com.example.llama D get WMS extension: android.os.BinderProxy@1ae8041 com.example.llama I Skipped: false 1 cost 14.625874 refreshRate 8287530 bit true processName com.example.llama com.example.llama I Skipped: false 1 cost 14.671543 refreshRate 8287538 bit true processName com.example.llama com.example.llama W handleResized abandoned! 
com.example.llama I Skipped: false 1 cost 11.987819 refreshRate 8287549 bit true processName com.example.llama com.example.llama I Skipped: false 1 cost 10.011921 refreshRate 8287574 bit true processName com.example.llama com.example.llama I Skipped: false 1 cost 15.110632 refreshRate 8287625 bit true processName com.example.llama com.example.llama I com.example.llama:24cce9e6: onShown com.example.llama D Installing profile for com.example.llama com.example.llama I Skipped: false 7 cost 62.15717 refreshRate 8288450 bit true processName com.example.llama com.example.llama I Skipped: false 5 cost 47.37528 refreshRate 8288787 bit true processName com.example.llama com.example.llama W requestCursorUpdates is not supported com.example.llama W sendCancelIfRunning: isInProgress=falsecallback=ImeCallback=ImeOnBackInvokedCallback@140997194 Callback=android.window.IOnBackInvokedCallback$Stub$Proxy@dff351e com.example.llama W handleResized abandoned! com.example.llama I com.example.llama:3ecdc502: onRequestHide at ORIGIN_CLIENT_HIDE_SOFT_INPUT reason HIDE_SOFT_INPUT_BY_INSETS_API com.example.llama I com.example.llama:619f1067: onHidden com.example.llama W handleResized abandoned! com.example.llama W handleResized abandoned! com.example.llama V requestHideFillUi(null): anchor = null com.example.llama I Skipped: false 1 cost 12.059912 refreshRate 8288989 bit true processName com.example.llama com.example.llama D MotionEvent MotionEvent { action=ACTION_UP, actionButton=0, id[0]=0, x[0]=137.28906, y[0]=1685.7041, toolType[0]=TOOL_TYPE_FINGER, buttonState=0, classification=NONE, metaState=0, flags=0x0, edgeFlags=0x0, pointerCount=1, historySize=0, eventTime=132053786, downTime=132053720, deviceId=6, source=0x1002, displayId=0, eventId=492846711 } handled by client, just return com.example.llama I n_len = 64, n_ctx = 2048, n_kv_req = 64 com.example.llama I hello com.example.llama I
com.example.llama I Skipped: false 6 cost 57.353714 refreshRate 8333333 bit true processName com.example.llama com.example.llama I cached: 1, new_token_chars: 1, id: 16 com.example.llama I cached: ., new_token_chars: ., id: 13 com.example.llama I Skipped: false 1 cost 12.094041 refreshRate 8289116 bit true processName com.example.llama com.example.llama I cached: Let, new_token_chars: Let, id: 6771 com.example.llama I Skipped: false 1 cost 11.876371 refreshRate 8289015 bit true processName com.example.llama com.example.llama I cached: $, new_token_chars: $, id: 400 com.example.llama I cached: a, new_token_chars: a, id: 64 com.example.llama I Skipped: false 1 cost 10.482091 refreshRate 8289267 bit true processName com.example.llama com.example.llama I cached: ,b, new_token_chars: ,b, id: 8402 com.example.llama I cached: ,c, new_token_chars: ,c, id: 10109 com.example.llama I cached: $, new_token_chars: $, id: 3 com.example.llama I Skipped: false 1 cost 8.775117 refreshRate 8288756 bit true processName com.example.llama com.example.llama I cached: be, new_token_chars: be, id: 387 com.example.llama I cached: positive, new_token_chars: positive, id: 6785 com.example.llama I cached: real, new_token_chars: real, id: 1931 com.example.llama I Skipped: false 1 cost 11.006954 refreshRate 8288280 bit true processName com.example.llama com.example.llama I cached: numbers, new_token_chars: numbers, id: 5109 com.example.llama I cached: such, new_token_chars: such, id: 1741 com.example.llama I Skipped: false 1 cost 8.316478 refreshRate 8288244 bit true processName com.example.llama com.example.llama I cached: that, new_token_chars: that, id: 429 com.example.llama I Skipped: false 1 cost 10.274869 refreshRate 8288205 bit true processName com.example.llama com.example.llama I cached: $, new_token_chars: $, id: 400 com.example.llama I cached: a, new_token_chars: a, id: 64 com.example.llama I cached: +b, new_token_chars: +b, id: 35093 com.example.llama I cached: +c, new_token_chars: 
+c, id: 49138 com.example.llama I cached: =, new_token_chars: =, id: 28 com.example.llama I Skipped: true 1 cost 8.470056 refreshRate 8288194 bit true processName com.example.llama com.example.llama I cached: 3, new_token_chars: 3, id: 18 com.example.llama I Skipped: true 1 cost 9.017843 refreshRate 8288208 bit true processName com.example.llama com.example.llama I cached: $., new_token_chars: $., id: 12947 com.example.llama I cached: Pro, new_token_chars: Pro, id: 1298 com.example.llama I cached: ve, new_token_chars: ve, id: 586 com.example.llama I cached: that, new_token_chars: that, id: 429 com.example.llama I Skipped: true 1 cost 8.566671 refreshRate 8288414 bit true processName com.example.llama com.example.llama I cached: com.example.llama I cached: [, new_token_chars: \[, id: 78045 com.example.llama I Skipped: false 1 cost 8.996249 refreshRate 8288515 bit true processName com.example.llama com.example.llama I cached: \, new_token_chars: \, id: 1124 com.example.llama I Skipped: false 1 cost 8.481175 refreshRate 8288583 bit true processName com.example.llama com.example.llama I cached: frac, new_token_chars: frac, id: 37018 com.example.llama I cached: {, new_token_chars: {, id: 90 com.example.llama I cached: 1, new_token_chars: 1, id: 16 com.example.llama I cached: }{, new_token_chars: }{, id: 15170 com.example.llama I cached: a, new_token_chars: a, id: 64 com.example.llama I cached: ^, new_token_chars: ^, id: 61 com.example.llama I Skipped: true 1 cost 8.769447 refreshRate 8288628 bit true processName com.example.llama com.example.llama I cached: 2, new_token_chars: 2, id: 17 com.example.llama I cached: +, new_token_chars: +, id: 10 com.example.llama I cached: 1, new_token_chars: 1, id: 16 com.example.llama I cached: }, new_token_chars: }, id: 92 com.example.llama I cached: +\, new_token_chars: +\, id: 41715 com.example.llama I cached: frac, new_token_chars: frac, id: 37018 com.example.llama I cached: {, new_token_chars: {, id: 90 com.example.llama I cached: 
1, new_token_chars: 1, id: 16 com.example.llama I cached: }{, new_token_chars: }{, id: 15170 com.example.llama I cached: b, new_token_chars: b, id: 65 com.example.llama I cached: ^, new_token_chars: ^, id: 61 com.example.llama I cached: 2, new_token_chars: 2, id: 17 com.example.llama I cached: +, new_token_chars: +, id: 10 com.example.llama I cached: 1, new_token_chars: 1, id: 16 com.example.llama I cached: }, new_token_chars: }, id: 92 com.example.llama I cached: +\, new_token_chars: +\, id: 41715 com.example.llama I cached: frac, new_token_chars: frac, id: 37018 com.example.llama I Skipped: true 1 cost 9.095158 refreshRate 8288880 bit true processName com.example.llama com.example.llama I cached: {, new_token_chars: {, id: 90 com.example.llama I cached: 1, new_token_chars: 1, id: 16 com.example.llama I cached: }{, new_token_chars: }{, id: 15170 com.example.llama I cached: c, new_token_chars: c, id: 66 com.example.llama I cached: ^, new_token_chars: ^, id: 61 com.example.llama I cached: 2, new_token_chars: 2, id: 17 com.example.llama I cached: +, new_token_chars: +, id: 10 com.example.llama I cached: 1, new_token_chars: 1, id: 16 com.example.llama I cached: }\, new_token_chars: }\, id: 11035 com.example.llama I cached: ge, new_token_chars: ge, id: 709 com.example.llama I cached: q, new_token_chars: q, id: 80 com.example.llama I cached: \, new_token_chars: \, id: 1124

Screenshot:

Flyfish233 commented 2 days ago

Sorry about that. Can you try adding a prompt? I'm working on a custom build with llama-android.cpp and it works fine.

This is my message parameter for the completionInit() function:

    system
    You are a knowledgeable, efficient, and direct AI assistant.
    user
    How to install Microsoft C++ Build Tools
    assistant
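
The role names in this template match the ChatML layout that Qwen2 instruct models are trained on, where each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers (the model log in this issue lists `<|im_end|>` as the EOS token). As a rough sketch only, a prompt string of that shape could be assembled like this; `build_chatml_prompt` is a hypothetical helper for illustration and not part of llama.cpp or the Android example:

```python
DEFAULT_SYSTEM = "You are a knowledgeable, efficient, and direct AI assistant."


def build_chatml_prompt(user_message: str, system_message: str = DEFAULT_SYSTEM) -> str:
    """Wrap a system and a user message in ChatML turns.

    The assistant turn is left open so the model continues from there.
    """
    turns = [("system", system_message), ("user", user_message)]
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>\n" for role, content in turns]
    # Open (unterminated) assistant turn: generation starts after this line.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt("How to install Microsoft C++ Build Tools")
print(prompt)
```

Whether the markers are treated as special tokens or as plain text depends on how the caller tokenizes the string (for example, whether `parse_special` is enabled in `llama_tokenize`), so this is worth checking in your own build.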

xunuohope1107 commented 2 days ago

    system
    You are a knowledgeable, efficient, and direct AI assistant.
    user
    How to install Microsoft C++ Build Tools
    assistant

I tried the prompt, and the output is better now.

But I was wondering about one thing. Does that mean each prompt must follow the chat completion template, which includes roles like system, user, and assistant? Also, the output often seems incomplete. I tried changing nlen from 64 to 128, but that doesn't seem to help.

xunuohope1107 commented 2 days ago

When I build llama.cpp on Termux, the output is perfect. I'm not sure why the Android example behaves so differently.

Flyfish233 commented 2 days ago

> Must follow the template of chat complete

It depends on the model. Just inject this template before sending each message.

> Not sure why the android example is quite different.

You are probably using llama-cli on Termux, not llama-android.cpp. The Android example is a demo, not production ready, and is not meant to provide an out-of-the-box experience, so I think the difference is reasonable. Just write your own implementation.

> Also, the output sentence often seems not complete.

Try a larger nLen = 2048; it works on my OnePlus 12R.
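
The log earlier in this issue prints `n_len = 64, n_ctx = 2048, n_kv_req = 64`, which suggests that `n_len` bounds the whole sequence (prompt plus reply) rather than the reply alone. Under that assumption, a long template leaves very few tokens for the answer when `n_len` is small. A small sketch of the arithmetic follows; `generation_budget` and the prompt length of 40 tokens are made up for illustration:

```python
def generation_budget(n_len: int, n_prompt_tokens: int, n_ctx: int = 2048) -> int:
    """Tokens left for the reply.

    Assumes n_len caps prompt + reply together and that the total
    cannot exceed the context size n_ctx.
    """
    if n_len < 0 or n_prompt_tokens < 0 or n_ctx < 0:
        raise ValueError("token counts must be non-negative")
    total = min(n_len, n_ctx)
    # A prompt longer than the cap leaves nothing for the reply.
    return max(0, total - n_prompt_tokens)


# Hypothetical prompt of 40 tokens (system + user turns of a chat template).
for n_len in (64, 128, 2048):
    print(n_len, generation_budget(n_len, n_prompt_tokens=40))
```

If the reply is still cut short after raising `nLen`, it may be worth confirming that the new value actually reaches the native code in your build.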

