glbrighenti closed this issue 4 months ago
Are there any more logs in addition to Load failed:20? There is a CMake option to enable logging.
I already built ExecuTorch with
-DEXECUTORCH_ENABLE_LOGGING=1
but I don't see anything relevant in Logcat, just typical lifecycle messages:
2023-08-30 09:15:36.852 5338-5338 WindowOnBackDispatcher com.example.executorchllamademo W sendCancelIfRunning: isInProgress=falsecallback=android.view.ViewRootImpl$$ExternalSyntheticLambda17@5505a56
2023-08-30 09:15:36.852 2761-3521 CoreBackPreview system_server D Window{94205d6 u0 com.example.executorchllamademo/com.example.executorchllamademo.MainActivity}: Setting back callback null
2023-08-30 09:15:36.860 2761-9524 InputManager-JNI system_server W Input channel object '94205d6 com.example.executorchllamademo/com.example.executorchllamademo.MainActivity (client)' was disposed without first being removed with the input manager!
2023-08-30 09:15:36.879 5338-5338 InputEventReceiver com.example.executorchllamademo W Attempted to finish an input event but the input event receiver has already been disposed.
2023-08-30 09:15:36.882 2761-9524 CoreBackPreview system_server D Window{419970f u0 com.example.executorchllamademo/com.example.executorchllamademo.MainActivity}: Setting back callback OnBackInvokedCallbackInfo{mCallback=android.window.IOnBackInvokedCallback$Stub$Proxy@f3d2321, mPriority=0, mIsAnimationCallback=false}
2023-08-30 09:15:36.892 5338-5366 qdgralloc com.example.executorchllamademo W getInterlacedFlag: getMetaData returned 3, defaulting to interlaced_flag = 0
2023-08-30 09:15:36.895 2761-9524 ImeTracker system_server I com.example.executorchllamademo:eb998f3: onRequestHide at ORIGIN_SERVER_HIDE_INPUT reason HIDE_UNSPECIFIED_WINDOW
2023-08-30 09:15:36.895 2761-9524 ImeTracker system_server I com.example.executorchllamademo:eb998f3: onCancelled at PHASE_SERVER_SHOULD_HIDE
2023-08-30 09:15:36.897 4263-4263 GoogleInpu...hodService com...gle.android.inputmethod.latin I GoogleInputMethodService.onStartInput():1917 onStartInput(EditorInfo{inputType=0x0(NULL) imeOptions=0x0 privateImeOptions=null actionName=UNSPECIFIED actionLabel=null actionId=0 initialSelStart=-1 initialSelEnd=-1 initialCapsMode=0x0 hintText=null label=null packageName=com.example.executorchllamademo fieldId=0 fieldName=null extras=null hintLocales=[]}, false)
2023-08-30 09:15:36.897 5338-5366 qdgralloc com.example.executorchllamademo W getInterlacedFlag: getMetaData returned 3, defaulting to interlaced_flag = 0
2023-08-30 09:15:36.915 5338-5366 qdgralloc com.example.executorchllamademo W getInterlacedFlag: getMetaData returned 3, defaulting to interlaced_flag = 0
2023-08-30 09:15:37.482 5338-5338 WindowOnBackDispatcher com.example.executorchllamademo W sendCancelIfRunning: isInProgress=falsecallback=android.view.ViewRootImpl$$ExternalSyntheticLambda17@7628b8e
2023-08-30 09:15:37.483 2761-9524 CoreBackPreview system_server D Window{419970f u0 com.example.executorchllamademo/com.example.executorchllamademo.MainActivity}: Setting back callback null
2023-08-30 09:15:37.488 2761-27299 InputManager-JNI system_server W Input channel object '419970f com.example.executorchllamademo/com.example.executorchllamademo.MainActivity (client)' was disposed without first being removed with the input manager!
2023-08-30 09:15:37.491 5338-5411 libc com.example.executorchllamademo W Access denied finding property "ro.hardware.chipname"
2023-08-30 09:15:37.493 5338-5338 InputEventReceiver com.example.executorchllamademo W Attempted to finish an input event but the input event receiver has already been disposed.
2023-08-30 09:15:37.507 2761-9524 ImeTracker system_server I com.example.executorchllamademo:5ce05139: onRequestHide at ORIGIN_SERVER_HIDE_INPUT reason HIDE_SAME_WINDOW_FOCUSED_WITHOUT_EDITOR
2023-08-30 09:15:37.507 2761-9524 ImeTracker system_server I com.example.executorchllamademo:5ce05139: onCancelled at PHASE_SERVER_SHOULD_HIDE
2023-08-30 09:15:37.508 4263-4263 GoogleInpu...hodService com...gle.android.inputmethod.latin I GoogleInputMethodService.onStartInput():1917 onStartInput(EditorInfo{inputType=0x0(NULL) imeOptions=0x0 privateImeOptions=null actionName=UNSPECIFIED actionLabel=null actionId=0 initialSelStart=-1 initialSelEnd=-1 initialCapsMode=0x0 hintText=null label=null packageName=com.example.executorchllamademo fieldId=0 fieldName=null extras=null hintLocales=[]}, false)
2023-08-30 09:15:40.395 2761-27299 CoreBackPreview system_server D Window{41ccca6 u0 com.example.executorchllamademo/com.example.executorchllamademo.MainActivity}: Setting back callback OnBackInvokedCallbackInfo{mCallback=android.window.IOnBackInvokedCallback$Stub$Proxy@c569400, mPriority=0, mIsAnimationCallback=false}
2023-08-30 09:15:40.401 5338-5366 qdgralloc com.example.executorchllamademo W getInterlacedFlag: getMetaData returned 3, defaulting to interlaced_flag = 0
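Most of the dump above comes from system_server and UI plumbing, which makes it hard to spot anything emitted by the app's native code. A minimal sketch of a filter, assuming the log was saved as text in the same Android Studio layout (date, time, pid-tid, tag, package, level, message); the function name, the regular expression, and the list of "noisy" tags are illustrative choices, not part of ExecuTorch or Logcat:

```python
import re

# Layout assumed from the dump above:
# date time pid-tid tag package level message
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+)-(?P<tid>\d+) (?P<tag>\S+) (?P<package>\S+) "
    r"(?P<level>[VDIWEF]) (?P<message>.*)$"
)

# Tags from the dump that look like UI lifecycle noise (an editorial guess).
NOISY_TAGS = {"WindowOnBackDispatcher", "InputEventReceiver", "qdgralloc"}


def filter_logcat(lines, package, noisy_tags=NOISY_TAGS):
    """Keep only lines from `package` whose tag is not in `noisy_tags`."""
    kept = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not in the expected layout; skip rather than guess
        if match["package"] != package or match["tag"] in noisy_tags:
            continue
        kept.append((match["time"], match["tag"], match["level"], match["message"]))
    return kept


sample = [
    "2023-08-30 09:15:36.852 5338-5338 WindowOnBackDispatcher com.example.executorchllamademo W sendCancelIfRunning: isInProgress=falsecallback=android.view.ViewRootImpl$$ExternalSyntheticLambda17@5505a56",
    "2023-08-30 09:15:36.852 2761-3521 CoreBackPreview system_server D Window{94205d6 u0 com.example.executorchllamademo/com.example.executorchllamademo.MainActivity}: Setting back callback null",
    '2023-08-30 09:15:37.491 5338-5411 libc com.example.executorchllamademo W Access denied finding property "ro.hardware.chipname"',
]

for row in filter_logcat(sample, "com.example.executorchllamademo"):
    print(row)
```

On the three sample lines, only the libc entry survives. If nothing from the native runner shows up even after filtering, that suggests the native logs are not reaching Logcat at all, rather than being buried in the noise.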
Could you try the adb shell option as a sanity check before using the Android app?
I also wonder whether running the model on the host works.
I didn't try it on the host machine, but it runs well in adb shell (log from a Galaxy S24 posted below).
Any idea what code 20 stands for? It comes from the native model-loading code, but I can't find any reference to what this error means.
130|e3q:/data/local/tmp/llama $ ./llama_main -model_path llama.pte --tokenizer_path tokenizer.model --prompt \"Once upon a time\" --seq_len 120
I 00:00:00.004786 executorch:cpuinfo_utils.cpp:62] Reading file /sys/devices/soc0/image_version
I 00:00:00.005151 executorch:cpuinfo_utils.cpp:78] Failed to open midr file /sys/devices/soc0/image_version
I 00:00:00.005199 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu0/regs/identification/midr_el1
I 00:00:00.005312 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu1/regs/identification/midr_el1
I 00:00:00.005370 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu2/regs/identification/midr_el1
I 00:00:00.005412 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu3/regs/identification/midr_el1
I 00:00:00.005511 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu4/regs/identification/midr_el1
I 00:00:00.005548 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu5/regs/identification/midr_el1
I 00:00:00.005584 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu6/regs/identification/midr_el1
I 00:00:00.005639 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu7/regs/identification/midr_el1
I 00:00:00.005682 executorch:main.cpp:65] Resetting threadpool with num threads = 6
I 00:00:00.015067 executorch:runner.cpp:54] Creating LLaMa runner: model_path=llama.pte, tokenizer_path=tokenizer.model
I 00:00:05.098464 executorch:runner.cpp:69] Reading metadata from model
I 00:00:05.098553 executorch:runner.cpp:121] get_n_bos: 1
I 00:00:05.098565 executorch:runner.cpp:121] get_n_eos: 1
I 00:00:05.098568 executorch:runner.cpp:121] get_max_seq_len: 128
I 00:00:05.098577 executorch:runner.cpp:121] use_kv_cache: 1
I 00:00:05.098580 executorch:runner.cpp:121] use_sdpa_with_kv_cache: 1
I 00:00:05.098582 executorch:runner.cpp:121] append_eos_to_prompt: 0
I 00:00:05.158840 executorch:runner.cpp:121] get_vocab_size: 128256
I 00:00:05.158867 executorch:runner.cpp:121] get_bos_id: 128000
I 00:00:05.158876 executorch:runner.cpp:121] get_eos_id: 128001
"Once Upon a Time in Hollywood" is a 2019 American comedy-drama film written and directed by Quentin Tarantino. The film takes place in 1969 Los Angeles and follows the story of Rick Dalton, a faded television actor, and his stunt double and close friend, Cliff Booth, as they navigate the changing landscape of Hollywood and the counterculture movement.
The film features an all-star cast, including Leonardo DiCaprio as Rick Dalton, Brad Pitt as Cliff Booth, Margot Robbie as Sharon Tate, Emile Hirsch as Jay Sebring, and Al Pacino as
PyTorchObserver {"prompt_tokens":3,"generated_tokens":116,"model_load_start_ms":1719242523653,"model_load_end_ms":1719242528797,"inference_start_ms":1719242528797,"inference_end_ms":1719242540236,"prompt_eval_end_ms":1719242529160,"first_token_ms":1719242529265,"aggregate_sampling_time_ms":118,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
I 00:00:16.598320 executorch:runner.cpp:402] Prompt Tokens: 3 Generated Tokens: 116
I 00:00:16.598324 executorch:runner.cpp:408] Model Load Time: 5.144000 (seconds)
I 00:00:16.598338 executorch:runner.cpp:418] Total inference time: 11.439000 (seconds) Rate: 10.140747 (tokens/second)
I 00:00:16.598341 executorch:runner.cpp:426] Prompt evaluation: 0.363000 (seconds) Rate: 8.264463 (tokens/second)
I 00:00:16.598343 executorch:runner.cpp:437] Generated 116 tokens: 11.076000 (seconds) Rate: 10.473095 (tokens/second)
I 00:00:16.598345 executorch:runner.cpp:445] Time to first generated token: 0.468000 (seconds)
I 00:00:16.598347 executorch:runner.cpp:452] Sampling time over 119 tokens: 0.118000 (seconds)
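The timing lines in the log can be reproduced from the PyTorchObserver JSON alone, which helps when comparing runs. A minimal sketch, assuming the field names exactly as they appear in the log line above; the helper name and the returned keys are illustrative, not an ExecuTorch API:

```python
import json


def summarize_observer(line):
    """Derive load time and token rates from a PyTorchObserver log line."""
    payload = json.loads(line.split("PyTorchObserver", 1)[1])
    scale = payload["SCALING_FACTOR_UNITS_PER_SECOND"]  # 1000 means timestamps are in ms

    def seconds(start_key, end_key):
        return (payload[end_key] - payload[start_key]) / scale

    total = seconds("inference_start_ms", "inference_end_ms")
    prompt_eval = seconds("inference_start_ms", "prompt_eval_end_ms")
    generation = seconds("prompt_eval_end_ms", "inference_end_ms")
    return {
        "model_load_s": seconds("model_load_start_ms", "model_load_end_ms"),
        "total_inference_s": total,
        "total_tokens_per_s": payload["generated_tokens"] / total,
        "prompt_tokens_per_s": payload["prompt_tokens"] / prompt_eval,
        "generation_tokens_per_s": payload["generated_tokens"] / generation,
        "time_to_first_token_s": seconds("inference_start_ms", "first_token_ms"),
    }


observer_line = (
    'PyTorchObserver {"prompt_tokens":3,"generated_tokens":116,'
    '"model_load_start_ms":1719242523653,"model_load_end_ms":1719242528797,'
    '"inference_start_ms":1719242528797,"inference_end_ms":1719242540236,'
    '"prompt_eval_end_ms":1719242529160,"first_token_ms":1719242529265,'
    '"aggregate_sampling_time_ms":118,"SCALING_FACTOR_UNITS_PER_SECOND":1000}'
)

for key, value in summarize_observer(observer_line).items():
    print(f"{key}: {value:.6f}")
```

For this run the derived values agree with the runner's own summary: 5.144 s model load, 11.439 s total inference at about 10.14 tokens/second, and 0.468 s to the first generated token.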
Thanks for checking it out. The error code definitions can be found here. It sounds like there is an issue on the Android side.
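For readers without the link handy, the lookup can be sketched as below. Two assumptions to flag: the table is a partial copy of the Error enum as I recall it from ExecuTorch's runtime/core/error.h, so please verify it against your own checkout, and the app is assumed to print the code in decimal. The helper name is hypothetical.

```python
# Partial table, ASSUMED from runtime/core/error.h; values may differ
# between ExecuTorch versions, so check the header in your checkout.
ERROR_NAMES = {
    0x00: "Ok",
    0x01: "Internal",
    0x02: "InvalidState",
    0x10: "NotSupported",
    0x11: "NotImplemented",
    0x12: "InvalidArgument",
    0x13: "InvalidType",
    0x14: "OperatorMissing",
    0x20: "NotFound",
    0x21: "MemoryAllocationFailed",
    0x22: "AccessFailed",
    0x23: "InvalidProgram",
}


def describe_error(code):
    """Render a decimal error code as hex plus its (assumed) enum name."""
    name = ERROR_NAMES.get(code, "unknown (check error.h)")
    return f"{code} = 0x{code:02X} -> {name}"


print(describe_error(20))
```

Under those assumptions, decimal 20 is 0x14, which would be OperatorMissing: the loaded library lacks an operator that the .pte file needs. If the code were instead printed in hex, 0x20 would map to NotFound, so it is worth confirming how the app formats the value.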
I fixed the problem by not using the prebuilt AAR library (downloaded with download_prebuilt_lib.sh) and instead compiling the native code with
./gradlew :app:setup.
I am not sure what is incompatible with the prebuilt library. Maybe it's because my host machine runs Linux, and the setup instructions mention that Mac is preferred.
I was able to quantize and build Llama 3 for Android. The model works inside the shell using the generated _llama3_kv_sdpa_xnn_qe_432.pte and the original tokenizer.model.
However, when I try to use the sample LlamaDemo app, I get "Load failed:20" after pointing it to the same files that worked inside the shell.
I used the script to download the prebuilt AAR (download_prebuilt_lib.sh).
Any ideas?