taeyeonlee closed this issue 3 months ago.
The conversation will continue on GitHub at https://github.com/quic/ai-hub-models/issues/85. We kindly ask that you file a single issue across Slack and the AI Hub Apps and AI Hub Models GitHub repos, since all of them are maintained by the AI Hub team. We will do our best to get to your questions as quickly as possible.
Hi, creating a context from binary fails when the 4 QNN context binary files (llama_v2_7b_chat_quantized_TokenGenerator_1_Quantized.bin, llama_v2_7b_chat_quantized_TokenGenerator_2_Quantized.bin, llama_v2_7b_chat_quantized_TokenGenerator_3_Quantized.bin, llama_v2_7b_chat_quantized_TokenGenerator_4_Quantized.bin) are all loaded together on an S24 Ultra Android phone. It does succeed when the files are handled one by one: create the context from binary for llama_v2_7b_chat_quantized_TokenGenerator_1_Quantized.bin, run executeGraphs on it, call freeContext, then do the same for llama_v2_7b_chat_quantized_TokenGenerator_2_Quantized.bin, and so on through llama_v2_7b_chat_quantized_TokenGenerator_4_Quantized.bin.
How can I create contexts from binary for all 4 QNN context binary files (llama_v2_7b_chat_quantized_TokenGenerator_1_Quantized.bin, llama_v2_7b_chat_quantized_TokenGenerator_2_Quantized.bin, llama_v2_7b_chat_quantized_TokenGenerator_3_Quantized.bin, llama_v2_7b_chat_quantized_TokenGenerator_4_Quantized.bin) so that they are loaded at the same time?
The failure log follows.
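For reference, the one-by-one sequence that does work can be sketched as below. This is only a model of the control flow, not QNN code: `run_one_at_a_time`, `FakeRuntime`, and the three callables are hypothetical stand-ins for whatever the app uses to create a context from binary, execute its graphs, and free the context. The point it illustrates is that at most one context is alive at any moment.

```python
from typing import Any, Callable, List

# File names taken from the issue text.
PARTS = [
    f"llama_v2_7b_chat_quantized_TokenGenerator_{i}_Quantized.bin"
    for i in range(1, 5)
]


def run_one_at_a_time(
    paths: List[str],
    create_context: Callable[[str], Any],   # hypothetical: create context from binary
    execute_graphs: Callable[[Any], Any],   # hypothetical: run the graphs in a context
    free_context: Callable[[Any], None],    # hypothetical: release the context
) -> List[Any]:
    """Create, execute, and free each context before touching the next file."""
    results = []
    for path in paths:
        ctx = create_context(path)
        try:
            results.append(execute_graphs(ctx))
        finally:
            # Always release the context, even if execution raised.
            free_context(ctx)
    return results


class FakeRuntime:
    """Stand-in runtime that only counts how many contexts are alive."""

    def __init__(self) -> None:
        self.live = 0
        self.peak = 0

    def create_context(self, path: str) -> str:
        self.live += 1
        self.peak = max(self.peak, self.live)
        return path

    def execute_graphs(self, ctx: str) -> str:
        return f"ran {ctx}"

    def free_context(self, ctx: str) -> None:
        self.live -= 1


if __name__ == "__main__":
    rt = FakeRuntime()
    run_one_at_a_time(PARTS, rt.create_context, rt.execute_graphs, rt.free_context)
    print("peak live contexts:", rt.peak)
```

The failing case differs only in that `free_context` is not called between files, so all four contexts would have to stay alive together.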
2024-08-07 17:34:36.233 9745-9772 LlamaNative com.test.llama I [taeyeon] QNN System function pointers are populated
2024-08-07 17:34:36.233 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to create system handle. m_qnnFunctionPointers.qnnSystemInterface.systemContextCreate
2024-08-07 17:34:36.233 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to allocate memory. bufferSize = 1083146928
2024-08-07 17:34:37.441 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to read binary data = /data/local/tmp/sample_app/llama_v2_7b_chat_quantized_TokenGenerator_1_Quantized.bin
2024-08-07 17:34:37.441 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to get context binary info. binaryInfoSize = 0
2024-08-07 17:34:37.441 9745-9772 LlamaNative com.test.llama I Extracting graphsInfo for graph Idx: 0
2024-08-07 17:34:37.441 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 516
2024-08-07 17:34:37.442 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 513
2024-08-07 17:34:37.442 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to copy metadata. m_graphsCount[4] = 1
2024-08-07 17:34:37.446 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 1, latency 100 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:37.447 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:1872: manage_poll_qos: poll mode updated to 3 for domain 3, handle 0x7cfa118450 for timeout 9999
2024-08-07 17:34:37.447 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 3, latency 9999 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:37.757 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to create context from binary.
2024-08-07 17:34:37.757 9745-9772 LlamaNative com.test.llama I [taeyeon] graphRetrieve graphName: tmp0b0a0nm3
2024-08-07 17:34:37.757 9745-9772 LlamaNative com.test.llama I [taeyeon] graphRetrieve graphName: tmp0b0a0nm3
2024-08-07 17:34:37.832 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to allocate memory. bufferSize = 821006960
2024-08-07 17:34:38.711 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to read binary data = /data/local/tmp/sample_app/llama_v2_7b_chat_quantized_TokenGenerator_2_Quantized.bin
2024-08-07 17:34:38.711 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to get context binary info. binaryInfoSize = 0
2024-08-07 17:34:38.711 9745-9772 LlamaNative com.test.llama I Extracting graphsInfo for graph Idx: 0
2024-08-07 17:34:38.711 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 516
2024-08-07 17:34:38.711 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 513
2024-08-07 17:34:38.712 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to copy metadata. m_graphsCount[5] = 1
2024-08-07 17:34:38.716 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 1, latency 100 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:38.716 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:1872: manage_poll_qos: poll mode updated to 3 for domain 3, handle 0x7cfa118450 for timeout 9999
2024-08-07 17:34:38.716 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 3, latency 9999 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:39.218 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to create context from binary.
2024-08-07 17:34:39.218 9745-9772 LlamaNative com.test.llama I [taeyeon] graphRetrieve graphName: tmp5a2tztgk
2024-08-07 17:34:39.218 9745-9772 LlamaNative com.test.llama I [taeyeon] graphRetrieve graphName: tmp5a2tztgk
2024-08-07 17:34:39.278 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to allocate memory. bufferSize = 821002864
2024-08-07 17:34:40.172 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to read binary data = /data/local/tmp/sample_app/llama_v2_7b_chat_quantized_TokenGenerator_3_Quantized.bin
2024-08-07 17:34:40.172 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to get context binary info. binaryInfoSize = 0
2024-08-07 17:34:40.172 9745-9772 LlamaNative com.test.llama I Extracting graphsInfo for graph Idx: 0
2024-08-07 17:34:40.173 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 516
2024-08-07 17:34:40.173 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 513
2024-08-07 17:34:40.173 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to copy metadata. m_graphsCount[6] = 1
2024-08-07 17:34:40.177 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 1, latency 100 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:40.178 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:1872: manage_poll_qos: poll mode updated to 3 for domain 3, handle 0x7cfa118450 for timeout 9999
2024-08-07 17:34:40.178 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 3, latency 9999 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:40.722 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to create context from binary.
2024-08-07 17:34:40.722 9745-9772 LlamaNative com.test.llama I [taeyeon] graphRetrieve graphName: tmpxdxtl6kr
2024-08-07 17:34:40.722 9745-9772 LlamaNative com.test.llama I [taeyeon] graphRetrieve graphName: tmpxdxtl6kr
2024-08-07 17:34:40.783 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to allocate memory. bufferSize = 952636560
2024-08-07 17:34:41.809 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to read binary data = /data/local/tmp/sample_app/llama_v2_7b_chat_quantized_TokenGenerator_4_Quantized.bin
2024-08-07 17:34:41.810 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to get context binary info. binaryInfoSize = 0
2024-08-07 17:34:41.810 9745-9772 LlamaNative com.test.llama I Extracting graphsInfo for graph Idx: 0
2024-08-07 17:34:41.810 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 516
2024-08-07 17:34:41.810 9745-9772 LlamaNative com.test.llama I Extracting tensorInfo for tensor tensorsCount : 513
2024-08-07 17:34:41.810 9745-9772 LlamaNative com.test.llama I [taeyeon] succeed to copy metadata. m_graphsCount[7] = 1
2024-08-07 17:34:41.814 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 1, latency 100 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:41.815 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:1872: manage_poll_qos: poll mode updated to 3 for domain 3, handle 0x7cfa118450 for timeout 9999
2024-08-07 17:34:41.815 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2331: remote_handle_control_domain: requested QOS 3, latency 9999 for domain 3 handle 0x7cfa118450
2024-08-07 17:34:42.300 9745-9772 com.test.llama com.test.llama E vendor/qcom/proprietary/adsprpc/src/fastrpc_mem.c:511: Error 0x1: fastrpc_mmap failed to map buffer fd 178, addr 0x79851b2000, length 0x38a00000, domain 3, flags 0x3, ioctl ret 0xffffffff, errno Bad address
2024-08-07 17:34:42.304 9745-9772 com.test.llama com.test.llama I vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:3407: open_device_node: no access to default device of domain 3, open thru HAL, (sess_id 2)
2024-08-07 17:34:42.305 9745-9772 dsp-client com.test.llama E DspClient.cpp (127): Error: open_hal_session: invalid argument(s): client instance 0x7ea9f89530, domain 11
2024-08-07 17:34:42.305 9745-9772 com.test.llama com.test.llama E vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:3426: Error 0x0: open_device_node failed for domain ID 11, sess ID 2 (errno 13, Permission denied)
2024-08-07 17:34:42.305 9745-9772 com.test.llama com.test.llama E vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2088::Error: 0x200: 0 <= (dev = open_device_node((int)domain))
2024-08-07 17:34:42.305 9745-9772 com.test.llama com.test.llama W vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2145:Warning 0x200: remote_get_info failed to get attribute 1 for domain 11 (errno Permission denied)
2024-08-07 17:34:42.305 9745-9772 com.test.llama com.test.llama E vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2371: Error 0x200: remote_handle_control_domain failed for request ID 2 on domain 3 (errno Permission denied)
2024-08-07 17:34:42.305 9745-9772 com.test.llama com.test.llama E vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:2382: Error 0x200: remote_handle_control failed for request ID 2 (errno Permission denied)
2024-08-07 17:34:42.308 9745-9772 LlamaNative com.test.llama E Could not create context from binary.
2024-08-07 17:34:42.308 9745-9772 LlamaNative com.test.llama E Cleaning up graph Info structures.
2024-08-07 17:34:42.382 9745-9772 LlamaNative com.test.llama E ERROR Create From Binary failure
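To put the numbers in the log above side by side, the short script below sums the four `bufferSize` values and decodes the `length 0x38a00000` from the failing `fastrpc_mmap` line. The values are copied from the log; the variable names are illustrative only. It is simple arithmetic and does not by itself establish the cause of the failure. Whether the first three contexts actually remain mapped when the fourth is created is an assumption based on the description in this issue.

```python
# Buffer sizes in bytes, copied from the "succeed to allocate memory" log lines.
buffer_sizes = {
    "TokenGenerator_1": 1083146928,
    "TokenGenerator_2": 821006960,
    "TokenGenerator_3": 821002864,
    "TokenGenerator_4": 952636560,
}

# Length reported by the failing fastrpc_mmap call.
failed_map_length = 0x38A00000

MIB = 1024 ** 2
GIB = 1024 ** 3

total_bytes = sum(buffer_sizes.values())

# Bytes already loaded for parts 1-3 when part 4 is attempted
# (assumes the earlier contexts have not been freed).
loaded_before_failure = total_bytes - buffer_sizes["TokenGenerator_4"]

if __name__ == "__main__":
    for name, size in buffer_sizes.items():
        print(f"{name}: {size / MIB:.1f} MiB")
    print(f"all four parts: {total_bytes / GIB:.2f} GiB")
    print(f"parts 1-3 only: {loaded_before_failure / GIB:.2f} GiB")
    print(f"failed mapping: {failed_map_length // MIB} MiB")
```

The failed mapping is 906 MiB, which is in the same range as the fourth binary, so the failure occurs while mapping a buffer of roughly that file's size after the first three contexts were created successfully.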
Best regards,