Open sparkleholic opened 3 days ago
Hi @sparkleholic, thanks for filing an issue! We have received a few inquiries about llama_v2 on the QCS8550 device. We're working on a fix and will share it as soon as it's available. I'd encourage you to join our Slack Community to hear when it has been released!
I've tried many times to build the HTP model for Llama_v2_7b_chat_quantized. The llama_v2_7b_chat_quantized_TokenGenerator steps succeeded, but all llama_v2_7b_chat_quantized_PromptProcessor steps failed. The part I'm unsure about is the following snippet of the compile log from the AI Hub site.
My local machine has enough memory to run this and none of the limitations described in the log above. I wonder whether this failure comes from the AI Hub cloud resources or not.
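To check whether the limit is on the cloud side, I can look up the failed PromptProcessor compile job with the qai_hub Python client. This is only a minimal sketch: the job ID below is a placeholder, and the exact get_job()/get_status() calls are assumed to be available in the installed client version.

```python
# Sketch: inspect the failed compile job on AI Hub.
# Assumptions: qai_hub client is installed and configured,
# and "jXXXXXXXX" is a placeholder for the real job ID.
import qai_hub as hub

job = hub.get_job("jXXXXXXXX")   # placeholder job ID (hypothetical)
print(job.url)                   # link back to the job page on the AI Hub site
print(job.get_status())          # failure reason as reported by the cloud backend
```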
To Reproduce
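Roughly, the compile jobs were submitted with the standard qai-hub-models export entry point. The sketch below is an approximation: the module path follows the usual qai_hub_models.models.<model>.export pattern, and the device name string is an assumption that may differ in your setup.

```python
# Sketch: submit the llama_v2_7b_chat_quantized export/compile jobs.
# The --device value "QCS8550 (Proxy)" is an assumed placeholder.
import subprocess

subprocess.run(
    [
        "python", "-m",
        "qai_hub_models.models.llama_v2_7b_chat_quantized.export",
        "--device", "QCS8550 (Proxy)",
    ],
    check=True,
)
```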
Expected behavior
Success
Stack trace
If applicable, add screenshots to help explain your problem.
Host configuration: