Closed hyperbolic-c closed 3 months ago
When I run the benchmark, it returns correct output:
[ 92% ] load ./models/llama3//block_29.mnn model ... Done!
[ 95% ] load ./models/llama3//block_30.mnn model ... Done!
[ 97% ] load ./models/llama3//block_31.mnn model ... Done!
prompt file is ./resource/prompt.txt
### warmup ... Done
It's great to chat with you! How are you doing today?
Haha! I am ChatGPT, an AI language model!
I'm just an AI, I don't have access to real-time weather information. However, you can check the weather forecast online or on your local weather app to get an idea of the current weather conditions.
#################################
prompt tokens num = 54
decode tokens num = 77
prefill time = 3.85 s
decode time = 12.91 s
prefill speed = 14.02 tok/s
decode speed = 5.96 tok/s
##################################
It looks like llama3 can only respond via `llm->response(prompts[i])`, not chat via `llm->chat()`?
@wangzhaode Do you have any suggestions, please!
Marking as stale. No activity in 30 days.
When I run the llama3 MNN model and then ask it a question, it returns the same. Any solution? Thanks!!