Closed: amostamu closed this issue 3 months ago
@amostamu Wow! Thanks for sharing. That looks like it is on the main branch, correct?

Try switching to the sequoia branch and running this:

swift run -c release LLMCLI --repo-id smpanaro/Llama-2-7b-coreml --repo-directory sequoia --max-new-tokens 80

Hopefully it’s even faster. 🤞
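For anyone following along later, a minimal sketch of the full sequence, assuming you already have a local clone of the CLI repo and that the code branch is simply named sequoia:

git fetch origin
git switch sequoia
swift run -c release LLMCLI --repo-id smpanaro/Llama-2-7b-coreml --repo-directory sequoia --max-new-tokens 80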
Yes, it is even faster when using the sequoia branch. These are the numbers from the second run:

Compile + Load: 0.51 sec
Prompt: 726.74 ms, 704.52 token / sec
Generate: 85.41 +/- 2.67 ms / token, 11.72 +/- 0.29 token / sec
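As a quick sanity check (these are derived figures, not output from the CLI): the generate latency and throughput are roughly reciprocals of each other, and the prompt time and throughput multiply out to what looks like a prompt of roughly 512 tokens.

echo "scale=2; 1000 / 85.41" | bc             # ~11.70 token / sec, close to the reported 11.72
echo "scale=2; 704.52 * 726.74 / 1000" | bc   # ~512 prompt tokens (an inference, not reported above)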
Cool, thanks for trying. I expected/hoped it would be a bit higher but I'll take it.
Added this to the README. Thanks again for the info!
The M3 Pro gets a big performance boost on macOS 15.