Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

(For "history": bench with llamafile V0.8.6) #569

Open invisiblepancake opened 1 month ago

Some results from my Zen 3 machine with 128 GB of RAM, running Linux (fc40)
RAM: DDR4@3600

./llamafile-bench-0.8.6 -p "256,512,1024" -m "Mistral-7b-instruct-v0.2.Q6_K.llamafile,Mistral-7b-instruct-v0.2.Q8_0.llamafile,Mistral-7b-instruct-v0.2.F16.llamafile,Mistral-7b-instruct-v0.2.BF16.llamafile,Mixtral-8x7b-instruct-v0.1.Q5_K_M.llamafile,mixtral-8x7b-instruct-v0.1.Q6_K.llamafile,Mixtral-8x7b-instruct-v0.1.BF16.llamafile,Mixtral-8x22B-Instruct-v0.1.Q5_K_M.llamafile,Mixtral-8x22B-Instruct-v0.1.Q6_K.llamafile"
| cpu_info | model_filename | size | test | t/s |
| --- | --- | ---: | ---: | ---: |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q6_K | 107.61 GiB | pp256 | 19.94 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q6_K | 107.61 GiB | pp512 | 19.66 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q6_K | 107.61 GiB | pp1024 | 19.37 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q6_K | 107.61 GiB | tg16 | 1.57 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q5_K_M | 93.11 GiB | pp256 | 18.76 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q5_K_M | 93.11 GiB | pp512 | 18.57 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q5_K_M | 93.11 GiB | pp1024 | 18.27 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x22B-Instruct-v0.1.Q5_K_M | 93.11 GiB | tg16 | 1.82 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.BF16 | 86.99 GiB | pp256 | 29.11 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.BF16 | 86.99 GiB | pp512 | 29.97 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.BF16 | 86.99 GiB | pp1024 | 29.73 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.BF16 | 86.99 GiB | tg16 | 1.98 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q6_K | 35.74 GiB | pp256 | 60.74 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q6_K | 35.74 GiB | pp512 | 60.11 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q6_K | 35.74 GiB | pp1024 | 58.90 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q6_K | 35.74 GiB | tg16 | 4.76 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.Q5_K_M | 30.95 GiB | pp256 | 58.60 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.Q5_K_M | 30.95 GiB | pp512 | 56.64 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.Q5_K_M | 30.95 GiB | pp1024 | 55.94 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mixtral-8x7b-instruct-v0.1.Q5_K_M | 30.95 GiB | tg16 | 5.42 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q4_K_M | 26.49 GiB | pp256 | 59.24 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q4_K_M | 26.49 GiB | pp512 | 58.34 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q4_K_M | 26.49 GiB | pp1024 | 56.91 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | mixtral-8x7b-instruct-v0.1.Q4_K_M | 26.49 GiB | tg16 | 6.23 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.BF16 | 13.49 GiB | pp256 | 52.92 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.BF16 | 13.49 GiB | pp512 | 51.14 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.BF16 | 13.49 GiB | pp1024 | 50.48 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.BF16 | 13.49 GiB | tg16 | 3.53 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.F16 | 13.49 GiB | pp256 | 56.89 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.F16 | 13.49 GiB | pp512 | 56.87 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.F16 | 13.49 GiB | pp1024 | 56.10 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.F16 | 13.49 GiB | tg16 | 3.52 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q8_0 | 7.17 GiB | pp256 | 72.17 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q8_0 | 7.17 GiB | pp512 | 70.61 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q8_0 | 7.17 GiB | pp1024 | 69.41 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q8_0 | 7.17 GiB | tg16 | 6.57 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q6_K | 5.53 GiB | pp256 | 113.94 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q6_K | 5.53 GiB | pp512 | 109.37 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q6_K | 5.53 GiB | pp1024 | 106.53 |
| AMD Ryzen 9 5950X 16-Core Processor (znver3) | Mistral-7b-instruct-v0.2.Q6_K | 5.53 GiB | tg16 | 8.50 |

As you can see, matmul is memory-limited on this CPU (DDR4 + Zen 3).
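
One rough way to check this against the numbers: for a dense model, generating one token reads approximately all of the weights once, so model size times tg16 t/s approximates the memory bandwidth actually achieved. The sketch below does that arithmetic for the four Mistral-7b rows only. The Mixtral rows are left out because a mixture-of-experts model reads only some of its experts per token, so the same estimate would overstate the traffic. The peak figure assumes dual-channel DDR4-3600 with 8 bytes per transfer per channel; the channel count is an assumption, not something stated in the post, and the helper name is made up for illustration.

```python
# tg16 rows for the dense Mistral-7b models, copied from the results in this post.
# Each entry: (model_filename, size in GiB, tokens per second).
TG16 = [
    ("Mistral-7b-instruct-v0.2.Q6_K", 5.53, 8.50),
    ("Mistral-7b-instruct-v0.2.Q8_0", 7.17, 6.57),
    ("Mistral-7b-instruct-v0.2.F16", 13.49, 3.52),
    ("Mistral-7b-instruct-v0.2.BF16", 13.49, 3.53),
]

GIB = 2**30  # bytes per GiB

# Assumption: dual-channel DDR4-3600, 8 bytes per transfer per channel.
PEAK_GB_S = 3600e6 * 8 * 2 / 1e9


def effective_bandwidth_gib_s(size_gib, tokens_per_s):
    """Estimate bandwidth as (weights read per token) * (tokens per second)."""
    return size_gib * tokens_per_s


if __name__ == "__main__":
    for name, size_gib, tps in TG16:
        bw_gib = effective_bandwidth_gib_s(size_gib, tps)
        bw_gb = bw_gib * GIB / 1e9  # convert GiB/s to GB/s for comparison with the peak
        print(f"{name}: {bw_gib:.1f} GiB/s ({bw_gb:.1f} GB/s, "
              f"{100 * bw_gb / PEAK_GB_S:.0f}% of assumed peak)")
```

All four rows come out at roughly 47 GiB/s (about 50 GB/s) regardless of quantization, which is what you would expect if token generation is bound by how fast the weights can be streamed from RAM rather than by compute.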

Originally posted by @Djip007 in https://github.com/Mozilla-Ocho/llamafile/discussions/450