Open songkq opened 1 year ago
@simonJJJ Hi, could you please give some advice for this issue? Qwen-7B-Q4_0 works well on Mac M1, but Qwen-7B-Q8_0 cannot.
cmake -B build -DGGML_METAL=ON && cmake --build build -j

./main -m ../../ggml_bins/qwen7b-chat-8k-ggml-q4_0.bin --tiktoken ../../assets/qwen.tiktoken -v -p 介绍下三国演义
system info: | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
inference config: | max_length = 2048 | max_context_length = 512 | top_k = 0 | top_p = 0.5 | temperature = 0.95 | num_threads = 0 |
loaded qwen model from ../../ggml_bins/qwen7b-chat-8k-ggml-q4_0.bin within: 88.669 ms
《三国演义》是中国古代四大名著之一,由罗贯中创作。它讲述了从东汉末年到西晋初年之间,中国历史上著名的三国时期的故事。三国时期是中国历史上一个非常重要的时期,它涉及到政治、军事、文化、经济等多个方面,也出现了许多著名的英雄人物,如曹操、刘备、孙权等。《三国演义》以三国时期的历史事件为基础,通过一系列精彩的故事,描述了当时的政治、军事、文化、经济等方面的情况,也展示了当时人们的思想、情感和行为。
prompt time: 5496.2 ms / 24 tokens (229.008 ms/token)
output time: 3756.11 ms / 117 tokens (32.103 ms/token)
total time: 9252.31 ms

./main -m ../../ggml_bins/qwen7b-chat-8k-ggml-q8_0.bin --tiktoken ../../assets/qwen.tiktoken -v -p 介绍下三国演义
system info: | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
inference config: | max_length = 2048 | max_context_length = 512 | top_k = 0 | top_p = 0.5 | temperature = 0.95 | num_threads = 0 |
loaded qwen model from ../../ggml_bins/qwen7b-chat-8k-ggml-q8_0.bin within: 87.001 ms
GGML_ASSERT: /workspace/qwen.cpp/third_party/ggml/src/ggml-metal.m:1453: false
[1] 12416 abort ./main -m ../../ggml_bins/qwen7b-chat-8k-ggml-q8_0.bin --tiktoken -v -p
Hi @songkq, maybe you could try my PR #41, which includes some experimental data. The assertion `/workspace/qwen.cpp/third_party/ggml/src/ggml-metal.m:1453: false` was probably triggered by an OOM (out of memory).
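To make the OOM guess above more concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone need in each format. It assumes ggml's usual block layouts (32 weights per block, with one fp16 scale per block) and a hypothetical round figure of 7e9 parameters. The real Qwen-7B parameter count, any tensors kept in higher precision, the KV cache, and scratch buffers are not accounted for, so treat the numbers as a lower bound rather than a measurement.

```python
# Rough estimate of weight memory for ggml block-quantized formats.
# Assumption: every weight is stored in the given format (real model
# files usually keep a few tensors in f16/f32).

BLOCK_SIZE = 32  # weights per quantization block

# Bytes per block: one fp16 scale (2 bytes) plus the packed weights.
BLOCK_BYTES = {
    "q4_0": 2 + BLOCK_SIZE // 2,  # 32 x 4-bit values = 16 bytes
    "q8_0": 2 + BLOCK_SIZE,       # 32 x 8-bit values = 32 bytes
}


def estimate_weight_gib(n_params: float, qtype: str) -> float:
    """Return the approximate weight size in GiB for n_params weights."""
    n_blocks = n_params / BLOCK_SIZE
    return n_blocks * BLOCK_BYTES[qtype] / 2**30


if __name__ == "__main__":
    n_params = 7e9  # hypothetical round figure, not the exact model size
    for qtype in BLOCK_BYTES:
        gib = estimate_weight_gib(n_params, qtype)
        print(f"{qtype}: ~{gib:.2f} GiB of weights")
```

Under these assumptions Q8_0 needs roughly 1.9 times the memory of Q4_0 (about 6.9 GiB versus 3.7 GiB), which would be consistent with Q4_0 fitting on a memory-constrained M1 while Q8_0 does not. The thread does not state how much memory the machine has, so this supports the OOM guess without confirming it.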