Closed zArche closed 3 weeks ago
First confirm that ChatGLM3-6B runs. If it does, the failure is most likely due to insufficient GPU memory; try limiting max_length:
>>> import chatglm_cpp
>>> pipeline = chatglm_cpp.Pipeline("../models/chatglm4-ggml.bin", max_length=2048)
>>> pipeline.chat([chatglm_cpp.ChatMessage(role="user", content="你好")])
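Why limiting max_length helps: the KV cache grows linearly with the context length, so a smaller max_length directly caps that allocation. A rough sketch of the estimate, assuming ChatGLM3-6B-like hyperparameters (28 layers, multi-query attention with 2 KV heads, head dim 128, fp16); the function name and defaults here are illustrative, not part of the chatglm-cpp API:

```python
def kv_cache_bytes(max_length, num_layers=28, num_kv_heads=2,
                   head_dim=128, dtype_bytes=2):
    """Estimate KV-cache size: one K and one V tensor per layer,
    each of shape [max_length, num_kv_heads, head_dim]."""
    return 2 * num_layers * max_length * num_kv_heads * head_dim * dtype_bytes

for n in (512, 2048, 8192):
    print(f"max_length={n}: {kv_cache_bytes(n) / 2**20:.1f} MiB")
```

The estimate covers only the KV cache; model weights and scratch buffers come on top of it, which is why a long context can push a 6B model over the limit on a shared-memory Mac.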
On a Mac M2, I upgraded the Python library with CMAKE_ARGS="-DGGML_METAL=ON" pip install -U chatglm-cpp and ran GLM4.
It fails with: GGML_ASSERT: /private/var/folders/hp/n4snp8jx0vs0dmq9165t74xr0000gn/T/pip-install-rrd53153/chatglm-cpp_777c14ecc59a4daba6aa92a07740ca19/third_party/ggml/src/ggml-metal.m:1453: false
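Since the assertion fires inside ggml-metal.m, one way to isolate whether the Metal backend itself is the problem is to rebuild the wheel with Metal disabled and retry on CPU. This is a diagnostic sketch, assuming the standard GGML_METAL CMake option is the only flag involved:

```shell
# Rebuild chatglm-cpp without the Metal backend (CPU-only fallback).
# --force-reinstall / --no-cache-dir make sure pip actually recompiles
# instead of reusing a previously built wheel.
CMAKE_ARGS="-DGGML_METAL=OFF" pip install -U --force-reinstall --no-cache-dir chatglm-cpp
```

If the same prompt works on CPU, the crash is specific to the Metal kernels (e.g. an op the Metal backend does not implement for this model) rather than to the model file itself.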