THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
Apache License 2.0

Garbled output when running GLM-4-9b with Ollama #333

Closed: lalahaohaizi closed this issue 18 hours ago

lalahaohaizi commented 1 month ago

System Info / 系統信息

Gigabyte RTX 4060 Ti, Windows 11, Ollama 0.2.3

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

ollama run glm4:9b, then prompt it to extract the dietary-supplement and tonic usage from the text after ####, and to output the result in the following format: [{"保健品服用情况":"","补品服用情况":""}]. If nothing can be extracted, output ["抽取内容":"null"].

Input text: "Same as before.. Father 181, mother 154. Same as before." (screenshot of the garbled output attached)

Expected behavior / 期待表现

Generation should stop once the output is complete, instead of continuing to emit garbled characters.

lalahaohaizi commented 1 month ago

Is this related to the Modelfile parameters? (screenshot of the Modelfile attached)

caoyc commented 1 month ago

Windows 11, Ollama 0.2.5

Same problem as above. (screenshot attached)

zRzRzRzRzRzRzR commented 1 month ago

0.2.3 seems fine, but 0.2.5 has this bug. A colleague of ours is reproducing it.

Iteachyou233 commented 1 month ago

This is an Ollama issue; try switching to 0.1.42.

liuchuan01 commented 1 month ago

Ollama 0.2.7, Ubuntu 22, 2x RTX 3090 (screenshot attached). The same problem also shows up when calling through the API.
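The request was roughly of this form (a minimal sketch assuming the default endpoint at 127.0.0.1:11434; the prompt is shortened here):

  curl http://127.0.0.1:11434/api/generate -d '{"model": "glm4:9b", "prompt": "...", "stream": false}'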

zRzRzRzRzRzRzR commented 1 month ago

You need to downgrade to 0.2.3.

liuchuan01 commented 1 month ago

You need to downgrade to 0.2.3.

OK, thanks!

zRzRzRzRzRzRzR commented 1 month ago

We are still looking into the latest versions. The function-calling (FC) issue has also been mentioned, but Ollama replied that the FC issue is hard to fix on their side.

781087595 commented 1 month ago

With glm4 I often get output like GGGGGGGGGGG.

lalahaohaizi commented 1 month ago

After upgrading to Ollama 0.3.0, generation does stop, but the output ends with GGGGG.

Jesean commented 1 month ago

Ollama version 0.2.8 returns GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG.

zRzRzRzRzRzRzR commented 1 month ago

It seems every version after 0.2.3 has this; you have to use 0.2.3.

wukaikailive commented 1 month ago

I've hit all of these problems; basically a restart makes it work again for a while.

leizhu1989 commented 4 weeks ago

It seems every version after 0.2.3 has this; you have to use 0.2.3.

I pulled the 0.2.3 Ollama image and I still hit this problem.

lalahaohaizi commented 3 weeks ago

Is there a solution for this yet? Version 0.2.3 also outputs G for me, and Ollama has shipped several new versions without mentioning this issue. Any help appreciated.

buhiui commented 1 week ago

Has this been solved yet?

zRzRzRzRzRzRzR commented 1 week ago

If you are running on a GPU, enable flash-attn first.

leizhu1989 commented 1 week ago

If you are running on a GPU, enable flash-attn first.

Is that enabled inside the Ollama Docker image?

lyh007 commented 1 week ago

C:\Users\80260\AppData\Local\Programs\Ollama>ollama start --help
Start ollama

Usage:
  ollama serve [flags]

Aliases:
  serve, start

Flags:
  -h, --help   help for serve

Environment Variables:
  OLLAMA_DEBUG               Show additional debug information (e.g. OLLAMA_DEBUG=1)
  OLLAMA_HOST                IP Address for the ollama server (default 127.0.0.1:11434)
  OLLAMA_KEEP_ALIVE          The duration that models stay loaded in memory (default "5m")
  OLLAMA_MAX_LOADED_MODELS   Maximum number of loaded models per GPU
  OLLAMA_MAX_QUEUE           Maximum number of queued requests
  OLLAMA_MODELS              The path to the models directory
  OLLAMA_NUM_PARALLEL        Maximum number of parallel requests
  OLLAMA_NOPRUNE             Do not prune model blobs on startup
  OLLAMA_ORIGINS             A comma separated list of allowed origins
  OLLAMA_SCHED_SPREAD        Always schedule model across all GPUs
  OLLAMA_TMPDIR              Location for temporary files
  OLLAMA_FLASH_ATTENTION     Enabled flash attention
  OLLAMA_LLM_LIBRARY         Set LLM library to bypass autodetection

@leizhu1989 The environment variables are loaded when the server is started with the ollama start command. I ran into the same problem too; I have not yet tested whether this works on the server.
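In other words, set OLLAMA_FLASH_ATTENTION before starting the server. A minimal sketch for Windows (cmd) and Linux; the value 1 is assumed by analogy with the OLLAMA_DEBUG=1 example above:

  set OLLAMA_FLASH_ATTENTION=1
  ollama serve

  # Linux / macOS
  OLLAMA_FLASH_ATTENTION=1 ollama serve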

leizhu1989 commented 1 week ago

@lyh007 Thanks, that does seem to fix it; my tests are normal so far.
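For anyone else running the Docker image, passing the variable into the container should look roughly like this (a minimal sketch assuming the official ollama/ollama image; the volume, port, and container name are the usual defaults):

  docker run -d --gpus=all -e OLLAMA_FLASH_ATTENTION=1 -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama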