To the China Unicom team: could you upload the model to the Ollama model library, or share an Ollama Modelfile?

zhqfdn commented 5 months ago

xx025 commented 5 months ago

I wrote a Modelfile, and the model does run:

FROM converted.bin
#TEMPLATE "[INST] {{ .Prompt }} [/INST]"
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER num_keep 24
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
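
For reference, the prompt text that this TEMPLATE produces for a single turn can be approximated by hand. The Python sketch below is only an illustration: Ollama itself evaluates the template with Go's text/template, and `render_llama3_prompt` is a hypothetical helper name that does not exist in Ollama or llama.cpp. The output stops where `{{ .Response }}` would begin.

```python
def render_llama3_prompt(prompt: str, system: str = "") -> str:
    """Hand-expand the Llama 3 style TEMPLATE from the Modelfile above.

    Hypothetical helper for illustration; not part of Ollama.
    """
    parts = []
    if system:
        # Mirrors the {{ if .System }} ... {{ end }} branch.
        parts.append(
            f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        )
    if prompt:
        # Mirrors the {{ if .Prompt }} ... {{ end }} branch.
        parts.append(
            f"<|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|>"
        )
    # The assistant header is always emitted; the model's reply follows it.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_llama3_prompt("Hello", system="You are a helpful assistant."))
```

Comparing this string with what the model's own tokenizer configuration expects is one way to check whether the template is the cause of poor output.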

However, it does not seem to behave correctly (the output quality is very poor).

Here is how I converted the model:

# Convert
python llm/llama.cpp/convert.py \
~/Unichat-llama3-Chinese --outtype f16 \
--outfile ~/Unichat/converted.bin \
--vocab-type bpe

# Quantize
llm/llama.cpp/quantize \
~/Unichat/converted.bin \
~/Unichat/quantized.bin q4_0

# Build the Ollama model from the Modelfile
ollama create unichat-llama3-chinese-8b -f Modelfile

zhqfdn commented 5 months ago

When you converted to GGUF, did you perhaps not specify the vocab type? It is described in one of the replies above.

zhqfdn commented 5 months ago

Note: --vocab-type selects the tokenizer (vocabulary) type. The default is spm; the Unicom model uses bpe, so it has to be specified explicitly.

python3 ./convert.py ./Unichat-llama3-Chinese-8B-28K --vocab-type bpe --outfile ./Unichat-llama3-Chinese-8B-28K_F32.gguf

The command above first converts the model to a GGUF file in F32 format. Then run quantize to produce the F16 and Q4_0 versions:

quantize ./Unichat-llama3-Chinese-8B-28K_F32.gguf ./Unichat-llama3-Chinese-8B-28K_F16.gguf F16
quantize ./Unichat-llama3-Chinese-8B-28K_F32.gguf ./Unichat-llama3-Chinese-8B-28K_Q4.gguf Q4_0

Test the model:

main -m ./Unichat-llama3-Chinese-8B-28K_Q4.gguf -n 256 -p "百度公司"
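
The convert, quantize, and test steps above can be strung together in a small script. This is a sketch under the assumption that `convert.py`, `quantize`, and `main` are invoked exactly as in the commands above and are reachable from the working directory; `build_commands` and `run_pipeline` are hypothetical helper names. By default it only prints the commands instead of running them.

```python
import shlex
import subprocess
from typing import List


def build_commands(model_dir: str, stem: str) -> List[List[str]]:
    """Return the convert, quantize, and test commands used in this thread."""
    f32 = f"{stem}_F32.gguf"
    q4 = f"{stem}_Q4.gguf"
    return [
        # bpe has to be passed explicitly for this model, per the note above.
        ["python3", "./convert.py", model_dir,
         "--vocab-type", "bpe", "--outfile", f32],
        # Quantize the F32 file down to Q4_0.
        ["quantize", f32, q4, "Q4_0"],
        # Quick generation test against the quantized file.
        ["main", "-m", q4, "-n", "256", "-p", "百度公司"],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True aborts the pipeline on the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    name = "./Unichat-llama3-Chinese-8B-28K"
    run_pipeline(build_commands(name, name))
```

Deriving every file name from one stem keeps the quantize output and the file passed to `main` in sync.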

Then import the model into Ollama.

The Modelfile contents are as follows:

FROM ./Unichat-llama3-Chinese-8B-28K_Q4.gguf

TEMPLATE """
{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>
"""

SYSTEM """

"""

PARAMETER repeat_penalty 1.15
PARAMETER temperature 0.6
PARAMETER top_p 1

Import the model into the local Ollama library:

ollama create Unichat-llama3-Chinese-28K:8b -f Modelfile
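
Once the model is imported, it can be smoke-tested through Ollama's local REST API. The sketch below assumes an Ollama server is listening on its default address, `http://localhost:11434`, and that the model was created under the name used above; `build_generate_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumption).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        # With "stream": False the server answers with a single JSON object.
        return json.loads(resp.read())["response"]
```

For example, `generate("Unichat-llama3-Chinese-28K:8b", "百度公司")` should return text comparable to the `main` test, provided the server is running.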

zhqfdn commented 5 months ago

Follow the steps above to import it into your local Ollama library. I have not uploaded it to the Ollama registry.

xx025 commented 5 months ago

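Download the UnicomLLM/Unichat-llama3-Chinese-8B model from the Ollama community

Unofficial. I have uploaded an updated version that includes the additional parameters described below.

Or try my online deployment directly.

Following up on https://github.com/UnicomAI/Unichat-llama3-Chinese/issues/8#issuecomment-2079581026

When I use the parameter settings from the 快速开始 (Quick Start) section of UnicomLLM/Unichat-llama3-Chinese-8B, it seems to perform somewhat better.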

xx025 commented 5 months ago

zhqfdn wrote: "When you converted to GGUF, did you perhaps not specify the vocab type? It is described in one of the replies above."

In the conversion steps at https://github.com/UnicomAI/Unichat-llama3-Chinese/issues/8#issuecomment-2079581026, I did pass --vocab-type bpe as the last argument:

python llm/llama.cpp/convert.py \
~/Unichat-llama3-Chinese --outtype f16 \
--outfile ~/Unichat/converted.bin \
--vocab-type bpe

grainYao commented 5 months ago

I added the --vocab-type bpe argument and got this error: FileNotFoundError: Could not find any of ['vocab.json']. Is this file supposed to be among the model files? I could not find it.
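
One way to narrow down this error is to list which tokenizer files the checkpoint directory actually contains. The sketch below is only a diagnostic aid: the candidate file names are an assumption based on common Hugging Face checkpoint layouts and are not taken from convert.py, and `present_tokenizer_files` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# File names commonly seen in Hugging Face checkpoints (assumed list).
CANDIDATES = (
    "vocab.json",
    "tokenizer.json",
    "tokenizer.model",
    "tokenizer_config.json",
)


def present_tokenizer_files(model_dir: str) -> List[str]:
    """Return the candidate tokenizer files that exist in model_dir."""
    root = Path(model_dir)
    return [name for name in CANDIDATES if (root / name).is_file()]


if __name__ == "__main__":
    # Point this at the downloaded model directory.
    print(present_tokenizer_files("."))
```

If `vocab.json` is missing from the result, that matches the error message. Whether a given copy of convert.py can read the vocabulary from one of the other files depends on its version, which this thread does not state.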

UnicomAI commented 5 months ago

The 8B model's chat template differs from the official Llama 3 template; refer to tokenizer_config.json.
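
To see the template the model ships with, the file can be read directly. This is a minimal sketch, assuming the checkpoint follows the usual Hugging Face convention of storing the template under the `chat_template` key; `read_chat_template` is a hypothetical helper name.

```python
import json
from pathlib import Path


def read_chat_template(model_dir: str):
    """Return the chat_template entry of tokenizer_config.json, or None."""
    config_path = Path(model_dir) / "tokenizer_config.json"
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    # The key is absent in checkpoints that do not define a template.
    return config.get("chat_template")
```

Calling `print(read_chat_template("./Unichat-llama3-Chinese-8B-28K"))` on the downloaded directory shows the template that the TEMPLATE block in the Modelfile would need to reproduce.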
