Open · empty2enrich opened this issue 1 year ago
You should pay attention to the difference in the prompt-building process between the original Python code and the fastllm code.

For llama2-7b, see fastllm_lib.launch_response_str_llm_model().
Alternatively, you can configure the following fields in config.json and re-convert the model:
"pre_prompt": "",
"user_role": "",
"bot_role": "",
"history_sep": "",
For llama2-7b-chat-hf, set these fields in config.json and re-convert the model:
"pre_prompt": "",
"user_role": "[INST] ",
"bot_role": " [/INST]",
"history_sep": " ",
I converted llama2-7b using fastllm_pytools.torch2flm. The inference result looks wrong, and it is also inconsistent with the result of running llama2-7b directly.

Prompt: The president of the United States is

Generated result:
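For context, the conversion and fastllm inference described above usually look roughly like the sketch below. The template keyword arguments to torch2flm.tofile and the model paths are assumptions here, so check them against your fastllm version before relying on them.

```python
# Sketch: convert the base llama2-7b checkpoint to .flm and run it with fastllm.
# Template kwargs and paths are assumptions; verify against your fastllm version.
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import torch2flm, llm

path = "meta-llama/Llama-2-7b-hf"            # example Hugging Face model path
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)

# For the base (non-chat) model, keep every template field empty so fastllm
# feeds the raw prompt to the model, matching plain Hugging Face generation.
torch2flm.tofile("llama2-7b-fp16.flm", model, tokenizer,
                 pre_prompt="", user_role="", bot_role="", history_sep="",
                 dtype="float16")

flm_model = llm.model("llama2-7b-fp16.flm")
print(flm_model.response("The president of the United States is"))
```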