Open ZhaoqiongZ opened 10 months ago
I also got the same result on Arc770 with Llama 2 7B:
(zzq_py39) a770@RPLP-A770:~/zhaoqion/zhaoqion$ python generate.py --repo-id-or-model-path models--meta-llama--Llama-2-7b-hf/snapshots/6fdf2e60f86ff2481f2241aaee459f85b5b0bbb9/
/home/a770/miniconda3/envs/zzq_py39/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
Loading checkpoint shards: 100%|██████████████████████████████████████| 2/2 [00:00<00:00, 26.87it/s]
2023-11-17 14:34:15,348 - bigdl.llm.transformers.utils - INFO - Converting the current model to sym_int4 format......
Inference time: 0.612342119216919 s
-------------------- Prompt --------------------
<s>[INST] <<SYS>>
<</SYS>>
What is AI? [/INST]
-------------------- Output --------------------
[INST] <<SYS>>
<</SYS>>
What is AI? [/INST]
[/INST]
[/INST]
[/INST]
[/INST]
[/INST]
@chtanch Please check if you can reproduce this issue.
I reproduced the issue for Arc770 with Llama-2-7b-hf.
For this script, please use the 'chat' versions, which are fine-tuned with the [INST] and <<SYS>> tags.
Hi @chtanch, thanks for your advice! Will Llama-2-7b-hf work well with other scripts?
Since the prompt templates are for chat models only, the base model seems to have no prompt structure. Can you try removing the prompt to see the result?
You can try scripts/code that do not add the special tokens [INST], [/INST], <<SYS>>, <</SYS>>.
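As a rough illustration of that suggestion, here is a minimal sketch that only wraps the prompt in the chat tags when the model looks like a chat variant. The names `CHAT_TEMPLATE`, `looks_like_chat_model` and `build_prompt` are hypothetical and are not taken from generate.py; the template string is an approximation based on the prompt printed in the log above, and the name check is just a heuristic.

```python
# Hypothetical helpers, not part of generate.py.
# Approximate Llama 2 chat wrapping, modelled on the prompt shown in the log.
# The leading <s> is omitted on the assumption that the tokenizer adds BOS itself.
CHAT_TEMPLATE = "[INST] <<SYS>>\n\n<</SYS>>\n\n{prompt} [/INST]"


def looks_like_chat_model(model_path: str) -> bool:
    """Heuristic: treat the model as chat-tuned if its path mentions 'chat'."""
    return "chat" in model_path.lower()


def build_prompt(user_prompt: str, is_chat_model: bool) -> str:
    """Add [INST]/<<SYS>> tags for chat models; pass base models the raw text."""
    if is_chat_model:
        return CHAT_TEMPLATE.format(prompt=user_prompt)
    return user_prompt


if __name__ == "__main__":
    for path in ("meta-llama/Llama-2-7b-hf", "meta-llama/Llama-2-7b-chat-hf"):
        prompt = build_prompt("What is AI?", looks_like_chat_model(path))
        print(path)
        print(repr(prompt))
```

With a base model such as Llama-2-7b-hf the prompt is passed through unchanged, so the model just continues plain text instead of echoing tags it was never trained on.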
The script I use is https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2/generate.py with the model Llama-2-70b-hf, and the output is sometimes empty.
Here are some output examples: [image: output1] [image: output2] [image: output3]