Fred-cell opened this issue 1 year ago
@cyita Take a look?
Hi Fred, I cannot reproduce this error; it seems this issue is related to the tokenizer.
bigdl-core-xe 2.4.0b20231101 pypi_0 pypi
bigdl-core-xe-esimd 2.4.0b20231101 pypi_0 pypi
bigdl-llm 2.4.0b20231101 pypi_0 pypi
source /opt/intel/oneapi/setvars.sh
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 ENABLE_SDP_FUSION=1
numactl -C 0-4 -m 0 python llama_benchmark.py
I have given you my environment; you can try to reproduce it again.
You can raise the system's open-file limit using:
ulimit -n 2048
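If changing the shell limit is inconvenient, here is a minimal sketch using Python's standard resource module that raises the limit from inside the benchmark script itself (the 4096 value is an arbitrary illustration, not a tuned recommendation):

import resource

# Raise this process's soft open-file limit, capped at the hard limit.
# 4096 is an illustrative value, not a tuned recommendation.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (min(4096, hard), hard))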
When the input prompt length is 256, the error is as follows:
<class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
=========First token cost 6.1586 s=========
=========Rest tokens cost average 0.0276 s (1023 tokens in all)=========
=========First token cost 0.2373 s=========
Traceback (most recent call last):
File "/home/fred/LLM/text-generation/bigdl-llm/BigDL-bk/python/llm/example/gpu/hf-transformers-models/llama2/generate.py", line 92, in
Can you try unsetting this environment variable: SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS?
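For example, run unset SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS in the shell before launching, or, as a sketch, drop it from the environment at the very top of the script, before torch/IPEX initializes the GPU:

import os

# Must run before the SYCL runtime reads the environment (i.e. before any
# GPU work starts); equivalent to unsetting the variable in the shell.
os.environ.pop("SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS", None)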
Any update on this issue? @Fred-cell
bigdl-core-xe 2.4.0b20231101
bigdl-core-xe-esimd 2.4.0b20231101
bigdl-llm 2.4.0b20231101
numactl -C 0-4 -m 0 python generate.py --repo-id-or-model-path ./pretrained-model/llama2-7b-half/ --n-predict 1024 --prompt "Once upon a time, there existed a little girl who liked to have adventures. She wanted to go to places and meet new people, and have fun"
Loading checkpoint shards: 100%|████████████████| 2/2 [00:06<00:00, 3.22s/it]
2023-11-01 22:27:43,883 - bigdl.llm.transformers.utils - INFO - Converting the current model to fp8 format......
Traceback (most recent call last):
File "/home/BigDL/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2/generate.py", line 66, in
File "/root/anaconda3/envs/bigdl-llm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1841, in from_pretrained
File "/root/anaconda3/envs/bigdl-llm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1891, in _from_pretrained
OSError: [Errno 24] Too many open files: './pretrained-model/llama2-7b-half/tokenizer_config.json'
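Errno 24 means the process has exhausted its file descriptors. A minimal Linux-only sketch for checking how many descriptors the process is holding while reproducing (it reads /proc, so it will not work elsewhere):

import os

# Count this process's currently open file descriptors via /proc (Linux only).
fd_count = len(os.listdir(f"/proc/{os.getpid()}/fd"))
print(f"open file descriptors: {fd_count}")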