Open dongmovidius opened 12 months ago
Hi Yang Dong,
Currently, the LlamaForCausalLM API does not support llama2-70b, but it is compatible with other llama family models. You may refer to the following command to run the llama2-70b model in Native INT4 format using the BigDL-LLM CLI tool:
llm-cli -t 16 -x llama -m "ggml-llama2-70b-q4_0.bin" -p PROMPT
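For other llama-family models, the LlamaForCausalLM path mentioned above looks roughly like this sketch (the checkpoint path is illustrative, assuming the model has already been converted to GGML INT4):

from bigdl.llm.transformers import LlamaForCausalLM

# load a GGML INT4 checkpoint produced by llm-convert (path is illustrative)
llm = LlamaForCausalLM.from_pretrained("D:/llama/model/ggml-llama-7b-q4_0.bin", native=True)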
Hi Song Ge, should I build llm-cli from scratch, or can I copy it from somewhere? It looks like llm-cli is NOT available. I can run llm-convert as below:
(bigdl_llm) D:\bigdl-llm-tutorial-main>llm-convert
usage: llm-convert [-h] -o OUTFILE -x MODEL_FAMILY -f MODEL_FORMAT [-t OUTTYPE] [-p TMP_PATH] [-k TOKENIZER_PATH]
model
llm-convert: error: the following arguments are required: model, -o/--outfile, -x/--model-family, -f/--model-format
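For reference, a filled-in invocation matching that usage message might look like the following; the input path, output path, and the -f/-t values are assumptions for illustration only, not verified beyond what the usage string shows:

llm-convert "D:/llama/Llama-2-70b-hf/" -o "D:/llama/model/ggml-llama2-70b-q4_0.bin" -x llama -f pth -t int4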
(bigdl_llm) D:\bigdl-llm-tutorial-main>llm-cli
'llm-cli' is not recognized as an internal or external command,
operable program or batch file.
@dongmovidius did you run this cmd in the Windows command prompt? llm-cli should be run in an Anaconda PowerShell prompt.
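For example, after activating the environment in an Anaconda PowerShell prompt, you can check whether the llm-cli entry point is on the PATH (standard conda/PowerShell commands; the environment name is taken from the prompt shown above):

conda activate bigdl_llm
Get-Command llm-cli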
Sorry for the inconvenience, but LLAMA-2-70B native INT4 requires an environment variable LLAMA_GQA: you must set it to 8 to run LLAMA-2-70B, and set it to 1 or unset it to run other LLAMA family models.

Anaconda PowerShell prompt: $env:LLAMA_GQA=8
Windows command prompt: set LLAMA_GQA=8
Linux/macOS shell: export LLAMA_GQA=8
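For example, in an Anaconda PowerShell prompt the variable can be set right before rerunning the llm-cli command from above (the prompt text is illustrative):

$env:LLAMA_GQA=8
llm-cli -t 16 -x llama -m "ggml-llama2-70b-q4_0.bin" -p "What is AI?"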
@dongmovidius Any follow-up on this? Is this issue resolved?
System environment:
bigdl-llm version: 2.4.0b20230921
OS: Windows
CPU: Intel Core 13950HX, 64 GB memory

I'd like to run llama2 on CPU. I ran the code below to load the llama2-70b native INT4 model:
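The snippet in question, as reconstructed from the traceback below:

# load the converted model
# switch to ChatGLMForCausalLM/GptneoxForCausalLM/BloomForCausalLM/StarcoderForCausalLM to load other models
from bigdl.llm.transformers import LlamaForCausalLM
llm = LlamaForCausalLM.from_pretrained("D:/llama/model/ggml-llama2-70b-q4_0.bin", native=True)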
2023-09-22 16:32:23,278 - ERROR - ****Usage Error**** The attribute `ctx` of `Llama` object is None.
2023-09-22 16:32:23,279 - ERROR - ****Call Stack*****
2023-09-22 16:32:23,279 - ERROR - ****Usage Error**** Could not load model from path: D:/llama/model/ggml-llama2-70b-q4_0.bin. Please make sure the CausalLM class matches the model you want to load. Received error The attribute `ctx` of `Llama` object is None.
2023-09-22 16:32:23,280 - ERROR - ****Call Stack*****
RuntimeError                              Traceback (most recent call last)
File D:\Program\Anaconda3\envs\bigdl_llm\lib\site-packages\bigdl\llm\transformers\modelling_bigdl.py:119, in _BaseGGMLClass.from_pretrained(cls, pretrained_model_name_or_path, native, dtype, *args, **kwargs)
    118     ggml_model_path = pretrained_model_name_or_path
--> 119     model = cls(model_path=ggml_model_path, **kwargs)
    120 else:

File D:\Program\Anaconda3\envs\bigdl_llm\lib\site-packages\bigdl\llm\ggml\model\llama\llama.py:211, in Llama.__init__(self, model_path, n_ctx, n_parts, n_gpu_layers, seed, f16_kv, logits_all, vocab_only, use_mmap, use_mlock, embedding, n_threads, n_batch, last_n_tokens_size, lora_base, lora_path, verbose)
    207 self.ctx = llama_cpp.llama_init_from_file(
    208     self.model_path.encode("utf-8"), self.params
    209 )
--> 211 invalidInputError(self.ctx is not None, "The attribute `ctx` of `Llama` object is None.")
    213 if self.lora_path:

File D:\Program\Anaconda3\envs\bigdl_llm\lib\site-packages\bigdl\llm\utils\common\log4Error.py:32, in invalidInputError(condition, errMsg, fixMsg)
     31 outputUserMessage(errMsg, fixMsg)
---> 32 raise RuntimeError(errMsg)

RuntimeError: The attribute `ctx` of `Llama` object is None.

During handling of the above exception, another exception occurred:
RuntimeError                              Traceback (most recent call last)
Cell In[4], line 4
      1 # load the converted model
      2 # switch to ChatGLMForCausalLM/GptneoxForCausalLM/BloomForCausalLM/StarcoderForCausalLM to load other models
      3 from bigdl.llm.transformers import LlamaForCausalLM
----> 4 llm = LlamaForCausalLM.from_pretrained("D:/llama/model/ggml-llama2-70b-q4_0.bin", native=True)

File D:\Program\Anaconda3\envs\bigdl_llm\lib\site-packages\bigdl\llm\transformers\modelling_bigdl.py:124, in _BaseGGMLClass.from_pretrained(cls, pretrained_model_name_or_path, native, dtype, *args, **kwargs)
    121     model = cls.HF_Class.from_pretrained(pretrained_model_name_or_path,
    122                                          *args, **kwargs)
    123 except Exception as e:
--> 124     invalidInputError(
    125         False,
    126         f"Could not load model from path: {pretrained_model_name_or_path}. "
    127         f"Please make sure the CausalLM class matches "
    128         "the model you want to load."
    129         f"Received error {e}"
    130     )
    131 return model

File D:\Program\Anaconda3\envs\bigdl_llm\lib\site-packages\bigdl\llm\utils\common\log4Error.py:32, in invalidInputError(condition, errMsg, fixMsg)
     30 if not condition:
     31     outputUserMessage(errMsg, fixMsg)
---> 32     raise RuntimeError(errMsg)
RuntimeError: Could not load model from path: D:/llama/model/ggml-llama2-70b-q4_0.bin. Please make sure the CausalLM class matches the model you want to load. Received error The attribute `ctx` of `Llama` object is None.