Open Wowoho opened 7 months ago
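Conversion and quantization both complete successfully on a fine-tuned llama2 model, but running the model fails with the error below. The same assertion fails for both the fp32 file and the q4 quantized file.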
    # ./llama -m ../llama2-13b-sft-filterd-v17/llama2-13b-sft-filterd-v17-inferllm-fp32.bin -g GPU --version 2
    main: seed = 1709878763
    total vocab length = 68419
    weight tok_embeddings.weight is not match.
    Assert ' weight->length() == nr_number ' failed at file : /InferLLM/src/core/graph.cpp line 325 : virtual void inferllm::Graph::load(std::shared_ptr<inferllm::InputFile>, inferllm::LlmParams&, std::shared_ptr<inferllm::Vocab>), extra message: Error length of weight is mismatch.
    Aborted (core dumped)

    root@goedge_master:/InferLLM/build# ./llama -m llama2-13b-sft-filterd-v17-q4.bin -g GPU --version 2
    main: seed = 1709878793
    total vocab length = 68419
    weight tok_embeddings.weight is not match.
    Assert ' weight->length() == nr_number ' failed at file : /InferLLM/src/core/graph.cpp line 325 : virtual void inferllm::Graph::load(std::shared_ptr<inferllm::InputFile>, inferllm::LlmParams&, std::shared_ptr<inferllm::Vocab>), extra message: Error length of weight is mismatch.
    Aborted (core dumped)
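
One possible reading of the assertion, offered as a guess rather than a confirmed diagnosis: the log reports a vocabulary of 68419 tokens, while the stock Llama 2 tokenizer has 32000. If the loader derives the expected size of `tok_embeddings.weight` from a different vocabulary size than the one used when the file was written, the element counts cannot match. The sketch below only does that arithmetic. The function name is hypothetical, it is not InferLLM code, and it assumes the fine-tuned model keeps the 13B hidden size of 5120.

```python
# Hypothetical sanity check, not part of InferLLM.
# Assumption: the fine-tuned model keeps the Llama 2 13B hidden size (5120).

def embedding_elements(n_vocab: int, n_embd: int) -> int:
    """Number of scalars in a [n_vocab, n_embd] token embedding matrix."""
    return n_vocab * n_embd

N_EMBD_13B = 5120        # hidden size of Llama 2 13B
STOCK_VOCAB = 32000      # vocabulary size of the original Llama 2 tokenizer
FINETUNED_VOCAB = 68419  # "total vocab length" reported in the log

stock_elements = embedding_elements(STOCK_VOCAB, N_EMBD_13B)
finetuned_elements = embedding_elements(FINETUNED_VOCAB, N_EMBD_13B)

print("stock vocab:     ", stock_elements)
print("fine-tuned vocab:", finetuned_elements)
print("sizes match:     ", stock_elements == finetuned_elements)
```

If this guess is right, the two counts differ by more than a factor of two, which would trip a check such as `weight->length() == nr_number` regardless of whether the file is fp32 or q4.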