Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM Fine-Tuning Based on PEFT
After P-Tuning, loading the model for inference keeps failing with the error: The device_map provided does not give any device for the following parameters: transformer.prefix_encoder.embedding.weight #318
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/ubuntu/.local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
2023-07-19 02:43:26.106989: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-19 02:43:26.821407: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:07<00:00, 1.90it/s]
Some weights of the model checkpoint at /home/ubuntu/chatglm2_v3 were not used when initializing ChatGLMForConditionalGeneration: ['lm_head.weight']
This IS expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /home/ubuntu/chatglm2_v3 and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
07/19/2023 02:43:35 - INFO - pet.core.adapter - Fine-tuning method: P-Tuning v2
07/19/2023 02:43:35 - INFO - pet.core.adapter - Loaded fine-tuned model from checkpoint(s): /home/ubuntu/p-t-chatglm2v3
trainable params: 0 || all params: 6244501504 || trainable%: 0.0000
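For context, the "trainable params" summary above is the usual PEFT-style count of parameters with requires_grad enabled, so a value of 0 simply means every loaded weight is frozen for inference. A minimal sketch of how such a summary is typically computed, assuming `model` is the already-loaded ChatGLMForConditionalGeneration (the exact helper used by this repo may differ):

```python
# Sketch only: `model` is assumed to be the loaded ChatGLM model.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable} || all params: {total} || "
      f"trainable%: {100 * trainable / total:.4f}")
```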
Traceback (most recent call last):
File "/home/ubuntu/ChatGLM-Efficient-Tuning/src/cli_demo.py", line 85, in
main()
File "/home/ubuntu/ChatGLM-Efficient-Tuning/src/cli_demo.py", line 44, in main
model = dispatch_model(model, device_map)
File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/big_modeling.py", line 321, in dispatch_model
check_device_map(model, device_map)
File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1067, in check_device_map
raise ValueError(
ValueError: The device_map provided does not give any device for the following parameters: transformer.prefix_encoder.embedding.weight
ubuntu@ip-172-31-72-127:~$ python3 /home/ubuntu/ChatGLM-Efficient-Tuning/src/cli_demo.py \
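The exception is raised by accelerate's check_device_map, which requires every parameter to be assigned a device before dispatch_model places the modules on GPUs; the P-Tuning v2 prefix encoder added on top of the base checkpoint evidently has no entry in the device_map built by cli_demo.py. A minimal sketch of one possible workaround, assuming a single-GPU setup and that the missing module is reachable as `transformer.prefix_encoder` (both assumptions come from the error message, not from this repo's code):

```python
from accelerate import dispatch_model, infer_auto_device_map

# Build a device map that covers the whole model, then pin the P-Tuning v2
# prefix encoder to GPU 0 so that check_device_map also finds a device for
# transformer.prefix_encoder.embedding.weight.
device_map = infer_auto_device_map(model)
device_map["transformer.prefix_encoder"] = 0  # assumption: module path taken from the error message
model = dispatch_model(model, device_map=device_map)
```

Whether this is the right place to patch depends on how cli_demo.py actually constructs its device_map; the underlying point is only that the prefix encoder module needs an explicit device entry before dispatch_model is called.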