### Current Behavior

```shell
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:04<00:00, 1.45it/s]
[INFO|modeling_utils.py:3295] 2023-10-19 06:50:00,157 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[WARNING|modeling_utils.py:3297] 2023-10-19 06:50:00,157 >> Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /home/kings/ChatGLM and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:2927] 2023-10-19 06:50:00,158 >> Generation config file not found, using a generation config created from the model config.
Quantized to 4 bit
Traceback (most recent call last):
  File "/home/kings/ChatGLM/ptuning/main.py", line 411, in <module>
    main()
  File "/home/kings/ChatGLM/ptuning/main.py", line 127, in main
    model = model.quantize(model_args.quantization_bit)
  File "/home/kings/.cache/huggingface/modules/transformers_modules/ChatGLM/modeling_chatglm.py", line 1191, in quantize
    self.transformer.encoder = quantize(self.transformer.encoder, bits, empty_init=empty_init, device=device,
  File "/home/kings/.cache/huggingface/modules/transformers_modules/ChatGLM/quantization.py", line 155, in quantize
    layer.self_attention.query_key_value = QuantizedLinear(
  File "/home/kings/.cache/huggingface/modules/transformers_modules/ChatGLM/quantization.py", line 139, in __init__
    self.weight = compress_int4_weight(self.weight)
  File "/home/kings/.cache/huggingface/modules/transformers_modules/ChatGLM/quantization.py", line 78, in compress_int4_weight
    kernels.int4WeightCompression(
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/cpm_kernels/kernels/base.py", line 48, in __call__
    func = self._prepare_func()
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/cpm_kernels/kernels/base.py", line 36, in _prepare_func
    curr_device = cudart.cudaGetDevice()
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/cpm_kernels/library/base.py", line 72, in wrapper
    raise RuntimeError("Library %s is not initialized" % self.__name)
RuntimeError: Library cudart is not initialized
[2023-10-19 06:50:02,731] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 995904) of binary: /home/kings/anaconda3/envs/chatglm2/bin/python3.10
Traceback (most recent call last):
  File "/home/kings/anaconda3/envs/chatglm2/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/kings/anaconda3/envs/chatglm2/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
main.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-10-19_06:50:02
  host      : my071
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 995904)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
```
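For context on the failure above: `RuntimeError: Library cudart is not initialized` is raised when `cpm_kernels` cannot load the CUDA runtime library (`libcudart`). As a rough illustration (this snippet is not part of the original report, and `ctypes.util.find_library` only approximates the library lookup `cpm_kernels` performs), one can check whether `libcudart` is discoverable from the current environment at all:

```python
# Sketch: check whether a CUDA runtime library is discoverable from this
# environment. If find_library() returns None, the CUDA toolkit is likely
# not installed or its lib directory is not on the loader path
# (e.g. LD_LIBRARY_PATH), which matches the "cudart is not initialized" error.
import ctypes.util

lib = ctypes.util.find_library("cudart")
if lib is None:
    print("libcudart not found; cpm_kernels would fail to initialize cudart")
else:
    print(f"CUDA runtime found: {lib}")
```

If this prints "not found", the fix is an environment one (install the CUDA toolkit matching the PyTorch build, or export the correct `LD_LIBRARY_PATH`) rather than a code change.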
### Expected Behavior
The quantized p-tuning run completes successfully.
### Steps To Reproduce
Run `ptuning/main.py` via `torchrun` with `--quantization_bit 4`; the log produced is identical to the one above.
### Environment
```markdown
Linux
```
### Anything else?
None