THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

[BUG/Help] <NameError: name 'round_up' is not defined> #350

Open SimonFungC opened 1 year ago

SimonFungC commented 1 year ago

Is there an existing issue for this?

Current Behavior

The ptuning training run fails with an error when started via bash train.sh:

# bash train.sh
master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
07/20/2023 10:27:41 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
07/20/2023 10:27:41 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=16,
gradient_checkpointing=False,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.02,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=output/adgen-chatglm2-6b-pt-128-2e-2/runs/Jul20_10-27-41_dsw-69238-658d7665d8-4svjr,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=10,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=3000,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_hf,
optim_args=None,
output_dir=output/adgen-chatglm2-6b-pt-128-2e-2,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=1,
per_device_train_batch_size=1,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard', 'wandb'],
resume_from_checkpoint=None,
run_name=output/adgen-chatglm2-6b-pt-128-2e-2,
save_on_each_node=False,
save_safetensors=False,
save_steps=1000,
save_strategy=steps,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
07/20/2023 10:27:42 - WARNING - datasets.builder - Found cached dataset json (/root/.cache/huggingface/datasets/json/default-38ff07e29e92a00b/0.0.0/fe5dd6ea2639a6df622901539cb550cf8797e5a6b2dd7af1cf934bed8e233e6e)
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 155.69it/s]
[INFO|configuration_utils.py:710] 2023-07-20 10:27:42,585 >> loading configuration file /mnt/workspace/chatglm2-6b/config.json
[INFO|configuration_utils.py:710] 2023-07-20 10:27:42,588 >> loading configuration file /mnt/workspace/chatglm2-6b/config.json
[INFO|configuration_utils.py:768] 2023-07-20 10:27:42,589 >> Model config ChatGLMConfig {
  "_name_or_path": "/mnt/workspace/chatglm2-6b",
  "add_bias_linear": false,
  "add_qkv_bias": true,
  "apply_query_key_layer_scaling": true,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "ChatGLMModel"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration"
  },
  "bias_dropout_fusion": true,
  "eos_token_id": 2,
  "ffn_hidden_size": 13696,
  "fp32_residual_connection": false,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "kv_channels": 128,
  "layernorm_epsilon": 1e-05,
  "model_type": "chatglm",
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_layers": 28,
  "original_rope": true,
  "pad_token_id": 0,
  "padded_vocab_size": 65024,
  "post_layer_norm": true,
  "pre_seq_len": null,
  "prefix_projection": false,
  "quantization_bit": 0,
  "rmsnorm": true,
  "seq_length": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.31.0",
  "use_cache": true,
  "vocab_size": 65024
}

[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file tokenizer_config.json
[INFO|modeling_utils.py:2600] 2023-07-20 10:27:42,772 >> loading weights file /mnt/workspace/chatglm2-6b/pytorch_model.bin.index.json
[INFO|configuration_utils.py:599] 2023-07-20 10:27:42,773 >> Generate config GenerationConfig {
  "_from_model_config": true,
  "eos_token_id": 2,
  "pad_token_id": 0,
  "transformers_version": "4.31.0"
}

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [01:09<00:00,  9.90s/it]
[INFO|modeling_utils.py:3329] 2023-07-20 10:28:52,223 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.

[WARNING|modeling_utils.py:3331] 2023-07-20 10:28:52,223 >> Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /mnt/workspace/chatglm2-6b and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:2949] 2023-07-20 10:28:52,225 >> Generation config file not found, using a generation config created from the model config.
Quantized to 4 bit
07/20/2023 10:28:52 - WARNING - transformers_modules.chatglm2-6b.quantization - Failed to load cpm_kernels:CUDA Runtime Error: CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
  File "/mnt/workspace/ChatGLM2-6B/ptuning/main.py", line 411, in <module>
    main()
  File "/mnt/workspace/ChatGLM2-6B/ptuning/main.py", line 127, in main
    model = model.quantize(model_args.quantization_bit)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 1191, in quantize
    self.transformer.encoder = quantize(self.transformer.encoder, bits, empty_init=empty_init, device=device,
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/quantization.py", line 155, in quantize
    layer.self_attention.query_key_value = QuantizedLinear(
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/quantization.py", line 139, in __init__
    self.weight = compress_int4_weight(self.weight)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/quantization.py", line 76, in compress_int4_weight
    blockDim = (min(round_up(m, 32), 1024), 1, 1)
NameError: name 'round_up' is not defined
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 321) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
main.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-07-20_10:28:57
  host      : dsw-69238-658d7665d8-4svjr
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 321)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
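Note: the warning "Failed to load cpm_kernels: CUDA Runtime Error: CUDA driver version is insufficient for CUDA runtime version" just before the traceback suggests the NameError is a follow-on failure: quantization.py appears to import round_up from cpm_kernels inside a try/except that only logs a warning when the import fails, so round_up is never bound and the first 4-bit quantization call crashes. A minimal sketch of that pattern (simplified illustration, not the exact upstream code; compress_int4_weight_sketch is a made-up name):

import logging

logger = logging.getLogger(__name__)

try:
    # round_up is provided by cpm_kernels; it only exists if this import succeeds.
    from cpm_kernels.kernels.base import round_up
except Exception as exception:
    # Like the upstream code, only log a warning here, leaving round_up unbound.
    logger.warning("Failed to load cpm_kernels: %s", exception)


def compress_int4_weight_sketch(m: int):
    # If the import above failed, this line raises:
    # NameError: name 'round_up' is not defined
    return (min(round_up(m, 32), 1024), 1, 1)

So either the quantization path has to be skipped (drop `--quantization_bit 4`) or cpm_kernels has to be importable in the training environment.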

Expected Behavior

No response

Steps To Reproduce

bash train.sh

Environment

- OS: Ubuntu 22.04.1 LTS
- Python: 3.10.6
- Transformers: 4.31.0
- PyTorch: 2.0.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True

Anything else?

No response

miiiiiko commented 1 year ago

I ran into the same problem.

miiiiiko commented 1 year ago

Solved it: either drop `--quantization_bit 4`, or run `pip install cpm_kernels`.
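Before re-running bash train.sh, a quick check like the following (my own sketch, not part of the repo) can confirm that cpm_kernels is actually importable in the training environment:

import torch

try:
    from cpm_kernels.kernels.base import round_up
    # round_up(m, d) rounds m up to the next multiple of d, e.g. round_up(100, 32) == 128
    print("cpm_kernels OK, round_up(100, 32) =", round_up(100, 32))
except Exception as e:
    print("cpm_kernels is not usable here:", e)

print("torch.cuda.is_available():", torch.cuda.is_available())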

fatmind commented 1 year ago

What is the root cause here? Running `pip install cpm_kernels` still didn't work for me, but removing the quantization flag `--quantization_bit 4` did.

Environment: Ubuntu, PyTorch 2.0.1, NVIDIA T4
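The earlier warning "CUDA driver version is insufficient for CUDA runtime version" points at a driver/runtime mismatch, which would explain why installing cpm_kernels alone is not always enough. A small diagnostic sketch (an assumption on my part: compare the values below with the driver version shown by nvidia-smi):

import torch

# If the installed NVIDIA driver is older than the CUDA runtime PyTorch was built
# against, cpm_kernels (and 4-bit quantization) will keep failing at runtime.
print("torch version:            ", torch.__version__)
print("built for CUDA runtime:   ", torch.version.cuda)
print("torch.cuda.is_available():", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:                  ", torch.cuda.get_device_name(0))
    print("compute capability:   ", torch.cuda.get_device_capability(0))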

woodywang0 commented 4 months ago

I hit the same problem. At first my GPU driver was too old; after updating the driver I got this error. Removing `--quantization_bit 4` didn't help. After `pip install cpm_kernels` this error went away, but then a PyTorch runtime exception was thrown, so in the end I had to upgrade Ubuntu to 22.04, PyTorch to 2.1, and Python to 3.11; after that there were no more errors. My feeling is that the Ubuntu GPU driver and the PyTorch version need to be compatible, though I'm not sure that's right.