hiyouga / ChatGLM-Efficient-Tuning

Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT

Error when training RLHF: 'NoneType' object has no attribute 'softmax' #282

Closed · flower-kyo closed this issue 11 months ago

flower-kyo commented 1 year ago

/home/ma-user/anaconda3/envs/chatglmeV2/bin/python /home/ma-user/work/zhanghongjun/ChatGLM-Efficient-Tuning-main/src/train_ppo.py --do_train --dataset alpaca_gpt4_en --finetuning_type lora --checkpoint_dir path_to_sft_checkpoint --reward_model path_to_rm_checkpoint --output_dir path_to_ppo_checkpoint --per_device_train_batch_size 2 --gradient_accumulation_steps 4 --lr_scheduler_type cosine --logging_steps 10 --save_steps 1000 --learning_rate 1e-5 --num_train_epochs 1.0 --fp16

/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/requests/__init__.py:104: RequestsDependencyWarning: urllib3 (1.26.12) or chardet (5.0.0)/charset_normalizer (2.0.12) doesn't match a supported version!
  RequestsDependencyWarning)

===================================BUG REPORT=================================== Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/extras/CUPTI/lib64'), PosixPath('/usr/local/nvidia/lib')}
  warn(msg)
CUDA SETUP: CUDA version lower than 11 are currently not supported for LLM.int8(). You will be only to use 8-bit optimizers and quantization routines!!
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.0
CUDA SETUP: Detected CUDA version 102
/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
  warn(msg)
CUDA SETUP: Loading binary /home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda102_nocublaslt.so...
07/12/2023 16:16:51 - INFO - utils.common - Process rank: -1, device: cuda:0, n_gpu: 8 distributed training: False, 16-bits training: True
07/12/2023 16:16:51 - INFO - utils.common - Training/evaluation parameters Seq2SeqTrainingArguments( _n_gpu=8, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, do_eval=False, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=None, evaluation_strategy=no, fp16=True, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, generation_config=None, generation_max_length=None, generation_num_beams=None, gradient_accumulation_steps=4, gradient_checkpointing=False, greater_is_better=None, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=length, load_best_model_at_end=False, local_rank=-1, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=path_to_ppo_checkpoint/runs/Jul12_16-16-51_notebook-905609ac-c0e6-4275-af61-d7bde586ccb6, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=10, logging_strategy=steps, lr_scheduler_type=cosine, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, no_cuda=False, num_train_epochs=1.0, optim=adamw_torch, optim_args=None, output_dir=path_to_ppo_checkpoint, overwrite_output_dir=False, past_index=-1, per_device_eval_batch_size=8, per_device_train_batch_size=2, predict_with_generate=False, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], resume_from_checkpoint=None, run_name=path_to_ppo_checkpoint, save_on_each_node=False, save_safetensors=False, save_steps=1000, save_strategy=steps, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, sortish_sampler=False, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, xpu_backend=None, )
07/12/2023 16:16:51 - INFO - utils.common - Loading dataset alpaca_gpt4_data_en.json...
07/12/2023 16:18:11 - INFO - datasets.builder - Using custom data configuration default-f5c93679fc52bdba
07/12/2023 16:18:11 - INFO - datasets.info - Loading Dataset Infos from /home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/datasets/packaged_modules/json
07/12/2023 16:18:11 - INFO - datasets.builder - Generating dataset json (/home/ma-user/.cache/huggingface/datasets/json/default-f5c93679fc52bdba/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4)
Downloading and preparing dataset json/default to /home/ma-user/.cache/huggingface/datasets/json/default-f5c93679fc52bdba/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4...
Downloading data files: 100%|███████████████████| 1/1 [00:00<00:00, 7073.03it/s]
07/12/2023 16:18:11 - INFO - datasets.download.download_manager - Downloading took 0.0 min
07/12/2023 16:18:11 - INFO - datasets.download.download_manager - Checksum Computation took 0.0 min
Extracting data files: 100%|█████████████████████| 1/1 [00:00<00:00, 176.61it/s]
07/12/2023 16:18:11 - INFO - datasets.builder - Generating train split
07/12/2023 16:18:12 - INFO - datasets.utils.info_utils - Unable to verify splits sizes.
Dataset json downloaded and prepared to /home/ma-user/.cache/huggingface/datasets/json/default-f5c93679fc52bdba/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 198.98it/s]
[INFO|tokenization_utils_base.py:1807] 2023-07-12 16:18:12,530 >> loading file ice_text.model
[INFO|tokenization_utils_base.py:1807] 2023-07-12 16:18:12,530 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:1807] 2023-07-12 16:18:12,530 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1807] 2023-07-12 16:18:12,530 >> loading file tokenizer_config.json
[INFO|configuration_utils.py:666] 2023-07-12 16:18:12,776 >> loading configuration file THUDM/chatglm-6b/config.json
[INFO|configuration_utils.py:666] 2023-07-12 16:18:12,781 >> loading configuration file THUDM/chatglm-6b/config.json
[INFO|configuration_utils.py:720] 2023-07-12 16:18:12,782 >> Model config ChatGLMConfig { "_name_or_path": "THUDM/chatglm-6b", "architectures": [ "ChatGLMModel" ], "auto_map": { "AutoConfig": "configuration_chatglm.ChatGLMConfig", "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration" }, "bos_token_id": 130004, "eos_token_id": 130005, "gmask_token_id": 130001, "hidden_size": 4096, "inner_hidden_size": 16384, "layernorm_epsilon": 1e-05, "mask_token_id": 130000, "max_sequence_length": 2048, "model_type": "chatglm", "num_attention_heads": 32, "num_layers": 28, "pad_token_id": 3, "position_encoding_2d": true, "pre_seq_len": null, "prefix_projection": false, "quantization_bit": 0, "torch_dtype": "float16", "transformers_version": "4.28.0", "use_cache": true, "vocab_size": 130528 }

[INFO|modeling_utils.py:2531] 2023-07-12 16:18:12,836 >> loading weights file THUDM/chatglm-6b/pytorch_model.bin.index.json
[INFO|configuration_utils.py:575] 2023-07-12 16:18:12,838 >> Generate config GenerationConfig { "_from_model_config": true, "bos_token_id": 130004, "eos_token_id": 130005, "pad_token_id": 3, "transformers_version": "4.28.0" }

Loading checkpoint shards: 100%|██████████████████| 8/8 [00:13<00:00, 1.65s/it]
[INFO|modeling_utils.py:3190] 2023-07-12 16:18:26,160 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.

[INFO|modeling_utils.py:3199] 2023-07-12 16:18:26,160 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at THUDM/chatglm-6b. If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|modeling_utils.py:2840] 2023-07-12 16:18:26,166 >> Generation config file not found, using a generation config created from the model config.
07/12/2023 16:18:26 - INFO - utils.common - Fine-tuning method: LoRA
07/12/2023 16:18:52 - INFO - utils.common - Loaded fine-tuned model from checkpoint(s): path_to_sft_checkpoint
07/12/2023 16:18:52 - INFO - utils.common - Load reward model from path_to_rm_checkpoint
trainable params: 3674113 || all params: 6180630529 || trainable%: 0.0594
Running tokenizer on dataset: 0%| | 0/52002 [00:00<?, ? examples/s]
07/12/2023 16:18:53 - INFO - datasets.arrow_dataset - Caching processed dataset at /home/ma-user/.cache/huggingface/datasets/json/default-f5c93679fc52bdba/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4/cache-5a2f6a9f1df99658.arrow
input_ids: [53, 6945, 5, 9, 42, 4, 4, 64286, 12, 15150, 295, 4703, 108, 8555, 1849, 7, 4, 4, 67342, 12, 130001, 130004]
inputs: [Round 1]

问:Give three tips for staying healthy.

答:
07/12/2023 16:20:03 - INFO - utils.ppo - Running training
07/12/2023 16:20:03 - INFO - utils.ppo - Num examples = 52002
07/12/2023 16:20:03 - INFO - utils.ppo - Num Epochs = 1.0
07/12/2023 16:20:03 - INFO - utils.ppo - Instantaneous batch size per device = 2
07/12/2023 16:20:03 - INFO - utils.ppo - Total train batch size (w. parallel, distributed & accumulation) = 8
07/12/2023 16:20:03 - INFO - utils.ppo - Gradient Accumulation steps = 4
07/12/2023 16:20:03 - INFO - utils.ppo - Total optimization steps = 6500
07/12/2023 16:20:03 - INFO - utils.ppo - Number of trainable parameters = 3674113
0%| | 0/6500 [00:00<?, ?it/s]
[INFO|configuration_utils.py:575] 2023-07-12 16:20:03,786 >> Generate config GenerationConfig { "_from_model_config": true, "bos_token_id": 130004, "eos_token_id": 130005, "pad_token_id": 3, "transformers_version": "4.28.0" }

0%| | 0/6500 [00:18<?, ?it/s]
Traceback (most recent call last):
  File "/home/ma-user/work/zhanghongjun/ChatGLM-Efficient-Tuning-main/src/train_ppo.py", line 82, in <module>
    main()
  File "/home/ma-user/work/zhanghongjun/ChatGLM-Efficient-Tuning-main/src/train_ppo.py", line 69, in main
    ppo_trainer.ppo_train(max_target_length=data_args.max_target_length)
  File "/home/ma-user/work/zhanghongjun/ChatGLM-Efficient-Tuning-main/src/utils/ppo.py", line 162, in ppo_train
    stats = self.step(queries, responses, rewards)
  File "/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/trl/trainer/ppo_trainer.py", line 548, in step
    batch["masks"],
  File "/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/trl/trainer/ppo_trainer.py", line 749, in train_minibatch
    loss_p, loss_v, train_stats = self.loss(old_logprobs, values, rewards, logits, vpreds, logprobs, mask)
  File "/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/trl/trainer/ppo_trainer.py", line 856, in loss
    entropy = masked_mean(entropy_from_logits(logits), mask)
  File "/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/trl/core.py", line 148, in entropy_from_logits
    pd = torch.nn.functional.softmax(logits, dim=-1)
  File "/home/ma-user/anaconda3/envs/chatglmeV2/lib/python3.7/site-packages/torch/nn/functional.py", line 1841, in softmax
    ret = input.softmax(dim)
AttributeError: 'NoneType' object has no attribute 'softmax'

Process finished with exit code 1
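
For context on where this fails: the traceback ends in trl's entropy_from_logits (trl/core.py), which assumes the logits coming out of the model's forward pass are a real tensor; here the wrapped ChatGLM model apparently handed back None in their place. Below is a minimal sketch of the failure mode in plain PyTorch; the entropy formula is paraphrased from trl and is not this project's code.

import torch
import torch.nn.functional as F

# Sketch of trl's entropy_from_logits as it appears in the traceback:
# it calls softmax directly on whatever `logits` the forward pass produced.
def entropy_from_logits(logits):
    pd = F.softmax(logits, dim=-1)
    return torch.logsumexp(logits, dim=-1) - torch.sum(pd * logits, dim=-1)

# With a real logits tensor of shape (batch, seq_len, vocab) this works:
print(entropy_from_logits(torch.randn(2, 8, 32)).shape)  # torch.Size([2, 8])

# When the model returns None instead of logits, we hit exactly the reported error:
try:
    entropy_from_logits(None)
except AttributeError as err:
    print(err)  # 'NoneType' object has no attribute 'softmax'

So the softmax error is only the symptom; the underlying problem is that the PPO forward pass produced no logits for this model, which is presumably what upgrading the framework (see the reply below) fixes.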

codemayq commented 1 year ago

Please upgrade to the latest version of the framework and try again.
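
After upgrading, a quick sanity check is to confirm which interpreter and library versions the environment actually resolves; the package list below is an assumption about the relevant dependencies (a minimal sketch, not part of the project):

import sys

# Print the Python version and the versions of the libraries in this stack.
# If an old trl/transformers is still being picked up, the NoneType-softmax
# crash will likely persist after the "upgrade".
print("python", sys.version.split()[0])
for pkg in ("trl", "transformers", "peft", "torch", "datasets"):
    try:
        module = __import__(pkg)
        print(pkg, getattr(module, "__version__", "unknown"))
    except ImportError:
        print(pkg, "not installed")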