mymusise / ChatGLM-Tuning

A fine-tuning scheme based on ChatGLM-6B + LoRA
MIT License
3.71k stars · 444 forks

expected scalar type Half but found Float #179

Closed: SeekPoint closed this issue 1 year ago

SeekPoint commented 1 year ago

(gh_ChatGLM-Tuning) ub2004@ub2004-B85M-A0:~/llm_dev/ChatGLM-Tuning$ python3 finetune.py --dataset_path data/alpaca --lora_rank 2 --per_device_train_batch_size 2 --gradient_accumulation_steps 1 --max_steps 500 --save_steps 100 --save_total_limit 2 --learning_rate 1e-4 --fp16 --remove_unused_columns false --logging_steps 50 --output_dir output
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.15) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to:
https://github.com/TimDettmers/bitsandbytes/issues

/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/home/ub2004/anaconda3/envs/gh_ChatGLM-Tuning/lib')}
  warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/ub2004/anaconda3/envs/gh_ChatGLM-Tuning did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('@/tmp/.ICE-unix/1648,unix/ub2004-B85M-A0'), PosixPath('local/ub2004-B85M-A0')}
  warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/etc/xdg/xdg-ubuntu')}
  warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('0'), PosixPath('1')}
  warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/org/gnome/Terminal/screen/00a04b8e_2929_4d34_a713_fca57864faa5')}
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 6.1
CUDA SETUP: Detected CUDA version 117
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
  warn(msg)
CUDA SETUP: Loading binary /home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda117_nocublaslt.so...
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Overriding torch_dtype=None with torch_dtype=torch.float16 due to requirements of bitsandbytes to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:46<00:00, 5.81s/it]

len(dataset)=49917

You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The current list of callbacks is
:DefaultFlowCallback
TensorBoardCallback
WandbCallback
/home/ub2004/.local/lib/python3.8/site-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
  warnings.warn(
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.15) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose "Don't visualize my results"
wandb: Tracking run with wandb version 0.14.2
wandb: W&B syncing is set to offline in this directory.
wandb: Run wandb online or set WANDB_MODE=online to enable cloud syncing.
  0%|          | 0/500 [00:00<?, ?it/s]
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py:298: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
Traceback (most recent call last):
  File "finetune.py", line 118, in <module>
    main()
  File "finetune.py", line 111, in main
    trainer.train()
  File "/home/ub2004/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1633, in train
    return inner_training_loop(
  File "/home/ub2004/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1902, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/ub2004/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2655, in training_step
    self.scaler.scale(loss).backward()
  File "/home/ub2004/.local/lib/python3.8/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/ub2004/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/ub2004/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/ub2004/.local/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 157, in backward
    torch.autograd.backward(outputs_with_grad, args_with_grad)
  File "/home/ub2004/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/ub2004/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 456, in backward
    grad_A = torch.matmul(grad_output, CB).view(ctx.grad_shape).to(ctx.dtype_A)
RuntimeError: expected scalar type Half but found Float
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:

suc16 commented 1 year ago

Try: model = AutoModel.from_pretrained("THUDM/chatglm-6b").half().cuda()
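The RuntimeError above is a dtype mismatch in bitsandbytes' backward matmul (a float32 tensor meets a half-precision one inside torch.matmul), and the suggestion avoids it by loading the whole base model in fp16 up front. A minimal sketch of that loading step, with the rest of the LoRA setup in finetune.py assumed unchanged (variable names here are illustrative, not copied from the repo):

```python
# Sketch of the suggested fix: load ChatGLM-6B in fp16 ("Half") so every
# tensor reaching bitsandbytes' backward matmul shares a single dtype.
from transformers import AutoModel

# Note: ChatGLM-6B ships custom modeling code, so transformers may also ask
# for trust_remote_code=True (see the next reply in this thread).
model = AutoModel.from_pretrained("THUDM/chatglm-6b")
model = model.half().cuda()  # cast weights to fp16, then move them to the GPU
```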

SeekPoint commented 1 year ago

Try: model = AutoModel.from_pretrained("THUDM/chatglm-6b").half().cuda()

It works with trust_remote_code=True!
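Putting the two pieces together, the loading call reported to work here looks roughly like the sketch below; this is a reconstruction from the thread, not a quote from finetune.py:

```python
# The combination reported to work: trust ChatGLM's custom modeling code at
# load time, then cast the weights to fp16 and move the model onto the GPU.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b",
    trust_remote_code=True,  # required for models that ship their own modeling code
).half().cuda()
```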

Thompson-Chen commented 1 year ago

Try: model = AutoModel.from_pretrained("THUDM/chatglm-6b").half().cuda()

It solves my problem! Thanks!

Thompson-Chen commented 1 year ago

Received. Thanks!