Describe the bug
When training a LoRA on a model loaded in 8-bit, the backward pass fails inside bitsandbytes with:
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: struct c10::Half != float
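The crash happens in bitsandbytes' MatMul8bitLt.backward (full traceback under Logs), where the half-precision grad_output is multiplied against a float32 CB matrix. Reduced to its core, the mismatch can be reproduced in isolation; this is a sketch with arbitrary shapes, and the exact error text may differ by device:

import torch

# grad_output is float16 because the model was loaded with
# torch_dtype=torch.float16; CB (the dequantized int8 weight) is float32.
grad_output = torch.randn(4, 8, dtype=torch.float16, device="cuda")
CB = torch.randn(8, 8, dtype=torch.float32, device="cuda")

# Raises the same dtype-mismatch RuntimeError as the failing call at
# bitsandbytes/autograd/_functions.py, line 474.
grad_A = torch.matmul(grad_output, CB)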
Is there an existing issue for this?
[X] I have searched the existing issues
Reproduction
Load TinyDolphin (cognitivecomputations_TinyDolphin-2.8-1.1b) in 8-bit, then start a LoRA training run. A minimal standalone sketch of the same steps is given below.
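For reference, this sketches the same steps the web UI performs. The model id is taken from the log below (Hub form with a slash assumed); the LoRA rank and alpha are placeholders, not the values the UI used, apart from the (q, k, v) target modules shown in the log:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "cognitivecomputations/TinyDolphin-2.8-1.1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,  # matches TRANSFORMERS_PARAMS in the log
    device_map="auto",
)
model = get_peft_model(model, LoraConfig(
    r=32, lora_alpha=64,  # placeholder rank/alpha
    target_modules=["q_proj", "k_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

# A single forward/backward step follows the same code path that crashes:
batch = tokenizer("test", return_tensors="pt").to(model.device)
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()  # RuntimeError: expected mat1 and mat2 to have the same dtype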
Screenshot
No response
Logs
Traceback (most recent call last):
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\utils\hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\huggingface_hub\utils\_validators.py", line 106, in _inner_fn
    validate_repo_id(arg_value)
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\huggingface_hub\utils\_validators.py", line 160, in validate_repo_id
    raise HFValidationError(
huggingface_hub.errors.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'models\None'.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\AI\text-generation-webui-main\modules\training.py", line 508, in do_train
    reload_model()
  File "D:\AI\text-generation-webui-main\modules\models.py", line 439, in reload_model
    shared.model, shared.tokenizer = load_model(shared.model_name)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\modules\models.py", line 94, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\modules\models.py", line 149, in huggingface_loader
    config = AutoConfig.from_pretrained(path_to_model, trust_remote_code=shared.args.trust_remote_code)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\models\auto\configuration_auto.py", line 928, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\configuration_utils.py", line 631, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\configuration_utils.py", line 686, in _get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\utils\hub.py", line 462, in cached_file
    raise EnvironmentError(
OSError: Incorrect path_or_model_id: 'models\None'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
18:39:01-714429 INFO Loading "cognitivecomputations_TinyDolphin-2.8-1.1b"
18:39:01-718433 INFO TRANSFORMERS_PARAMS=
{ 'low_cpu_mem_usage': True,
  'torch_dtype': torch.float16,
  'device_map': 'auto',
  'quantization_config': BitsAndBytesConfig {
    "_load_in_4bit": false,
    "_load_in_8bit": true,
    "bnb_4bit_compute_dtype": "float32",
    "bnb_4bit_quant_storage": "uint8",
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": false,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": false,
    "load_in_8bit": true,
    "quant_method": "bitsandbytes"
  }
}
18:39:04-583136 INFO Loaded "cognitivecomputations_TinyDolphin-2.8-1.1b" in 2.87 seconds.
18:39:04-585136 INFO LOADER: "Transformers"
18:39:04-586135 INFO TRUNCATION LENGTH: 4096
18:39:04-586642 INFO INSTRUCTION TEMPLATE: "Alpaca"
18:39:05-418538 INFO Loading raw text file dataset
18:40:16-754125 INFO Getting model ready
18:40:16-765123 INFO Preparing for training
18:40:16-768121 INFO Creating LoRA model
18:40:17-236766 INFO Starting training
Training 'llama' model using (q, v, k) projections
Trainable params: 24,510,464 (2.1795 %), All params: 1,124,567,040 (Model: 1,100,056,576)
Monitoring loss (Auto-Stop at: 1.8)
18:40:17-260766 INFO Log file 'train_dataset_sample.json' created in the 'logs' directory.
Exception in thread Thread-17 (threaded_run):
Traceback (most recent call last):
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "D:\AI\text-generation-webui-main\modules\training.py", line 705, in threaded_run
    trainer.train()
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\trainer.py", line 1859, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\trainer.py", line 2203, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\transformers\trainer.py", line 3147, in training_step
    self.accelerator.backward(loss)
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\accelerate\accelerator.py", line 1964, in backward
    self.scaler.scale(loss).backward(**kwargs)
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\autograd\__init__.py", line 266, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\autograd\function.py", line 289, in apply
    return user_fn(self, *args)
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\utils\checkpoint.py", line 319, in backward
    torch.autograd.backward(outputs_with_grad, args_with_grad)
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\autograd\__init__.py", line 266, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\autograd\function.py", line 289, in apply
    return user_fn(self, *args)
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\text-generation-webui-main\installer_files\env\Lib\site-packages\bitsandbytes\autograd\_functions.py", line 474, in backward
    grad_A = torch.matmul(grad_output, CB).view(ctx.grad_shape).to(ctx.dtype_A)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: struct c10::Half != float
18:40:22-747869 INFO Training complete, saving
18:40:22-882405 INFO Training complete!
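A note on a possible workaround, offered as an assumption rather than a confirmed fix: peft's prepare_model_for_kbit_training is the usual preparation step before attaching a LoRA to an 8-bit model, and it may keep the backward pass in consistent dtypes; whether it avoids this particular Half/float mismatch is untested here. Pinning a different bitsandbytes version may also be worth trying.

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Casts the non-quantized parameters (e.g. layer norms) to float32 and enables
# input gradients so gradient checkpointing works with frozen base weights.
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(task_type="CAUSAL_LM"))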
System Info