Trying to fine-tune the 7B model using the sample from the documentation, but I'm getting the error below. Has anyone seen this before?
The log is here:
bin /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA SETUP: CUDA runtime path found: /home/ec2-user/anaconda3/envs/pytorch_p310/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Downloading (…)model.bin.index.json: 16.9kB [00:00, 57.6MB/s]
Downloading (…)l-00001-of-00002.bin: 100%|██| 9.95G/9.95G [01:00<00:00, 166MB/s]
Downloading (…)l-00002-of-00002.bin: 100%|██| 4.48G/4.48G [00:36<00:00, 123MB/s]
Downloading (…)okenizer_config.json: 100%|█████| 220/220 [00:00<00:00, 1.93MB/s]
Downloading (…)/main/tokenizer.json: 2.73MB [00:00, 17.2MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████| 281/281 [00:00<00:00, 2.00MB/s]
trainable params: 2359296 || all params: 6924080000 || trainable%: 0.03407378308742822
Downloading and preparing dataset json/default to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4...
Downloading data files: 100%|██████████████████| 1/1 [00:00<00:00, 10230.01it/s]
Extracting data files: 100%|█████████████████████| 1/1 [00:00<00:00, 221.66it/s]
Dataset json downloaded and prepared to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 175.08it/s]
Run eval every 7857 steps
PyTorch: setting up devices
The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
Using the WANDB_DISABLED environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using cuda_amp half precision backend
wandb: Tracking run with wandb version 0.15.4
wandb: W&B syncing is set to offline in this directory.
wandb: Run wandb online or set WANDB_MODE=online to enable cloud syncing.
The following columns in the training set don't have a corresponding argument in PeftModelForCausalLM.forward and have been ignored: instruction, output, token_type_ids, input. If instruction, output, token_type_ids, input are not expected by PeftModelForCausalLM.forward, you can safely ignore this message.
***** Running training *****
Num examples = 62861
Num Epochs = 3
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 2
Gradient Accumulation steps = 2
Total optimization steps = 94290
0%| | 0/94290 [00:00<?, ?it/s]You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync /home/ec2-user/SageMaker/wandb/offline-run-20230621_200524-jdrmalza
wandb: Find logs at: ./wandb/offline-run-20230621_200524-jdrmalza/logs
Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/pytorch_p310/bin/falcontune", line 33, in
sys.exit(load_entry_point('falcontune==0.1.0', 'console_scripts', 'falcontune')())
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/run.py", line 88, in main
args.func(args)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/finetune.py", line 162, in finetune
trainer.train()
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1521, in train
return inner_training_loop(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1763, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2499, in training_step
loss = self.compute_loss(model, inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2531, in compute_loss
outputs = model(inputs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, *kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/peft_model.py", line 678, in forward
return self.base_model(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(args, kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 1070, in forward
transformer_outputs = self.transformer(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, *kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(args, kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 965, in forward
outputs = block(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, *kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 698, in forward
attn_outputs = self.self_attention(
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(args, kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, *kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 291, in forward
fused_qkv = self.query_key_value(hidden_states) # [batch_size, seq_length, 3 x hidden_size]
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(args, **kwargs)
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/tuners/lora.py", line 565, in forward
result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: expected scalar type Half but found Char
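For context: in PyTorch, "Half" is torch.float16 and "Char" is torch.int8, so the error suggests a half-precision activation is hitting a weight that is still stored as int8 (presumably a quantized layer). The snippet below is only an illustrative sketch of that kind of mismatch, not falcontune's actual code:

```python
# Minimal sketch of the same kind of dtype mismatch (illustrative only, not falcontune code).
import torch
import torch.nn.functional as F

x = torch.randn(1, 4, dtype=torch.float16)              # half-precision (Half) activations
w = torch.randint(-128, 128, (4, 4), dtype=torch.int8)  # int8 (Char) weight, as a quantized layer might hold

try:
    F.linear(x, w)
except RuntimeError as e:
    # The exact wording depends on the PyTorch version, but it is the same
    # class of dtype-mismatch error as in the traceback above.
    print(e)
```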
For reference, these are the run parameters from the same log:
Parameters:
-------config------- dataset='sql-create-context-train.json' data_type='alpaca' lora_out_dir='./falcon-7b-alpaca/' lora_apply_dir=None weights='tiiuae/falcon-7b' target_modules=['query_key_value']
------training------ mbatch_size=1 batch_size=2 gradient_accumulation_steps=2 epochs=3 lr=0.0003 cutoff_len=256 lora_r=8 lora_alpha=16 lora_dropout=0.05 val_set_size=0.2 gradient_checkpointing=False gradient_checkpointing_ratio=1 warmup_steps=5 save_steps=50 save_total_limit=3 logging_steps=5 checkpoint=False skip=False world_size=1 ddp=False device_map='auto'
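For anyone comparing setups: the LoRA hyperparameters above map roughly onto a standard peft LoraConfig like the sketch below. This is just an equivalent-settings sketch for reference, not the exact object falcontune builds internally, and the bias/task_type values are assumptions (they are not shown in the dump):

```python
# Rough peft equivalent of the LoRA settings in the parameter dump above (sketch only).
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    r=8,                                 # lora_r=8
    lora_alpha=16,                       # lora_alpha=16
    lora_dropout=0.05,                   # lora_dropout=0.05
    target_modules=["query_key_value"],  # target_modules=['query_key_value']
    bias="none",                         # assumption: not shown in the dump
    task_type=TaskType.CAUSAL_LM,        # assumption: causal LM fine-tuning
)
```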