rmihaylov / falcontune

Tune any FALCON in 4-bit
Apache License 2.0

RuntimeError: expected scalar type Half but found Char #27

Status: Open | cmazzoni87 opened this issue 1 year ago

cmazzoni87 commented 1 year ago

I'm trying to finetune the 7B model using the sample from the documentation, but I'm getting this error. Has anyone seen this before? The log is here:

bin /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA SETUP: CUDA runtime path found: /home/ec2-user/anaconda3/envs/pytorch_p310/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Downloading (…)model.bin.index.json: 16.9kB [00:00, 57.6MB/s]
Downloading (…)l-00001-of-00002.bin: 100%|██| 9.95G/9.95G [01:00<00:00, 166MB/s]
Downloading (…)l-00002-of-00002.bin: 100%|██| 4.48G/4.48G [00:36<00:00, 123MB/s]
Downloading (…)okenizer_config.json: 100%|█████| 220/220 [00:00<00:00, 1.93MB/s]
Downloading (…)/main/tokenizer.json: 2.73MB [00:00, 17.2MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████| 281/281 [00:00<00:00, 2.00MB/s]

Parameters:
-------config-------
dataset='sql-create-context-train.json'
data_type='alpaca'
lora_out_dir='./falcon-7b-alpaca/'
lora_apply_dir=None
weights='tiiuae/falcon-7b'
target_modules=['query_key_value']

------training------
mbatch_size=1
batch_size=2
gradient_accumulation_steps=2
epochs=3
lr=0.0003
cutoff_len=256
lora_r=8
lora_alpha=16
lora_dropout=0.05
val_set_size=0.2
gradient_checkpointing=False
gradient_checkpointing_ratio=1
warmup_steps=5
save_steps=50
save_total_limit=3
logging_steps=5
checkpoint=False
skip=False
world_size=1
ddp=False
device_map='auto'
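For context, this is roughly what that config corresponds to in peft/transformers terms. It's my own reconstruction from the printed parameters above, not falcontune's actual code, so treat the argument mapping as an assumption:

```python
# Sketch of the LoRA / trainer settings implied by the printed config.
# Reconstructed from the parameters above; falcontune may wire these up differently.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,                                  # lora_r
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./falcon-7b-alpaca/",     # lora_out_dir
    per_device_train_batch_size=1,        # mbatch_size
    gradient_accumulation_steps=2,        # batch_size / mbatch_size
    num_train_epochs=3,
    learning_rate=3e-4,
    warmup_steps=5,
    save_steps=50,
    save_total_limit=3,
    logging_steps=5,
    fp16=True,                            # matches the "cuda_amp half precision backend" line in the log below
)
```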

trainable params: 2359296 || all params: 6924080000 || trainable%: 0.03407378308742822
Downloading and preparing dataset json/default to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4...
Downloading data files: 100%|██████████████████| 1/1 [00:00<00:00, 10230.01it/s]
Extracting data files: 100%|█████████████████████| 1/1 [00:00<00:00, 221.66it/s]
Dataset json downloaded and prepared to /home/ec2-user/.cache/huggingface/datasets/json/default-ad054cd2ceaeb665/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 175.08it/s]
Run eval every 7857 steps
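As a side note, the trainable-parameter count is exactly what LoRA on query_key_value should give, assuming Falcon-7B's published config (hidden_size 4544, 32 decoder layers, multi-query attention with 71 query heads of dim 64 plus one shared key/value pair, so the fused projection maps 4544 -> 4672):

```python
# Sanity check of the "trainable params" line above, assuming Falcon-7B's config.
hidden_size = 4544
qkv_out = (71 + 2) * 64                                    # fused query_key_value output dim = 4672
lora_r = 8
n_layers = 32

trainable = lora_r * (hidden_size + qkv_out) * n_layers    # LoRA A + B matrices per layer
print(trainable)                                           # 2359296, matches the log
print(100 * trainable / 6_924_080_000)                     # ~0.034074 %, matches the log
```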
PyTorch: setting up devices
The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
Using the WANDB_DISABLED environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using cuda_amp half precision backend
wandb: Tracking run with wandb version 0.15.4
wandb: W&B syncing is set to offline in this directory.
wandb: Run wandb online or set WANDB_MODE=online to enable cloud syncing.
The following columns in the training set don't have a corresponding argument in PeftModelForCausalLM.forward and have been ignored: instruction, output, token_type_ids, input. If instruction, output, token_type_ids, input are not expected by PeftModelForCausalLM.forward, you can safely ignore this message.
***** Running training *****
Num examples = 62861
Num Epochs = 3
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 2
Gradient Accumulation steps = 2
Total optimization steps = 94290
0%| | 0/94290 [00:00<?, ?it/s]
You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync /home/ec2-user/SageMaker/wandb/offline-run-20230621_200524-jdrmalza
wandb: Find logs at: ./wandb/offline-run-20230621_200524-jdrmalza/logs
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/bin/falcontune", line 33, in <module>
    sys.exit(load_entry_point('falcontune==0.1.0', 'console_scripts', 'falcontune')())
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/run.py", line 88, in main
    args.func(args)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/finetune.py", line 162, in finetune
    trainer.train()
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1521, in train
    return inner_training_loop(
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 1763, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2499, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/trainer.py", line 2531, in compute_loss
    outputs = model(**inputs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/peft_model.py", line 678, in forward
    return self.base_model(
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 1070, in forward
    transformer_outputs = self.transformer(
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 965, in forward
    outputs = block(
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 698, in forward
    attn_outputs = self.self_attention(
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/falcontune-0.1.0-py3.10.egg/falcontune/model/falcon/model.py", line 291, in forward
    fused_qkv = self.query_key_value(hidden_states)  # [batch_size, seq_length, 3 x hidden_size]
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/peft/tuners/lora.py", line 565, in forward
    result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: expected scalar type Half but found Char