[2023-12-31 13:53:08,703] [WARNING] [axolotl.validate_config:250] [PID:17802] [RANK:0] trust_remote_code is set to true. Please make sure that you reviewed the remote code/model.
[2023-12-31 13:53:09,764] [INFO] [axolotl.normalize_config:150] [PID:17802] [RANK:0] GPU memory usage baseline: 0.000GB (+0.886GB misc)
[2023-12-31 13:53:09,765] [INFO] [axolotl.common.cli.load_model_and_tokenizer:49] [PID:17802] [RANK:0] loading tokenizer... microsoft/phi-1_5
[2023-12-31 13:53:10,104] [DEBUG] [axolotl.load_tokenizer:185] [PID:17802] [RANK:0] EOS: 50256 / <|endoftext|>
[2023-12-31 13:53:10,104] [DEBUG] [axolotl.load_tokenizer:186] [PID:17802] [RANK:0] BOS: 50256 / <|endoftext|>
[2023-12-31 13:53:10,104] [DEBUG] [axolotl.load_tokenizer:187] [PID:17802] [RANK:0] PAD: 50256 / <|endoftext|>
[2023-12-31 13:53:10,105] [DEBUG] [axolotl.load_tokenizer:188] [PID:17802] [RANK:0] UNK: 50256 / <|endoftext|>
[2023-12-31 13:53:10,105] [INFO] [axolotl.load_tokenizer:193] [PID:17802] [RANK:0] No Chat template selected. Consider adding a chat template for easier inference.
[2023-12-31 13:53:10,105] [INFO] [axolotl.common.cli.load_model_and_tokenizer:51] [PID:17802] [RANK:0] loading model and (optionally) peft_config...
[2023-12-31 13:53:15,476] [INFO] [axolotl.load_model:517] [PID:17802] [RANK:0] GPU memory usage after model load: 2.642GB (+0.048GB cache, +1.321GB misc)
Give me an instruction (Ctrl + D to submit):
what's your name?
what's your
Traceback (most recent call last):
  File "/root/anaconda3/envs/axolotl/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/anaconda3/envs/axolotl/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/axolotl/src/axolotl/cli/inference.py", line 36, in <module>
    fire.Fire(do_cli)
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/root/axolotl/src/axolotl/cli/inference.py", line 32, in do_cli
    do_inference(cfg=parsed_cfg, cli_args=parsed_cli_args)
  File "/root/axolotl/src/axolotl/cli/__init__.py", line 142, in do_inference
    generated = model.generate(
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/transformers/generation/utils.py", line 1764, in generate
    return self.sample(
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/transformers/generation/utils.py", line 2861, in sample
    outputs = self(
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/axolotl/src/axolotl/models/phi/modeling_phi.py", line 1048, in forward
    hidden_states = self.transformer(
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/axolotl/src/axolotl/models/phi/modeling_phi.py", line 997, in forward
    hidden_states = layer(
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/axolotl/src/axolotl/models/phi/modeling_phi.py", line 844, in forward
    attn_outputs = self.mixer(
  File "/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/axolotl/src/axolotl/models/phi/modeling_phi.py", line 794, in forward
    attn_output = self._forward_cross_attn(
TypeError: _forward_cross_attn() got an unexpected keyword argument 'cu_seqlens'
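The final TypeError says that a caller passed a `cu_seqlens` keyword to `_forward_cross_attn()`, whose signature does not declare it. The sketch below reproduces that failure mode in isolation and shows one possible guard. It uses a hypothetical `Mixer` class with made-up parameter names, not the code in modeling_phi.py, and it does not establish whether dropping `cu_seqlens` is the right fix for axolotl.

```python
import inspect


class Mixer:
    """Hypothetical stand-in for an attention module (not axolotl's code)."""

    def _forward_cross_attn(self, x, key_padding_mask=None):
        # This signature declares neither `cu_seqlens` nor **kwargs.
        return x

    def forward(self, x, **kwargs):
        # Forwarding every keyword blindly reproduces the reported error.
        return self._forward_cross_attn(x, **kwargs)

    def forward_filtered(self, x, **kwargs):
        # Possible guard: keep only the keywords the callee declares.
        accepted = inspect.signature(self._forward_cross_attn).parameters
        supported = {k: v for k, v in kwargs.items() if k in accepted}
        return self._forward_cross_attn(x, **supported)


mixer = Mixer()

error_message = None
try:
    mixer.forward([1, 2, 3], cu_seqlens=None)
except TypeError as exc:
    error_message = str(exc)

# The filtered variant silently drops the unsupported keyword.
result = mixer.forward_filtered([1, 2, 3], cu_seqlens=None)
print(error_message)
print(result)
```

Filtering keywords only hides the mismatch; if the caller relies on `cu_seqlens` for sequence packing, the callee would need to accept and use it instead.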
Steps to reproduce
Train a full Phi fine-tune on a local dataset, with no changes to the config apart from eval_sample_packing: false, then run inference on the result.
Expected Behavior
Inference should work out of the box after a full fine-tune.
Current behaviour
(axolotl) root@Transformers:~/axolotl# python -m axolotl.cli.inference examples/phi/phi-ft.yml --lora-model-dir="./phi-sft-out"
/root/anaconda3/envs/axolotl/lib/python3.9/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
[2023-12-31 13:53:07,441] [INFO] [datasets.<module>:58] [PID:17802] PyTorch version 2.0.1 available.
Config yaml
base_model: microsoft/phi-1_5
model_type: PhiForCausalLM
tokenizer_type: AutoTokenizer
is_llama_derived_model: false
trust_remote_code: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./phi-sft-out

sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true
eval_sample_packing: false

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear:
lora_fan_in_fan_out:

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 1
micro_batch_size: 1
num_epochs: 4
optimizer: adamw_torch
adam_beta2: 0.95
adam_epsilon: 0.00001
max_grad_norm: 1.0
lr_scheduler: cosine
learning_rate: 0.000003

train_on_inputs: false
group_by_length: true
bf16: true
fp16: false
tf32: true

gradient_checkpointing:
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: false

warmup_steps: 100
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.1
fsdp:
fsdp_config:
resize_token_embeddings_to_32x: true
special_tokens:
  bos_token: "<|endoftext|>"
  eos_token: "<|endoftext|>"
  unk_token: "<|endoftext|>"
  pad_token: "<|endoftext|>"
Possible solution
No response
Which Operating Systems are you using?
Python Version
3.9
axolotl branch-commit
main