Closed mathiasesn closed 9 months ago
Expected Behavior

The training should begin.

Current behaviour
accelerate launch -m axolotl.cli.train examples/mistral/qlora.yml
The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
2023-11-02 16:09:38.417662: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-02 16:09:38.484621: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-11-02 16:09:38.487148: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-11-02 16:09:38.487158: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-11-02 16:09:38.500337: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-02 16:09:38.781474: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-11-02 16:09:38.781510: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-11-02 16:09:38.781514: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
/home/mathias/.local/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
[axolotl ASCII art banner]
[2023-11-02 16:09:39,917] [WARNING] [axolotl.validate_config:169] [PID:105218] [RANK:0] eval_batch_size != micro_batch_size. This can lead to VRAM instability.
[2023-11-02 16:09:40,100] [INFO] [axolotl.normalize_config:128] [PID:105218] [RANK:0] GPU memory usage baseline: 0.000GB (+18.426GB misc)
[2023-11-02 16:09:40,100] [WARNING] [axolotl.scripts.check_user_token:268] [PID:105218] [RANK:0] Error verifying HuggingFace token. Remember to log in using `huggingface-cli login` and get your access token from https://huggingface.co/settings/tokens if you want to use gated models or datasets.
[2023-11-02 16:09:40,287] [DEBUG] [axolotl.load_tokenizer:96] [PID:105218] [RANK:0] EOS: 2 / </s>
[2023-11-02 16:09:40,287] [DEBUG] [axolotl.load_tokenizer:97] [PID:105218] [RANK:0] BOS: 1 / <s>
[2023-11-02 16:09:40,287] [DEBUG] [axolotl.load_tokenizer:98] [PID:105218] [RANK:0] PAD: 2 / </s>
[2023-11-02 16:09:40,287] [DEBUG] [axolotl.load_tokenizer:99] [PID:105218] [RANK:0] UNK: 0 / <unk>
[2023-11-02 16:09:40,287] [INFO] [axolotl.load_tokenized_prepared_datasets:133] [PID:105218] [RANK:0] Unable to find prepared dataset in last_run_prepared/79fe5144e8e385dc65045e15b51b2838
[2023-11-02 16:09:40,287] [INFO] [axolotl.load_tokenized_prepared_datasets:134] [PID:105218] [RANK:0] Loading raw datasets...
[2023-11-02 16:09:40,287] [INFO] [axolotl.load_tokenized_prepared_datasets:139] [PID:105218] [RANK:0] No seed provided, using default seed of 42
Map (num_proc=24): 100%|██████████| 2000/2000 [00:00<00:00, 6047.53 examples/s]
[2023-11-02 16:09:45,058] [INFO] [axolotl.load_tokenized_prepared_datasets:281] [PID:105218] [RANK:0] merging datasets
[2023-11-02 16:09:45,060] [INFO] [axolotl.load_tokenized_prepared_datasets:288] [PID:105218] [RANK:0] Saving merged prepared dataset to disk... last_run_prepared/79fe5144e8e385dc65045e15b51b2838
Saving the dataset (1/1 shards): 100%|██████████| 2000/2000 [00:00<00:00, 285404.46 examples/s]
Filter (num_proc=24): 100%|██████████| 1980/1980 [00:00<00:00, 15294.03 examples/s]
Filter (num_proc=20): 100%|██████████| 20/20 [00:00<00:00, 189.24 examples/s]
Map (num_proc=24): 100%|██████████| 1980/1980 [00:00<00:00, 13560.94 examples/s]
Map (num_proc=20): 100%|██████████| 20/20 [00:00<00:00, 164.15 examples/s]
[2023-11-02 16:09:46,088] [INFO] [axolotl.calculate_total_num_steps:156] [PID:105218] [RANK:0] calculating total_num_tokens
[2023-11-02 16:09:46,090] [INFO] [axolotl.calculate_total_num_steps:163] [PID:105218] [RANK:0] total_num_tokens: 426849
[2023-11-02 16:09:46,098] [INFO] [axolotl.calculate_total_num_steps:173] [PID:105218] [RANK:0] `total_supervised_tokens: 294561`
[2023-11-02 16:09:46,100] [INFO] [axolotl.utils.dataloader.generate_batches:225] [PID:105218] [RANK:0] generating packed batches
[2023-11-02 16:09:46,101] [INFO] [axolotl.utils.dataloader.generate_batches:231] [PID:105218] [RANK:0] 04eb73112c686fd33f79315115335175d7e6f9ed53cb34af6f8ff4b46d340184
[2023-11-02 16:09:48,416] [INFO] [axolotl.utils.dataloader.len_w_stats:335] [PID:105218] [RANK:0] packing_efficiency_estimate: 1.0 actual packing efficiency: 0.9649183485243056
[2023-11-02 16:09:48,416] [INFO] [axolotl.utils.dataloader._len_est:304] [PID:105218] [RANK:0] packing_efficiency_estimate: 1.0 total_num_tokens per device: 426849
[2023-11-02 16:09:48,416] [INFO] [axolotl.calculate_total_num_steps:223] [PID:105218] [RANK:0] data_loader_len: 24
[2023-11-02 16:09:48,416] [INFO] [axolotl.calc_sample_packing_eff_est:229] [PID:105218] [RANK:0] sample_packing_eff_est across ranks: [0.9649183485243056]
[2023-11-02 16:09:48,416] [INFO] [axolotl.calculate_total_num_steps:240] [PID:105218] [RANK:0] sample_packing_eff_est: 0.97
[2023-11-02 16:09:48,416] [INFO] [axolotl.calculate_total_num_steps:245] [PID:105218] [RANK:0] total_num_steps: 24
[2023-11-02 16:09:48,419] [INFO] [axolotl.train.train:47] [PID:105218] [RANK:0] loading tokenizer... mistralai/Mistral-7B-v0.1
[2023-11-02 16:09:48,606] [DEBUG] [axolotl.load_tokenizer:96] [PID:105218] [RANK:0] EOS: 2 / </s>
[2023-11-02 16:09:48,606] [DEBUG] [axolotl.load_tokenizer:97] [PID:105218] [RANK:0] BOS: 1 / <s>
[2023-11-02 16:09:48,606] [DEBUG] [axolotl.load_tokenizer:98] [PID:105218] [RANK:0] PAD: 2 / </s>
[2023-11-02 16:09:48,606] [DEBUG] [axolotl.load_tokenizer:99] [PID:105218] [RANK:0] UNK: 0 / <unk>
[2023-11-02 16:09:48,606] [INFO] [axolotl.train.train:55] [PID:105218] [RANK:0] loading model and (optionally) peft_config...
[2023-11-02 16:09:48,719] [INFO] [axolotl.load_model:180] [PID:105218] [RANK:0] patching with flash attention
Loading checkpoint shards: 100%|██████████| 2/2 [00:11<00:00, 5.78s/it]
[2023-11-02 16:10:01,630] [INFO] [axolotl.load_model:404] [PID:105218] [RANK:0] GPU memory usage after model load: 4.349GB (+0.154GB cache, +18.718GB misc)
[2023-11-02 16:10:01,645] [INFO] [axolotl.load_model:421] [PID:105218] [RANK:0] converting PEFT model w/ prepare_model_for_kbit_training
[2023-11-02 16:10:01,646] [INFO] [axolotl.load_model:432] [PID:105218] [RANK:0] converting modules to torch.bfloat16 for flash attention
[2023-11-02 16:10:01,648] [INFO] [axolotl.load_lora:541] [PID:105218] [RANK:0] found linear modules: ['o_proj', 'q_proj', 'down_proj', 'gate_proj', 'v_proj', 'up_proj', 'k_proj']
trainable params: 83,886,080 || all params: 7,325,618,176 || trainable%: 1.1451058188485088
[2023-11-02 16:10:22,854] [INFO] [axolotl.load_model:468] [PID:105218] [RANK:0] GPU memory usage after adapters: 4.679GB (+0.218GB cache, +18.718GB misc)
[2023-11-02 16:10:22,858] [INFO] [axolotl.train.train:83] [PID:105218] [RANK:0] Pre-saving adapter config to ./qlora-out
[2023-11-02 16:10:22,859] [INFO] [axolotl.train.train:107] [PID:105218] [RANK:0] Starting trainer...
[2023-11-02 16:10:22,978] [INFO] [axolotl.utils.dataloader._len_est:304] [PID:105218] [RANK:0] packing_efficiency_estimate: 0.97 total_num_tokens per device: 426849
[2023-11-02 16:10:22,978] [INFO] [axolotl.utils.dataloader._len_est:304] [PID:105218] [RANK:0] packing_efficiency_estimate: 0.97 total_num_tokens per device: 426849
  0%|          | 0/6 [00:00<?, ?it/s]
[2023-11-02 16:10:23,003] [INFO] [axolotl.utils.dataloader._len_est:304] [PID:105218] [RANK:0] packing_efficiency_estimate: 0.97 total_num_tokens per device: 426849
[2023-11-02 16:10:23,004] [INFO] [axolotl.utils.dataloader.generate_batches:225] [PID:105218] [RANK:0] generating packed batches
[2023-11-02 16:10:23,004] [INFO] [axolotl.utils.dataloader.generate_batches:231] [PID:105218] [RANK:0] 95be00c870cb4642e0ccbd683d94d7f48db802945935b4c592d35b9325f3cb70
[2023-11-02 16:10:23,005] [INFO] [axolotl.utils.dataloader.len_w_stats:335] [PID:105218] [RANK:0] packing_efficiency_estimate: 0.97 actual packing efficiency: 0.9649183485243056
[2023-11-02 16:10:23,005] [INFO] [axolotl.utils.dataloader._len_est:304] [PID:105218] [RANK:0] packing_efficiency_estimate: 0.97 total_num_tokens per device: 426849
[2023-11-02 16:10:23,005] [INFO] [axolotl.utils.dataloader._worker:192] [PID:105218] [RANK:0] [WORKER] Epochs: 1, Samples: 50
[2023-11-02 16:10:23,005] [INFO] [axolotl.utils.dataloader.generate_batches:225] [PID:105218] [RANK:0] generating packed batches
[2023-11-02 16:10:23,005] [INFO] [axolotl.utils.dataloader.generate_batches:231] [PID:105218] [RANK:0] 108564450b4c7e49268b3732f8d4d1d60e9c4d2dd52d1dcb3b4d078e1632f9ba
[2023-11-02 16:10:23,006] [INFO] [axolotl.utils.dataloader._len_est:304] [PID:105218] [RANK:0] packing_efficiency_estimate: 0.97 total_num_tokens per device: 426849
Traceback (most recent call last):
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  [Previous line repeated 988 more times]
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 159, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 290, in pre_forward
    return send_to_device(args, self.execution_device), send_to_device(
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 151, in send_to_device
    return honor_type(
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 83, in honor_type
    return type(obj)(generator)
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 152, in <genexpr>
    tensor, (send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys) for t in tensor)
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 154, in send_to_device
    elif isinstance(tensor, Mapping):
  File "/usr/lib/python3.10/typing.py", line 994, in __instancecheck__
    return self.__subclasscheck__(type(obj))
  File "/usr/lib/python3.10/typing.py", line 1158, in __subclasscheck__
    return issubclass(cls, self.__origin__)
  File "/usr/lib/python3.10/abc.py", line 123, in __subclasscheck__
    return _abc_subclasscheck(cls, subclass)
RecursionError: maximum recursion depth exceeded in comparison
  0%|          | 0/6 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/mathias/.local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 994, in launch_command
    simple_launcher(args)
  File "/home/mathias/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 636, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-m', 'axolotl.cli.train', 'examples/mistral/qlora.yml']' returned non-zero exit status 1.
Steps to reproduce

Add `noisy_embedding_alpha: 5` to `examples/mistral/qlora.yml`, then run:

accelerate launch -m axolotl.cli.train examples/mistral/qlora.yml
Config yaml

base_model: mistralai/Mistral-7B-v0.1
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:

adapter: qlora
lora_model_dir:

sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:

wandb_project:
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
eval_steps: 0.05
eval_table_size:
eval_table_max_new_tokens: 128
save_steps:
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
noisy_embedding_alpha: 5
Possible solution

No response

Which Operating Systems are you using?

Python Version

3.10

axolotl branch-commit

main
Same error here.
hmm, we might have to change the implementation to use HF's native NEFT.