tloen / alpaca-lora

Instruct-tune LLaMA on consumer hardware
Apache License 2.0

File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1033, in _legacy_load magic_number = pickle_module.load(f, **pickle_load_args) _pickle.UnpicklingError: invalid load key, '<'. #588

Open jacolamanna opened 1 year ago

jacolamanna commented 1 year ago

On Colab, I am trying to fine-tune LLaMA on a personal set of prompts, starting from the Italian adapter weights stambecco (https://huggingface.co/mchl-labs/stambecco-7b-plus).

To do so, I set --resume_from_checkpoint='./teelinsan/camoscio-7b-llama' after downloading adapter_config.json and adapter_model.bin into that directory:

```bash
!cd /content/alpaca-lora/ && python finetune.py \
    --base_model='decapoda-research/llama-7b-hf' \
    --data_path='/content/drive/MyDrive/Colab Data/data.json' \
    --output_dir='/content/drive/MyDrive/Colab Data/model' \
    --cutoff_len=512 \
    --batch_size 128 \
    --micro_batch_size 4 \
    --num_epochs 10 \
    --learning_rate 3e-4 \
    --val_set_size 250 \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[q_proj,v_proj]' \
    --train_on_inputs \
    --group_by_length \
    --resume_from_checkpoint='./teelinsan/camoscio-7b-llama'
    # alternative target: --resume_from_checkpoint='./mchl-labs/stambecco-7b-plus'
```
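For context, the resume path in finetune.py boils down to a plain torch.load of the adapter file (per the traceback below). A minimal sketch of that step, using the checkpoint directory from this run:

```python
import os
import torch

# The directory passed via --resume_from_checkpoint, as in this run.
checkpoint_name = os.path.join("./teelinsan/camoscio-7b-llama", "adapter_model.bin")

# torch.load unpickles the file; if something other than a torch
# checkpoint was saved under this name, unpickling fails here.
adapters_weights = torch.load(checkpoint_name, map_location="cpu")
# finetune.py then applies the weights with peft's
# set_peft_model_state_dict(model, adapters_weights).
```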

Execution produces the following error:

```
Loading checkpoint shards: 100% 2/2 [01:08<00:00, 34.29s/it]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
/usr/local/lib/python3.10/dist-packages/peft/utils/other.py:133: FutureWarning: prepare_model_for_int8_training is deprecated and will be removed in a future version. Use prepare_model_for_kbit_training instead.
  warnings.warn(
Restarting from ./teelinsan/camoscio-7b-llama/adapter_model.bin
Traceback (most recent call last):
  File "/content/alpaca-lora/finetune.py", line 283, in <module>
    fire.Fire(train)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/content/alpaca-lora/finetune.py", line 206, in train
    adapters_weights = torch.load(checkpoint_name)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 815, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1033, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
```
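The "load key" pickle complains about is simply the first byte of the file, so it is easy to check what actually got downloaded (a quick diagnostic sketch; the path is the one from the traceback):

```python
# Print the first bytes of the adapter file. A healthy torch checkpoint
# starts with b'PK' (zip format) or pickle protocol bytes; b'<!DOCTYPE'
# or b'<html' means an HTML page was saved instead of the binary.
with open("./teelinsan/camoscio-7b-llama/adapter_model.bin", "rb") as f:
    print(f.read(64))
```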

No luck using /teelinsan/camoscio-7b-llama (without the leading ./) either.

Training succeeds without the resume option, and I can generate from the resulting model. However, I suspect that training on only a few Italian prompts starting from the base model alone (which is not Italian-friendly) would not be effective. Am I wrong?

A similar error occurs when generating from the same base model with the stambecco adapter passed as --lora_weights.
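For comparison, attaching an adapter for generation goes through PEFT, which can pull the files from the Hub itself. A minimal sketch of that pattern (the repo ids are the ones from this issue; the dtype and device_map settings are assumptions):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

# Load the base model and tokenizer.
base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# Passing the Hub repo id lets PEFT download adapter_config.json and
# adapter_model.bin itself, avoiding manually saved (and possibly
# corrupted) copies.
model = PeftModel.from_pretrained(base, "mchl-labs/stambecco-7b-plus")
```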

It is worth noting that no error occurred when I tried the same approach (starting from the adapter model as a checkpoint) using https://github.com/zetavg/LLaMA-LoRA-Tuner.

jacolamanna commented 1 year ago

The adapter_model.bin file was corrupted by a bad download: the '<' load key means the saved file began like an HTML page, most likely the Hugging Face web page rather than the raw binary.
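For anyone hitting the same thing: fetching the files with huggingface_hub instead of saving them from the browser avoids this kind of corruption (a minimal sketch; the repo id and target directory are the ones from this thread):

```python
from huggingface_hub import hf_hub_download

# Fetch the raw adapter files via the Hub API (not the website HTML)
# into the directory passed to --resume_from_checkpoint.
for filename in ("adapter_config.json", "adapter_model.bin"):
    hf_hub_download(
        repo_id="mchl-labs/stambecco-7b-plus",
        filename=filename,
        local_dir="./mchl-labs/stambecco-7b-plus",
    )
```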

After re-downloading it, fine-tuning now starts as usual after `Loading checkpoint shards: 100%`.