tamanna-mostafa closed this issue 9 months ago.
Hi @tamanna-mostafa 👋
You'll have to help us figure out what's wrong: can you get us a short, reproducible script that showcases the issue on the transformers side? I see two exceptions in your pasted code, one about text-generation-inference and another about safetensors.
@gante Thanks for your comments. Here is the code I ran (please let me know if you need any further details):
# Config for SFT
mistral-7b-sft-MM-RLAIF:
  dtype: bf16
  log_dir: "mistral-7b-sft-MM-PS"
  learning_rate: 2e-5
  model_name: /mnt/efs/workspace/sakhaki/models/Mistral-7B-v0.1
  deepspeed_config: configs/zero_config_sft_65b.json # configs/zero_config_pretrain.json
  output_dir: /mnt/efs/data/tammosta/files_t/output_sft_32k
  weight_decay: 0.01
  max_length: 4096
  warmup_steps: 100
  gradient_checkpointing: true
  gradient_accumulation_steps: 8
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 1
  eval_steps: 500000
  save_steps: 100
  num_train_epochs: 2
  save_total_limit: 4
  use_flash_attention: false
  residual_dropout: 0.0
  residual_dropout_lima: true
  save_strategy: steps
  peft_model: false
  only_last_turn_loss: false
  use_custom_sampler: true
  datasets:
    - sft-custom:
        data_files: /mnt/efs/data/tammosta/files_t/SFT_inp_26787_RBS_plus_Optima.json
        # fraction: 0.75
        max_val_set: 300
        val_split: 0.0001
    - oasst_export:
        lang: "bg,ca,cs,da,de,en,es,fr,hr,hu,it,nl,pl,pt,ro,ru,sl,sr,sv,uk" # sft-8.0
        hf_dataset_name: OpenAssistant/oasst1
        fraction: 0.5
        val_split: 0.0001
        max_val_set: 300
        top_k: 1
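As a quick sanity check before launching, here is a minimal sketch (the file name configs/config.yaml is an assumption, and PyYAML is required) that loads the block above and prints the main training settings:

# Minimal sketch: load the YAML config and print the resolved SFT settings.
# The file name configs/config.yaml is an assumption; adjust it to wherever
# the "mistral-7b-sft-MM-RLAIF" block above actually lives.
import yaml

with open("configs/config.yaml") as f:
    configs = yaml.safe_load(f)

sft_cfg = configs["mistral-7b-sft-MM-RLAIF"]
for key in ("model_name", "learning_rate", "max_length", "num_train_epochs"):
    print(f"{key}: {sft_cfg[key]}")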
# Run SFT on the Mistral 7B model
deepspeed trainer_sft_d.py --configs mistral-7b-sft-MM-RLAIF --wandb-entity tammosta --show_dataset_stats --deepspeed
# Run DPO on the SFT model
accelerate launch --config_file ./accelerate_configs/ds_zero3.yaml rlhf_dpo.py \
--model_name_or_path="/mnt/efs/data/tammosta/files_t/output_sft_32k" \
--output_dir="/mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k" \
--data_path="/mnt/efs/data/tammosta/files_t/DPO_data_rbs_clean_AIF.json" \
--use_lamma2_peft_config False \
--beta 0.1 \
--optimizer_type adamw_hf \
--learning_rate 1e-6 \
--warmup_steps 50 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 8 \
--lora_alpha 16 \
--lora_dropout 0.05 \
--lora_r 8 \
--max_prompt_length 2048 \
--max_length 4096 \
--num_train_epochs 4 \
--logging_steps 20 \
--save_steps 100 \
--save_total_limit 8 \
--eval_steps 50 \
--gradient_checkpointing True \
--report_to "wandb"
ubuntu@ip-172-31-8-218:/mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k$ ls
README.md adapter_model.safetensors checkpoint-100 checkpoint-300 checkpoint-500 checkpoint-700 global_step736 special_tokens_map.json tokenizer.model training_args.bin
adapter_config.json added_tokens.json checkpoint-200 checkpoint-400 checkpoint-600 final_checkpoint latest tokenizer.json tokenizer_config.json zero_to_fp32.py
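Before merging, a quick check (a sketch; the directory is taken from the listing above) that the adapter checkpoint files exist and were written completely:

# Sketch: confirm the adapter files exist and are not empty or truncated.
import os

out_dir = "/mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k"
for name in ("adapter_config.json", "adapter_model.safetensors"):
    path = os.path.join(out_dir, name)
    if os.path.exists(path):
        print(f"{name}: {os.path.getsize(path) / 1e6:.1f} MB")
    else:
        print(f"{name}: missing")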
# Merge the LoRA adapters
python merge_peft_adaptors_gpu.py --base_model_name_or_path /mnt/efs/data/tammosta/files_t/output_sft_32k --peft_model_path /mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k --output_dir /mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k_merged --safe_serialization
# Content of merge_peft_adaptors_gpu.py
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import os
import argparse
def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--base_model_name_or_path", type=str)
    parser.add_argument("--peft_model_path", type=str)
    parser.add_argument("--output_dir", type=str)
    parser.add_argument("--device", type=str, default="auto")
    parser.add_argument("--safe_serialization", action="store_true")
    return parser.parse_args()

####

def main():
    args = get_args()

    # Map the model onto the requested device(s).
    if args.device == "auto":
        device_arg = {"device_map": "auto"}
    else:
        device_arg = {"device_map": {"": args.device}}

    print(f"Loading base model: {args.base_model_name_or_path}")
    base_model = AutoModelForCausalLM.from_pretrained(
        args.base_model_name_or_path,
        return_dict=True,
        torch_dtype=torch.float16,
        trust_remote_code=True,
        **device_arg,
    )
    # device = torch.device('cpu')
    # base_model.to(device)

    print(f"Loading PEFT: {args.peft_model_path}")
    model = PeftModel.from_pretrained(base_model, args.peft_model_path)
    print("Peft Model : ", model.device)

    # Fold the LoRA weights into the base model and drop the adapter wrappers.
    print("Running merge_and_unload")
    model = model.merge_and_unload()

    tokenizer = AutoTokenizer.from_pretrained(args.base_model_name_or_path)

    # Save the merged model and tokenizer to the output directory.
    model.save_pretrained(args.output_dir, max_shard_size="9GB", safe_serialization=args.safe_serialization)
    tokenizer.save_pretrained(args.output_dir)
    print(f"Model saved to {args.output_dir}")

####

if __name__ == "__main__":
    main()
# The error I get while running the code above
Loading base model: /mnt/efs/data/tammosta/files_t/output_sft_32k
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00, 1.40s/it]
Loading PEFT: /mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k
Traceback (most recent call last):
File "/mnt/efs/data/tammosta/scripts_hb/merge_peft_adaptors_gpu.py", line 51, in <module>
main()
File "/mnt/efs/data/tammosta/scripts_hb/merge_peft_adaptors_gpu.py", line 38, in main
model = PeftModel.from_pretrained(base_model, args.peft_model_path)
File "/opt/conda/envs/ml_v4/lib/python3.10/site-packages/peft/peft_model.py", line 352, in from_pretrained
model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
File "/opt/conda/envs/ml_v4/lib/python3.10/site-packages/peft/peft_model.py", line 689, in load_adapter
adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
File "/opt/conda/envs/ml_v4/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 270, in load_peft_weights
adapters_weights = safe_load_file(filename, device=device)
File "/opt/conda/envs/ml_v4/lib/python3.10/site-packages/safetensors/torch.py", line 308, in load_file
with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
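One way to narrow this down (a diagnostic sketch, not a fix) is to parse the safetensors header by hand: the format begins with an 8-byte little-endian header length followed by that many bytes of UTF-8 JSON, so if either step below fails, adapter_model.safetensors itself is truncated or corrupted rather than there being a bug in transformers or peft:

# Diagnostic sketch: manually parse the safetensors header.
# A valid file starts with an 8-byte little-endian header length, followed by
# that many bytes of UTF-8 JSON describing the tensors.
import json
import struct

path = "/mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k/adapter_model.safetensors"
with open(path, "rb") as f:
    (header_len,) = struct.unpack("<Q", f.read(8))
    print("declared header length:", header_len)
    header = json.loads(f.read(header_len))
    print("tensors in file:", len([k for k in header if k != "__metadata__"]))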
Hi @tamanna-mostafa 👋 Looking at your stack trace, it looks like a peft error; you should open an issue there :)
@gante Issue opened here: https://github.com/huggingface/peft/issues/1443
Closing this ticket since the issue is now tracked in the peft repo linked above.
System Info
Who can help?
@gante @Rocketknight1 @muellerzr and @pacman100
Information
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
However, the docker run on the DPO output directory failed with the following error:
OSError: /data/DPO_output_mistral_32k does not appear to have a file named config.json. Checkout 'https://huggingface.co//data/DPO_output_mistral_32k/None' for available files.
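For context, the DPO output directory is a PEFT adapter checkpoint (adapter_config.json plus adapter_model.safetensors, as in the listing above), not a full standalone model, so any loader that expects a config.json in that directory will fail. A quick check (a sketch, reusing the path from above):

# Sketch: distinguish a full model directory from an adapter-only checkpoint.
import json
import os

ckpt = "/mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k"
print("full-model config present:", os.path.exists(os.path.join(ckpt, "config.json")))

adapter_cfg = os.path.join(ckpt, "adapter_config.json")
if os.path.exists(adapter_cfg):
    with open(adapter_cfg) as f:
        print("adapter was trained on base model:", json.load(f).get("base_model_name_or_path"))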
I ran the adapter-merge script as follows:
python merge_peft_adaptors_gpu.py --base_model_name_or_path /mnt/efs/data/tammosta/files_t/output_sft_32k --peft_model_path /mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k --output_dir /mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k_merged --safe_serialization
The content of merge_peft_adaptors_gpu.py is shown above. However, I'm getting the safetensors error pasted above. Any idea why I'm getting this error?
Expected behavior
The merged model will successfully load in the output directory.
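To confirm that expectation once the merge succeeds, here is a small verification sketch (the prompt is arbitrary; the directory is the --output_dir passed to the merge script):

# Sketch: verify the merged directory loads as a plain (non-PEFT) model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_dir = "/mnt/efs/data/tammosta/files_t/DPO_output_mistral_32k_merged"
tokenizer = AutoTokenizer.from_pretrained(merged_dir)
model = AutoModelForCausalLM.from_pretrained(merged_dir, torch_dtype=torch.float16, device_map="auto")

inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))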