huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0

Low loss but can't get the expected output during inference #1790

zczlsde closed this issue 2 months ago

zczlsde commented 3 months ago

Hi, I am training Llama 3 with the SFTTrainer to fit a single data point. The training code and arguments are shown below:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

base_model = "meta-llama/Meta-Llama-3-8B"

dataset = load_dataset("json", data_files="data.json", split="train")
compute_dtype = getattr(torch, "float16")

# 4-bit NF4 quantization for the base model
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    quantization_config=quant_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# LoRA adapter configuration
peft_params = LoraConfig(
    lora_alpha=32,
    lora_dropout=0.05,
    r=16,
    bias="none",
    task_type="CAUSAL_LM",
)

training_params = TrainingArguments(
    output_dir="./results_1",
    num_train_epochs=100,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    optim="adafactor",
    save_steps=50,
    logging_steps=1,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
    report_to="tensorboard",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_params,
    dataset_text_field="text",
    max_seq_length=None,
    tokenizer=tokenizer,
    dataset_batch_size=1,
    args=training_params,
    packing=False,
)

trainer.train()

# Save the trained adapter and the tokenizer
refined_model = "LLM"
trainer.model.save_pretrained(refined_model)
tokenizer.save_pretrained(refined_model)
```

The loss decreased from 1.6 to 0.0 and the model converged. I then loaded the fine-tuned model and used the same training data point as the input. However, the model does not produce the expected output for this data point. Does anyone have an idea what causes this problem? With a loss of 0.0, I would expect it to give the correct output.
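To make that expectation concrete, a mean cross-entropy loss can be converted into the average (geometric-mean) probability the model assigns to each target token under teacher forcing. The sketch below is illustrative arithmetic only: the helper names are hypothetical, the 200-token length is an assumed example rather than a value from this issue, and it assumes the logged loss is rounded to four decimals, so a displayed 0.0 only bounds the true value.

```python
import math


def mean_token_prob(mean_ce_loss: float) -> float:
    """Geometric-mean probability per target token implied by a mean
    cross-entropy loss measured in nats."""
    return math.exp(-mean_ce_loss)


def sequence_prob(mean_ce_loss: float, num_target_tokens: int) -> float:
    """Probability of the whole target sequence under teacher forcing,
    given the mean per-token loss and the number of target tokens."""
    return math.exp(-mean_ce_loss * num_target_tokens)


# Loss values reported in the issue.
print(round(mean_token_prob(1.6), 4))  # 0.2019
print(mean_token_prob(0.0))            # 1.0

# Assumption: a logged 0.0 could hide a true loss of up to 0.00005.
# Assumption: 200 target tokens, chosen only for illustration.
print(round(sequence_prob(0.00005, 200), 4))  # 0.99
```

If the per-token probability really is this close to 1, greedy decoding from exactly the same token prefix and the same model numerics should reproduce the target. A mismatch therefore more likely points at a difference between the training and inference setups (prompt tokens, quantization, generation length) than at under-training.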

The code for inference is below:

```python
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer

refined_model = "LLM"

login(token="")
tokenizer = AutoTokenizer.from_pretrained(refined_model)

model = AutoModelForCausalLM.from_pretrained(refined_model, device_map="auto")

input_text = "Your task is......"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

# Greedy decoding
outputs = model.generate(**input_ids, do_sample=False)
print(tokenizer.decode(outputs[0]))
```

The current output is very different from the ground truth data point.
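One possible diagnostic, offered as a sketch rather than a confirmed cause: check whether the token ids of the inference prompt are an exact prefix of the token ids of the training example. Tokenizing a prompt on its own can differ from tokenizing the full training text (special tokens, trailing whitespace, merges at the boundary), and in that case the model is conditioned on a prefix it never saw. The helper `first_mismatch` and the toy ids below are hypothetical and only illustrate the comparison.

```python
from typing import Sequence


def first_mismatch(prompt_ids: Sequence[int], train_ids: Sequence[int]) -> int:
    """Return the first index where prompt_ids stops being a prefix of
    train_ids, or -1 if prompt_ids is a full prefix."""
    for i, tok in enumerate(prompt_ids):
        if i >= len(train_ids) or tok != train_ids[i]:
            return i
    return -1


# Toy ids standing in for real tokenizer output (illustrative only).
train_ids = [1, 15, 27, 42, 99, 2]
good_prompt = [1, 15, 27]
bad_prompt = [1, 15, 28]

print(first_mismatch(good_prompt, train_ids))  # -1
print(first_mismatch(bad_prompt, train_ids))   # 2

# With the objects from the issue, the inputs would be obtained roughly as:
#   prompt_ids = tokenizer(input_text)["input_ids"]
#   train_ids = tokenizer(dataset[0]["text"])["input_ids"]
```

A result other than -1 would suggest the prompt used at inference is not tokenized the same way as the start of the training text, which is worth ruling out before looking elsewhere.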

zhuqiangLu commented 3 months ago

I am actually getting the same issue.

Also, you left your HF token in your question.

zczlsde commented 3 months ago

> I am actually getting the same issue.
>
> Also, you left your HF token in your question.

Oh, thanks. You can check the generation parameters; the problem may be solved by changing some of them.
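The thread does not say which parameters were changed, so the following is only a guess at what such an adjustment might look like. It collects a few `generate` keyword arguments that commonly matter for reproducing a memorized sequence: greedy decoding, an explicit `max_new_tokens` (depending on the model's generation config, the default length limit may be short), and explicit end-of-sequence and padding ids. The helper name and the default of 256 tokens are assumptions for illustration.

```python
def greedy_generation_kwargs(eos_token_id: int, max_new_tokens: int = 256) -> dict:
    """Build keyword arguments for deterministic (greedy) generation.

    max_new_tokens=256 is an assumed value; set it to at least the
    length of the expected output.
    """
    return {
        "do_sample": False,                # greedy decoding, no sampling
        "max_new_tokens": max_new_tokens,  # explicit output length budget
        "eos_token_id": eos_token_id,      # stop when EOS is produced
        "pad_token_id": eos_token_id,      # the issue sets pad_token = eos_token
    }


# Placeholder id for illustration; in practice use tokenizer.eos_token_id.
kwargs = greedy_generation_kwargs(eos_token_id=2)
print(kwargs["max_new_tokens"])  # 256

# With the objects from the issue, this would be used roughly as:
#   outputs = model.generate(
#       **input_ids, **greedy_generation_kwargs(tokenizer.eos_token_id)
#   )
```

Whether any of these settings resolves the mismatch reported here is not confirmed in the thread.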

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.