philschmid / deep-learning-pytorch-huggingface

MIT License
580 stars 138 forks source link

What's the use of "messages" in dpo step? #48

Open katopz opened 4 months ago

katopz commented 4 months ago

Refer to: https://github.com/philschmid/deep-learning-pytorch-huggingface/blob/main/training/dpo-align-llms-in-2024-with-trl.ipynb

for prompt in prompts:
  # 👇 No use?
  messages = pipe.tokenizer.apply_chat_template([{"role":"user", "content": prompt}], tokenize=False)
  outputs = pipe(prompt, max_new_tokens=2048, do_sample=True, temperature=1.0, top_k=50, top_p=0.9, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.pad_token_id)
  print(f"**Prompt**:\n{prompt}\n")
  print(f"**Generated Answer**:\n{outputs[0]['generated_text'][len(prompt):].strip()}")
  print("===" * 10)

There's no use here and after?