Dijital-Twin / model


feat: Finetune DialoGPT Medium & Large #9

Closed t4r7k closed 6 months ago

t4r7k commented 6 months ago

Model Overview

DialoGPT is a state-of-the-art large-scale pretrained dialogue response generation model for multi-turn conversations. Human evaluation results indicate that responses generated by DialoGPT are comparable in quality to human responses in a single-turn conversation Turing test. The model was trained on 147M multi-turn dialogues from Reddit discussion threads.

Model URL

https://huggingface.co/docs/transformers/model_doc/dialogpt
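For reference, single-turn generation with the base model follows the standard Transformers causal-LM API; a minimal sketch (the prompt text here is an arbitrary example, not from the project data):

```python
# Single-turn generation with the base DialoGPT-medium model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# DialoGPT delimits each dialogue turn with the EOS token.
input_ids = tokenizer.encode("Hello, how are you?" + tokenizer.eos_token,
                             return_tensors="pt")
output_ids = model.generate(input_ids, max_length=200,
                            pad_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens (the model's reply).
reply = tokenizer.decode(output_ids[0, input_ids.shape[-1]:],
                         skip_special_tokens=True)
print(reply)
```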

t4r7k commented 6 months ago

DialoGPT Fine-tune Report

Commit URL

Introduction

This project involved exploring and fine-tuning language models available on Hugging Face for generating responses that capture the essence of Rachel Green from the TV series "Friends". The DialoGPT medium and large models were chosen for this task due to their capabilities in generating human-like text.

Goal of This Issue

The aim was to fine-tune DialoGPT models to accurately reflect Rachel's personality in their responses, ensuring the outputs feel authentic and character-consistent.


Fine-tune DialoGPT Medium

Dialogues featuring Rachel from 227 episodes were used to fine-tune the DialoGPT medium model. The fine-tuning aimed at enhancing the model's ability to mimic Rachel's speaking style and character nuances.
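The preprocessing script is not included in this report, but a minimal sketch of how transcript lines could be turned into training examples looks like the following. The EOS-separated turn format is DialoGPT's convention; the helper name and sample lines are hypothetical, not the actual dataset:

```python
# Hypothetical sketch: build training strings for fine-tuning, where each of
# Rachel's lines is a target and the preceding turns serve as context.
EOS = "<|endoftext|>"  # GPT-2/DialoGPT end-of-turn token

def build_examples(turns, target_speaker="Rachel", context_size=3):
    """turns: list of (speaker, line) pairs. Returns training strings in
    which every turn is terminated by EOS, as DialoGPT expects."""
    examples = []
    for i, (speaker, line) in enumerate(turns):
        if speaker != target_speaker:
            continue
        context = turns[max(0, i - context_size):i]
        text = "".join(l + EOS for _, l in context) + line + EOS
        examples.append(text)
    return examples

# Hypothetical sample dialogue, not taken from the real transcripts.
sample = [
    ("Monica", "There's nothing to tell!"),
    ("Joey", "C'mon, you're going out with the guy!"),
    ("Rachel", "I wish I could, but I don't want to."),
]
examples = build_examples(sample)
print(examples)
```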

Model Huggingface URL

microsoft/DialoGPT-medium

Training Parameters
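The exact hyperparameters from this run are not reproduced here. As a rough illustration, a Trainer-based causal-LM fine-tuning setup typically looks like the following; all values and the output path are placeholders, not the settings actually used:

```python
# Illustrative only: placeholder hyperparameters, not the project's settings.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dialogpt-rachel-medium",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    warmup_steps=100,
    logging_steps=50,
    save_strategy="epoch",
)
```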

Example Responses to Input Sentences


Fine-tune DialoGPT Large

The DialoGPT large model was also fine-tuned with the same dataset to compare its performance against the medium model in capturing Rachel's character.

Model Huggingface URL

microsoft/DialoGPT-large

Training Parameters

Example Responses to Input Sentences


Evaluation

Upon evaluating both fine-tuned DialoGPT models, we observed that while both effectively captured the essence of Rachel Green's character, the large model demonstrated a slightly more nuanced understanding of her personality and generated responses with greater depth. However, although both models perform well in single-turn conversations, they significantly underperform in multi-turn dialogues: their inability to refer back to earlier context in extended exchanges undermines their suitability as a chatbot for more dynamic interactions. For that reason, we decided to use other models for the project.
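The multi-turn weakness appears even though DialoGPT supports feeding the running history back in on every turn. A sketch of the standard history-concatenation loop (using the base medium model; the prompts are arbitrary examples):

```python
# Multi-turn chat loop: the full exchange is re-fed on every turn, so the
# quality of context use depends entirely on the model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

history = None
replies = []
for user_turn in ["Hi! My name is Ross.", "Do you remember my name?"]:
    new_ids = tokenizer.encode(user_turn + tokenizer.eos_token,
                               return_tensors="pt")
    history = new_ids if history is None else torch.cat([history, new_ids],
                                                        dim=-1)
    output = model.generate(history, max_length=500,
                            pad_token_id=tokenizer.eos_token_id)
    # Decode only the tokens generated beyond the accumulated history.
    reply = tokenizer.decode(output[0, history.shape[-1]:],
                             skip_special_tokens=True)
    replies.append(reply)
    history = output  # carry the full exchange forward
print(replies)
```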