Dijital-Twin / model


feat: Finetune Llama 2 7B #11

Closed: t4r7k closed this issue 6 months ago

t4r7k commented 6 months ago

Model Overview

Llama 2 7B is a state-of-the-art language model designed to excel at a variety of natural language processing tasks, including text generation, comprehension, and summarization. It is part of the larger Llama 2 series, which offers several model sizes to suit different computational budgets and application scenarios. The 7B variant strikes a balance between performance and resource efficiency, making it a strong choice for developers and researchers who want high-quality language understanding and generation without the computational overhead of the larger models.

Llama 2 7B was trained on a diverse dataset covering a wide range of topics and styles drawn from books, websites, and other text-rich sources. This broad training enables the model to understand and generate human-like text across many domains, with notable proficiency in language understanding, creative writing, and informative summarization.

Model URL

https://llama.meta.com

t4r7k commented 6 months ago

LLaMA Fine-tuning Report

Commit URL

Introduction

This part of the project focused on exploring language models available on Hugging Face and identifying suitable candidates for our use case. The selection criteria included each model's intended purpose, parameter count, output quality, training languages, and output languages. This report details the fine-tuning of the meta-llama/Llama-2-7b-chat-hf model to adopt the personality traits of Rachel Green from the TV series "Friends".
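For context, the base model can be loaded directly from Hugging Face. A minimal sketch; note the meta-llama repo is gated, so this assumes an authenticated account that has accepted Meta's license:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"

# Gated repo: requires a Hugging Face token for an account
# that has accepted Meta's Llama 2 license.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```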

Goal of This Issue

The primary objective was to fine-tune the LLaMA model to capture and reflect the personality of Rachel Green in its responses, thereby enhancing the relatability and authenticity of the generated text.


Fine-tuning LLaMA-2-7b-chat-hf

We used Rachel's dialogue from 227 episodes for the fine-tuning process. The following sections summarize the parameters used for fine-tuning and showcase the model's responses to selected input sentences.
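The preprocessing code is not included in this report, but pairing script lines into prompt/response examples could look roughly like the sketch below. The transcript file name, column names, and the Llama 2 chat template wrapping are assumptions, not taken from the actual pipeline:

```python
import pandas as pd
from datasets import Dataset

# Hypothetical transcript file: one row per spoken line,
# with "speaker" and "line" columns, in script order.
df = pd.read_csv("friends_transcripts.csv")

examples = []
for prev, cur in zip(df.itertuples(), df.iloc[1:].itertuples()):
    # Keep pairs where Rachel replies to another character's line.
    if cur.speaker == "Rachel Green" and prev.speaker != "Rachel Green":
        # Llama 2 chat format: the user turn goes inside [INST] ... [/INST].
        examples.append({"text": f"<s>[INST] {prev.line} [/INST] {cur.line} </s>"})

dataset = Dataset.from_list(examples)
```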

Model Hugging Face URL

meta-llama/Llama-2-7b-chat-hf

Parameters and Information

Example Responses to Input Sentences

  1. Prompt: "Come on Rach, you don't have what, ten minutes?"

    • Response: "Yeah, I've got an hour."
  2. Prompt: "Rach, are you okay?"

    • Response: "Yeah, I'm fine. I'm just a little shaken up."
  3. Prompt: "Hey uh, Rach, funny story. I ah, bumped into Joanna on the street yesterday."

    • Response: "She said she’s been dating this guy, and he’s really nice."
  4. Prompt: "Rachel always cries!"

    • Response: "Rachel"
  5. Prompt: "We were on a break!"

    • Response: "It was not a break."
  6. Prompt: "Oh, Rachel, actually, y'know what, forget it."

    • Response: "I'm not gonna do that. I don't want to be the one to break up your relationship. I mean, I don't think that would be"
  7. Prompt: "Do you remember the day we had dessert?"

    • Response: "Oh, I remember it. Wow, that was a good day."

Configuration and Fine-tuning Details

QLoRA Parameters
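The exact values from the run are not reproduced above; a representative LoRA configuration for a 7B QLoRA fine-tune (illustrative values only) might look like:

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,              # rank of the LoRA update matrices
    lora_alpha=16,     # scaling factor applied to the LoRA updates
    lora_dropout=0.1,  # dropout on the LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
)
```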

BitsAndBytes Parameters
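Likewise, a typical 4-bit quantization setup for loading the base model (illustrative, not the recorded values):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize base weights to 4 bits
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for compute
    bnb_4bit_use_double_quant=False,
)
```

The config is passed to AutoModelForCausalLM.from_pretrained through its quantization_config argument.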

TrainingArguments Parameters
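A representative set of training arguments for this kind of run (again, illustrative values):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",  # paged optimizer to limit GPU memory spikes
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=True,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    logging_steps=25,
)
```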

Supervised Fine-Tuning (SFT) Parameters
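Finally, trl's SFTTrainer ties the pieces together. The argument names below match trl releases from around the time of this work and may differ in newer versions; model, dataset, peft_config, tokenizer, and training_args refer to the sketches above:

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,                # 4-bit base model loaded with bnb_config
    train_dataset=dataset,      # the Rachel dialogue dataset from above
    peft_config=peft_config,
    dataset_text_field="text",  # column holding the formatted prompts
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_args,
    packing=False,
)
trainer.train()
```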

Evaluation

After comparing the fine-tuned LLaMA model with the other candidates, we found that it offered a promising balance between personalization and output quality, effectively capturing the essence of Rachel Green's character. Despite these strengths, we ultimately chose the Mistral model instead, based on its stronger benchmark performance and our assessment that its answers fit the character's personality better.