Dijital-Twin / model


feat: Finetune Llama 2 7B #11

Closed: t4r7k closed this issue 6 months ago

t4r7k commented 6 months ago

Model Overview

Llama 2 7B is a state-of-the-art language model designed to excel at a variety of natural language processing tasks, including text generation, comprehension, and summarization. It is part of the larger Llama 2 series, which offers several model sizes to suit different computational budgets and application scenarios. The 7B variant strikes a balance between performance and resource efficiency, making it a strong choice for developers and researchers who want high-quality language understanding and generation without the computational overhead of the larger models.

Llama 2 7B was trained on a diverse dataset covering a wide range of topics and styles drawn from books, websites, and other text-rich sources. This broad training enables the model to understand and generate human-like text across many domains, with notable proficiency in language understanding, creative writing, and informative summarization.

Model URL

https://llama.meta.com

t4r7k commented 6 months ago

LLaMA Fine-tuning Report

Commit URL

Introduction

This part of the project focused on exploring language models available on Hugging Face and identifying suitable candidates for our use case. The selection criteria included each model's intended purpose, parameter count, output quality, training languages, and output languages. This report details the fine-tuning of the meta-llama/Llama-2-7b-chat-hf model to adopt the personality traits of Rachel Green from the TV series "Friends".
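For context, the base model can be loaded directly from Hugging Face. A minimal sketch; note the meta-llama repo is gated, so this assumes an authenticated account that has accepted Meta's license:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"

# Gated repo: requires a Hugging Face token for an account
# that has accepted Meta's Llama 2 license.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```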

Goal of This Issue

The primary objective was to fine-tune the LLaMA model to capture and reflect the personality of Rachel Green in its responses, thereby enhancing the relatability and authenticity of the generated text.


Fine-tuning LLaMA-2-7b-chat-hf

We used Rachel's dialogue from 227 episodes for the fine-tuning process. The following sections summarize the parameters used for fine-tuning and showcase the model's responses to selected input sentences.
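The preprocessing code is not included in this report, but pairing script lines into prompt/response examples could look roughly like the sketch below. The transcript file name, column names, and the Llama 2 chat template wrapping are assumptions, not taken from the actual pipeline:

```python
import pandas as pd
from datasets import Dataset

# Hypothetical transcript file: one row per spoken line,
# with "speaker" and "line" columns, in script order.
df = pd.read_csv("friends_transcripts.csv")

examples = []
for prev, cur in zip(df.itertuples(), df.iloc[1:].itertuples()):
    # Keep pairs where Rachel replies to another character's line.
    if cur.speaker == "Rachel Green" and prev.speaker != "Rachel Green":
        # Llama 2 chat format: the user turn goes inside [INST] ... [/INST].
        examples.append({"text": f"<s>[INST] {prev.line} [/INST] {cur.line} </s>"})

dataset = Dataset.from_list(examples)
```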

Model Hugging Face URL

meta-llama/Llama-2-7b-chat-hf

Parameters and Information

Example Responses to Input Sentences

  1. Prompt: "Come on Rach, you don't have what, ten minutes?"

    • Response: "Yeah, I've got an hour."
  2. Prompt: "Rach, are you okay?"

    • Response: "Yeah, I'm fine. I'm just a little shaken up."
  3. Prompt: "Hey uh, Rach, funny story. I ah, bumped into Joanna on the street yesterday."

    • Response: "She said she’s been dating this guy, and he’s really nice."
  4. Prompt: "Rachel always cries!"

    • Response: "Rachel"
  5. Prompt: "We were on a break!"

    • Response: "It was not a break."
  6. Prompt: "Oh, Rachel, actually, y'know what, forget it."

    • Response: "I'm not gonna do that. I don't want to be the one to break up your relationship. I mean, I don't think that would be"
  7. Prompt: "Do you remember the day we had dessert?"

    • Response: "Oh, I remember it. Wow, that was a good day."

Configuration and Fine-tuning Details

QLoRA Parameters
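The exact values from the run are not reproduced above; a representative LoRA configuration for a 7B QLoRA fine-tune (illustrative values only) might look like:

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,              # rank of the LoRA update matrices
    lora_alpha=16,     # scaling factor applied to the LoRA updates
    lora_dropout=0.1,  # dropout on the LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
)
```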

BitsAndBytes Parameters
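Likewise, a typical 4-bit quantization setup for loading the base model (illustrative, not the recorded values):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize base weights to 4 bits
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for compute
    bnb_4bit_use_double_quant=False,
)
```

The config is passed to AutoModelForCausalLM.from_pretrained through its quantization_config argument.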

TrainingArguments Parameters
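A representative set of training arguments for this kind of run (again, illustrative values):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",  # paged optimizer to limit GPU memory spikes
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=True,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    logging_steps=25,
)
```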

Supervised Fine-Tuning (SFT) Parameters
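Finally, trl's SFTTrainer ties the pieces together. The argument names below match trl releases from around the time of this work and may differ in newer versions; model, dataset, peft_config, tokenizer, and training_args refer to the sketches above:

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,                # 4-bit base model loaded with bnb_config
    train_dataset=dataset,      # the Rachel dialogue dataset from above
    peft_config=peft_config,
    dataset_text_field="text",  # column holding the formatted prompts
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_args,
    packing=False,
)
trainer.train()
```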

Evaluation

After comparing the fine-tuned LLaMA model with the other candidates, we found that it offered a promising balance between personalization and output quality, effectively capturing the essence of Rachel Green's character. Despite these strengths, we ultimately chose the Mistral model instead, based on its stronger benchmark performance and our assessment that its answers fit the character's personality better.