nuochenpku / COMEDY

This is the official repository for the paper "Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations".

COMEDY


Overview

This repository provides access to the official benchmarks, code, and checkpoints for the paper "Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations".

This work explores how to build strong long-term conversational dialogue systems without retrieval. To that end, we make the following contributions:

COMEDY vs. Retrieval-Based Approaches

COMEDY adopts a "One-for-All" approach: a single, unified model handles the entire long-term memory dialogue pipeline, from memory generation through memory compression to final response generation.

🤗 Datasets

Our collected Dolphin dataset contains 3 tasks:

Usage & Download

🤗 Dolphin-train Dataset

🤗 Dolphin-DPO Dataset

🤗 Dolphin-Test Dataset

🤗 COMEDY-7B

🤗 COMEDY-13B-DPO

Introduction

This work introduces a novel framework, COmpressive Memory-Enhanced Dialogue sYstems (COMEDY), which eschews traditional retrieval modules and memory databases. Instead, COMEDY adopts a "One-for-All" approach, utilizing a single language model to manage memory generation, compression, and response generation.
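
The "One-for-All" flow described above can be sketched as three calls to the same model. This is only an illustrative sketch: the `generate` callable, the function names, the per-session granularity, and the prompt wording are all assumptions made for this example and are not the prompts or code used in the paper.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to generated text.
# In practice this would wrap one language model; the same callable is reused
# for all three tasks, which is the point of the "One-for-All" design.
Generate = Callable[[str], str]


def summarize_session(generate: Generate, session: List[str]) -> str:
    """Task 1 (memory generation): turn one past session into a memory."""
    prompt = "Summarize the following dialogue session:\n" + "\n".join(session)
    return generate(prompt)


def compress_memories(generate: Generate, memories: List[str]) -> str:
    """Task 2 (compression): merge all memories into one compressed memory."""
    prompt = "Compress these memories into a single memory:\n" + "\n".join(memories)
    return generate(prompt)


def respond(generate: Generate, compressed_memory: str, dialogue: List[str]) -> str:
    """Task 3 (response generation): reply using the compressed memory."""
    prompt = (
        "Memory:\n" + compressed_memory
        + "\nDialogue:\n" + "\n".join(dialogue)
        + "\nResponse:"
    )
    return generate(prompt)


def one_for_all(generate: Generate,
                past_sessions: List[List[str]],
                current_dialogue: List[str]) -> str:
    """Run the full pipeline with a single model and no retrieval step."""
    memories = [summarize_session(generate, s) for s in past_sessions]
    compressed = compress_memories(generate, memories)
    return respond(generate, compressed, current_dialogue)
```

Note that no retriever or memory database appears anywhere in the sketch: the compressed memory is passed directly into the response prompt.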

Installation

Clone this repository and install the required packages:

git clone https://github.com/nuochenpku/COMEDY.git
cd COMEDY
pip install -r requirements.txt

Training and Inference

Our training strategy has two stages: mixed-task training and DPO alignment.

Data Loading

Load the data with the Hugging Face datasets library, for example:

from datasets import load_dataset

dataset = load_dataset("Nuo97/Dolphin-DPO")

Quick Start

To try our model, run:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Nuo97/COMEDY_7B")

# Illustrative placeholder: replace with your own dialogue context
prompt = "Hello! Do you remember what we talked about last time?"
output = pipe(prompt)[0]["generated_text"]
print(output)

Step 1: Mixed-Task Training

bash run_step1.13B.sh <MODEL_PATH> <OUTPUT> <LOG_PATH> <TRN_FN> <DEV_FN>

which consists of the following commands:


#!/bin/bash

# DeepSpeed Team

CURRENT_TIME=$(TZ=UTC-8 date +"%Y-%m-%d-%H.%M.%S")

ZERO_STAGE="--zero_stage 2"

MODEL_PATH=$1
OUTPUT=$2
LOG_PATH=$3

export TOKENIZERS_PARALLELISM=False
# export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"

# Reminder to shuffle train data in advance!
TRN_FN=$4
DEV_FN=$5

# Read from stdin so wc prints only the count, not "<count> <filename>"
TOTAL_SIZE=$(wc -l < "${TRN_FN}")
echo "number of samples in trainset: ${TOTAL_SIZE}"

mkdir -p $OUTPUT/$CURRENT_TIME
deepspeed --include localhost:0,1,2,3,4,5,6,7 \
--master_port 12390 \
training/step1_supervised_finetuning/main.py \
   --model_name_or_path ${MODEL_PATH} \
   --train_data_path ${TRN_FN} \
   --valid_data_path ${DEV_FN} \
   --per_device_train_batch_size 4 \
   --per_device_eval_batch_size 4 \
   --data_output_path $OUTPUT/data \
   --max_seq_len 2048 \
   --learning_rate 1e-5  \
   --weight_decay 0.1 \
   --num_train_epochs 3 \
   --num_train_samples ${TOTAL_SIZE} \
   --gradient_accumulation_steps 1 \
   --lr_scheduler_type cosine \
   --num_warmup_steps 400 \
   --seed 42 \
   ${ZERO_STAGE} \
   --save_interval 2000 \
   --log_interval 100 \
   --eval_interval 1000 \
   --output_dir $OUTPUT/$CURRENT_TIME \
   --gradient_checkpointing \
   --tensorboard_path $LOG_PATH \
   &>$OUTPUT/train.log&
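
The script above reminds you to shuffle the training data in advance and counts samples with `wc -l`, which suggests one sample per line. Under that assumption, a minimal shuffling sketch could look like the following; the function name and the file names in the usage comment are hypothetical and not part of this repository.

```python
import random
from pathlib import Path


def shuffle_lines(src: str, dst: str, seed: int = 42) -> int:
    """Write the lines of `src` to `dst` in random order; return the line count.

    Assumes one training sample per line (e.g. JSON Lines) and that the
    file fits in memory.
    """
    lines = Path(src).read_text(encoding="utf-8").splitlines()
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(lines)
    Path(dst).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


# Example usage (hypothetical file names):
#   n = shuffle_lines("train.jsonl", "train.shuffled.jsonl")
#   print(f"shuffled {n} samples")
```

The shuffled file can then be passed as the training file argument of the Step 1 script.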

Step 2: DPO Alignment

cd training/step2_dpo_training
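bash training_scripts/single_node/run_memory.sh

which takes OUTPUT, ZERO_STAGE, DATA_PATH, and SFT_CKPT as positional arguments and consists of the following commands:

#!/bin/bash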
# Copyright (c) Microsoft Corporation.
# SPDX-License-Identifier: Apache-2.0
# local/xjsonfile/rftV2
# DeepSpeed Team
OUTPUT=$1
ZERO_STAGE=$2
DATA_PATH=$3
SFT_CKPT=$4

if [ "$OUTPUT" == "" ]; then
    OUTPUT=output/compress_memory/13b_v2_dpo_0.01_sft/
fi
if [ "$ZERO_STAGE" == "" ]; then
    ZERO_STAGE=3
fi
mkdir -p $OUTPUT

deepspeed --include localhost:0,1,2,3,4,5,6,7 --master_port=29592 main.py  \
   --data_path $DATA_PATH \
   --data_split 0,10,0 \
   --model_name_or_path $SFT_CKPT \
   --per_device_train_batch_size 1 \
   --per_device_eval_batch_size 2 \
   --max_seq_len 2048 \
   --learning_rate 1e-5  \
   --weight_decay 0. \
   --num_train_epochs 2  \
   --beta 0.01 \
   --gradient_accumulation_steps 1 \
   --lr_scheduler_type cosine \
   --num_warmup_steps 10 \
   --seed 1234 \
   --zero_stage $ZERO_STAGE \
   --deepspeed \
   --add_sft \
   --print_loss \
   --gradient_checkpointing \
   --output_dir $OUTPUT \
   --tensorboard_path $OUTPUT/runs \
   &> $OUTPUT/training.log 
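
For readers unfamiliar with the `--beta` flag, the sketch below shows the standard DPO objective for a single preference pair, where `beta` scales the log-probability margin between the chosen and rejected responses. It assumes the flag plays this usual role; the code in `main.py` may differ, and the effect of `--add_sft` is not covered here. The function and argument names are chosen for this example only.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.01) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under the frozen reference (SFT) model.
    """
    # Implicit rewards: how much more likely the policy makes each response
    # compared with the reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # computed in a numerically stable way for either sign of margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy equals the reference model the margin is zero and the loss is log 2; the loss falls as the policy raises the chosen response's likelihood relative to the rejected one. A small `beta` such as 0.01 makes the loss change slowly with the margin.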

Generation

To replicate the experimental results in our paper, run:

python comedy_test.py

Results

We recruited human annotators to evaluate model performance in terms of scoring and ranking.

Overall Results on Human Scoring

Overall Results on Human Ranking

Citation

Please cite our paper if you use our data, model or code. Please also kindly cite the original dataset papers.

@misc{chen2024compress,
      title={Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations}, 
      author={Nuo Chen and Hongguang Li and Juhua Huang and Baoyuan Wang and Jia Li},
      year={2024},
      eprint={2402.11975},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}