
FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes
Apache License 2.0


FastFit is a method and a Python package designed to provide fast and accurate few-shot text classification, especially in scenarios with many semantically similar classes. FastFit uses a novel approach that integrates batch contrastive learning with token-level similarity scores. Compared to existing few-shot learning packages such as SetFit and Transformers, or to few-shot prompting of large language models via API calls, FastFit significantly improves multi-class classification performance in both speed and accuracy across FewMany, our newly curated English benchmark, and across multilingual datasets. FastFit demonstrates a 3-20x improvement in training speed, completing training in just a few seconds.
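For intuition, here is a minimal, hypothetical sketch of one common token-level similarity formulation (a late-interaction, max-sim style score between two sets of token embeddings). It is illustrative only and not FastFit's exact implementation:

import torch
import torch.nn.functional as F

def token_level_similarity(query_tokens, class_tokens):
    # query_tokens: (Q, D) and class_tokens: (T, D) token embeddings
    q = F.normalize(query_tokens, dim=-1)
    c = F.normalize(class_tokens, dim=-1)
    sim = q @ c.T  # (Q, T) pairwise cosine similarities
    # For each query token, keep its best match among the class tokens,
    # then average over the query tokens to get a single score.
    return sim.max(dim=1).values.mean()

# Toy usage with random embeddings
score = token_level_similarity(torch.randn(8, 768), torch.randn(12, 768))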

Running the Training Script

Our package provides a convenient command-line tool, train_fastfit, for training text classification models. It exposes a variety of configurable parameters to customize the training process.
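The full list of parameters appears at the end of this document. Since the tool uses a standard argument parser, you should also be able to print the list locally:

train_fastfit --help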

Prerequisites

Before running the training script, ensure you have Python installed along with our package and its dependencies. If you haven't already installed our package, you can do so using pip:

pip install fast-fit
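To quickly verify the installation, you can check that the package's main entry point (used in the examples below) imports cleanly:

python -c "from fastfit import FastFitTrainer"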

Usage

To run the training script with custom configurations, use the train_fastfit command followed by the necessary arguments. These are largely the same as the Hugging Face training arguments, with a few additions specific to FastFit.

Example Command

Here's an example of how to use the train_fastfit command with specific settings:

train_fastfit \
    --model_name_or_path "sentence-transformers/paraphrase-mpnet-base-v2" \
    --train_file $TRAIN_FILE \
    --validation_file $VALIDATION_FILE \
    --output_dir ./tmp/try \
    --overwrite_output_dir \
    --report_to none \
    --label_column_name label \
    --text_column_name text \
    --num_train_epochs 40 \
    --dataloader_drop_last true \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 64 \
    --evaluation_strategy steps \
    --max_text_length 128 \
    --logging_steps 100 \
    --num_repeats 4 \
    --save_strategy no \
    --optim adafactor \
    --clf_loss_factor 0.1 \
    --do_train \
    --fp16 \
    --projection_dim 128
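Here, $TRAIN_FILE and $VALIDATION_FILE should point to your dataset files. The paths and labels below are hypothetical, assuming CSV files whose columns match the --text_column_name and --label_column_name arguments above:

export TRAIN_FILE=data/train.csv            # hypothetical path
export VALIDATION_FILE=data/validation.csv  # hypothetical path

# Illustrative contents of data/train.csv:
# text,label
# "How do I reset my card PIN?",card_pin_reset
# "My transfer has not arrived yet.",transfer_delayed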

Output

Upon execution, train_fastfit will start the training process based on your parameters and output the results, including logs and model checkpoints, to the designated directory.

Training with Python

You can also run training directly from your Python code:

from datasets import load_dataset
from fastfit import FastFitTrainer, sample_dataset

# Load a dataset from the Hugging Face Hub
dataset = load_dataset("FastFit/banking_77")
dataset["validation"] = dataset["test"]

# Down sample the train data for 5-shot training
dataset["train"] = sample_dataset(dataset["train"], label_column="label", num_samples_per_label=5)

trainer = FastFitTrainer(
    model_name_or_path="sentence-transformers/paraphrase-mpnet-base-v2",
    label_column_name="label",
    text_column_name="text",
    num_train_epochs=40,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    max_text_length=128,
    dataloader_drop_last=False,
    num_repeats=4,
    optim="adafactor",
    clf_loss_factor=0.1,
    fp16=True,
    dataset=dataset,
)

model = trainer.train()
results = trainer.evaluate()

print("Accuracy: {:.1f}".format(results["eval_accuracy"] * 100))

Then the model can be saved:

model.save_pretrained("fast-fit")

Then you can use the model for inference:

from fastfit import FastFit
from transformers import AutoTokenizer, pipeline

# Load the saved FastFit model and the base model's tokenizer
model = FastFit.from_pretrained("fast-fit")
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

print(classifier("I love this package!"))
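As with any Transformers text-classification pipeline, the call returns a list of dictionaries with label and score keys (e.g., [{'label': ..., 'score': ...}]), where the label strings come from your training data.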

All available parameters:

Optional Arguments: