huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

No way to control show progress while doing hyperparameter search #442

Closed yahiaelgamal closed 9 months ago

yahiaelgamal commented 10 months ago

The show_progress argument is very useful. However, there is no way to pass it while doing a hyperparameter search. This is very inconvenient for long-running workflows.
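In essence, what is being asked for is a progress flag carried on a configuration object all the way down into each trial. A minimal, hypothetical sketch of that pattern in plain Python (the names `Args`, `train_one_trial`, and `hyperparameter_search` are illustrative, not the SetFit API):

```python
from dataclasses import dataclass

@dataclass
class Args:
    # The flag a caller would like to control from the outside
    show_progress_bar: bool = True

def train_one_trial(args: Args, trial: int) -> str:
    # A real trainer would toggle its tqdm bars based on this flag;
    # here we just report which mode the trial ran in.
    mode = "with progress" if args.show_progress_bar else "silent"
    return f"trial {trial}: {mode}"

def hyperparameter_search(args: Args, n_trials: int) -> list:
    # The same args object reaches every nested trial run,
    # so one flag silences the whole search.
    return [train_one_trial(args, t) for t in range(n_trials)]

print(hyperparameter_search(Args(show_progress_bar=False), 2))
```

This is the shape of the fix below: the flag lives on the arguments object, and the search loop threads it through unchanged.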

tomaarsen commented 10 months ago

Hello!

I can imagine that this is indeed annoying. Luckily, this should be possible from v1.0.0 onwards, which should be released within the next week. You will be able to do:


from setfit import SetFitModel

def model_init(params) -> SetFitModel:
    params = params or {}
    # Head-specific hyperparameters suggested by Optuna; the remaining
    # hyperparameters (learning rate, epochs, batch size) are handled
    # by the Trainer via the search space below.
    max_iter = params.get("max_iter", 100)
    solver = params.get("solver", "liblinear")
    head_params = {
        "max_iter": max_iter,
        "solver": solver,
    }
    return SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5", head_params=head_params)

from optuna import Trial
from typing import Dict, Union

# The search space covers both the embedding body and the classification head
def hp_space(trial: Trial) -> Dict[str, Union[float, int, str]]:
    return {
        "body_learning_rate": trial.suggest_float("body_learning_rate", 1e-6, 1e-3, log=True),
        "num_epochs": trial.suggest_int("num_epochs", 1, 3),
        "batch_size": trial.suggest_categorical("batch_size", [16, 32, 64]),
        "seed": trial.suggest_int("seed", 1, 40),
        "max_iter": trial.suggest_int("max_iter", 50, 300),
        "solver": trial.suggest_categorical("solver", ["newton-cg", "lbfgs", "liblinear"]),
    }

from datasets import load_dataset
from setfit import Trainer, TrainingArguments, sample_dataset

dataset = load_dataset("SetFit/emotion")
train_dataset = sample_dataset(dataset["train"], label_column="label", num_samples=8)
test_dataset = dataset["test"]

### Relevant for issue #442 ###
args = TrainingArguments(
    show_progress_bar=False,
)
###############################

trainer = Trainer(
    args=args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    model_init=model_init,
)
best_run = trainer.hyperparameter_search(direction="maximize", hp_space=hp_space, n_trials=10)
print(best_run)

trainer.apply_hyperparameters(best_run.hyperparameters, final_model=True)
trainer.train()

metrics = trainer.evaluate()
print(metrics)