huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

Different number of unique pairs for SetFitTrainer.train and Trainer.hyperparameter_search with the same args #545

Open HexadimensionalerAlp opened 3 months ago

HexadimensionalerAlp commented 3 months ago

Hi, I trained a model with SetFitTrainer and afterwards ran a hyperparameter search with the same parameters as a test. I expected both runs to have the same number of unique pairs and therefore to take roughly the same time. In reality, the direct training approach had 64240 unique pairs and 4015 optimization steps and took 30 minutes per epoch, while the hyperparameter search had 2039350 unique pairs and 127460 optimization steps and was about to take 19 hours.
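
For reference, the reported step counts line up with the pair counts divided by the batch size (a quick sanity check using only the numbers from the logs above):

import math

batch_size = 16

# Direct training: 64240 unique pairs -> 4015 steps per epoch
print(math.ceil(64240 / batch_size))    # 4015

# Hyperparameter search: 2039350 unique pairs -> 127460 steps per epoch
print(math.ceil(2039350 / batch_size))  # 127460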

Training task:

from setfit import SetFitModel, SetFitTrainer
from sentence_transformers.losses import CosineSimilarityLoss

model = SetFitModel.from_pretrained(
    'sentence-transformers/paraphrase-mpnet-base-v2',
    multi_target_strategy='multi-output'
)

trainer = SetFitTrainer(
    model=model,
    train_dataset=datasets['train'],
    eval_dataset=datasets['validation'],
    loss_class=CosineSimilarityLoss,
    batch_size=16,
    num_iterations=20,
    num_epochs=1
)

trainer.train()
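
If I understand the pair sampling correctly, num_iterations=20 generates one positive and one negative pair per sample per iteration, which would match the logged pair count (this formula is my assumption about the sampler, not something I verified in the source):

# Assumption: unique_pairs = 2 * num_iterations * len(train_dataset)
num_iterations = 20
print(64240 // (2 * num_iterations))  # 1606, which would be the size of my training set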

Optimization task:

from typing import Any, Dict, Union

from optuna import Trial
from setfit import SetFitModel, Trainer

def model_init(params: Dict[str, Any]) -> SetFitModel:
    params = params or {}
    max_iter = params.get('max_iter', 100)
    solver = params.get('solver', 'liblinear')
    params = {
        'head_params': {
            'max_iter': max_iter,
            'solver': solver
        }
    }

    # Pass the head parameters through; originally they were built but silently dropped.
    return SetFitModel.from_pretrained(
        'sentence-transformers/paraphrase-mpnet-base-v2',
        multi_target_strategy='multi-output',
        **params
    )

def hp_space(trial: Trial) -> Dict[str, Union[float, int, str]]:
    # Every range is collapsed to a single value so the trial mirrors the direct training run.
    return {
        "body_learning_rate": trial.suggest_float("body_learning_rate", 1e-5, 1e-5, log=True),
        "num_epochs": trial.suggest_int("num_epochs", 1, 1),
        "batch_size": trial.suggest_categorical("batch_size", [16]),
        "seed": trial.suggest_int("seed", 42, 42),
        "max_iter": trial.suggest_int("max_iter", 20, 20),
        "solver": trial.suggest_categorical("solver", ["liblinear"]),
    }

trainer = Trainer(
    train_dataset=datasets['train'],
    eval_dataset=datasets['validation'],
    model_init=model_init
)

best_run = trainer.hyperparameter_search(direction="maximize", hp_space=hp_space, n_trials=1)
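
For completeness, after the search I would carry the best trial over like this (following the hyperparameter search example in the setfit docs):

trainer.apply_hyperparameters(best_run.hyperparameters, final_model=True)
trainer.train()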

The data consists of datasets with the columns 'text' and 'label', where 'text' is a string and 'label' is a tensor of the following format: [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], although that should not be relevant to this issue.
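
To illustrate, a miniature version of that format looks like this (the sample texts are hypothetical; the real data is of course larger):

from datasets import Dataset

# 'text' is a string, 'label' is a one-hot float vector over 11 classes.
datasets = {
    'train': Dataset.from_dict({
        'text': ['first example sentence', 'second example sentence'],
        'label': [
            [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        ],
    }),
    # 'validation' is built the same way
}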

In my understanding, both runs should be comparable in training complexity, as the parameters used are the same. What is the explanation for this behaviour, and is there a way to reproduce the setup of the direct training run within the hyperparameter search?

Thank you in advance!