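# Imports assumed from usage; the original report omits them. The SetFit
# aliases are a guess based on the names setfit 1.x exports.
from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, sample_dataset
from setfit import Trainer as SetFitTrainer
from setfit import TrainingArguments as SetFitTrainingArguments

dataset = load_dataset("sst2")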
train_dataset = sample_dataset(dataset["train"], label_column="label", num_samples=8)
eval_dataset = dataset["validation"]
model = SetFitModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
training_args = SetFitTrainingArguments(
    loss=CosineSimilarityLoss,
    batch_size=16,
    num_iterations=5,
    num_epochs=1,
    report_to="none",
)
# TODO: Remove this once https://github.com/huggingface/setfit/issues/512
# is resolved. This is a workaround while the deprecation of the
# evaluation_strategy argument is being addressed in the SetFit library.
training_args.eval_strategy = training_args.evaluation_strategy
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric="accuracy",
    column_mapping={"sentence": "text", "label": "label"},
    args=training_args,
)
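
The repro below pins setfit 1.1.0 and installs transformers from the main branch, so it may help to confirm which versions are actually installed before running the snippet. This is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of either library.

```python
from importlib import metadata


def installed_versions(packages=("transformers", "setfit")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print one line per package, suitable for pasting into a bug report.
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

A transformers build installed from git typically reports a development version string, which makes it easy to tell apart from a PyPI release.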
The SetFitTrainer constructor raises an error:
AttributeError: 'CallbackHandler' object has no attribute 'tokenizer'
Repro:
transformers: pip install git+https://github.com/huggingface/transformers
setfit: pip install setfit==1.1.0
code: see the snippet at the top of this report
Stack trace: