Open jpodivin opened 1 month ago
Instead of calculating metrics, the first example of evaluation[1] fails since the tokenizer isn't provided nor inferred.
Exception: Impossible to guess which tokenizer to use. Please provide a PreTrainedTokenizer class or a path/identifier to a pretrained tokenizer.
To replicate, simply try to execute following:
from datasets import load_dataset from evaluate import evaluator from transformers import AutoModelForSequenceClassification, pipeline data = load_dataset("imdb", split="test").shuffle(seed=42).select(range(1000)) task_evaluator = evaluator("text-classification") # 1. Pass a model name or path eval_results = task_evaluator.compute( model_or_pipeline="lvwerra/distilbert-imdb", data=data, label_mapping={"NEGATIVE": 0, "POSITIVE": 1} ) # 2. Pass an instantiated model model = AutoModelForSequenceClassification.from_pretrained("lvwerra/distilbert-imdb") eval_results = task_evaluator.compute( model_or_pipeline=model, data=data, label_mapping={"NEGATIVE": 0, "POSITIVE": 1} )
evaluate===0.4.1
[1]https://huggingface.co/docs/evaluate/base_evaluator
Instead of calculating metrics, the first example of evaluation[1] fails since the tokenizer isn't provided nor inferred.
To replicate, simply try to execute following:
evaluate===0.4.1
[1]https://huggingface.co/docs/evaluate/base_evaluator