The transformers pipeline runs each input sequentially with batch size 1, and in the post-processing step, when using `top_k`, it requires a 1D tensor to iterate over. This PR flattens the model's output when the batch size is one. I've tested this:
```python
from fastfit import FastFit
from transformers import AutoTokenizer, pipeline

model = FastFit.from_pretrained("../fast-fit")
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# This worked before and still works
print(classifier("I love this package!"))

# This failed before
print(classifier(["When do you think my card will arrive in Sweden?", "Give me back my money!"], top_k=10))

# This still works
x = tokenizer(["Hi", "Hello"], return_tensors="pt")
print(model(x["input_ids"], x["attention_mask"]))

# This will be different
x = tokenizer(["Hi"], return_tensors="pt")
print(model(x["input_ids"], x["attention_mask"]))  # This is a 1D tensor now
```
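The change itself is small. A minimal sketch of the flattening logic (the helper name `maybe_flatten` is mine, for illustration only, not code from this PR):

```python
import torch

def maybe_flatten(logits: torch.Tensor) -> torch.Tensor:
    # If the output is a (1, num_labels) batch of one, squeeze away the
    # batch dimension so the pipeline's postprocess step gets a 1D tensor
    # it can iterate over. Larger batches pass through unchanged.
    if logits.dim() == 2 and logits.size(0) == 1:
        return logits.squeeze(0)
    return logits

# Batch of one: (1, 3) -> (3,)
print(maybe_flatten(torch.tensor([[0.1, 0.7, 0.2]])).shape)  # torch.Size([3])

# Batch of two: left as-is
print(maybe_flatten(torch.tensor([[0.1, 0.7, 0.2], [0.3, 0.3, 0.4]])).shape)  # torch.Size([2, 3])
```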
I wasn't able to find another implementation of a pipeline-compatible model that I could piggyback off of. Do you see any potential problems with changing the shape of the output for batch-size-one inputs? Am I missing an obvious solution?
This is the problematic code:
```python
# From transformers/pipelines/text_classification.py in TextClassificationPipeline.postprocess
# score needs to be 1D
dict_scores = [
    {"label": self.model.config.id2label[i], "score": score.item()}
    for i, score in enumerate(scores)
]
```
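For context, a small standalone demonstration (independent of FastFit) of why that loop needs a 1D tensor: iterating a `(1, num_labels)` tensor yields a single multi-element row, and `.item()` raises on anything but a one-element tensor.

```python
import torch

scores_2d = torch.tensor([[0.1, 0.7, 0.2]])  # shape (1, 3): batch of one
scores_1d = scores_2d.squeeze(0)             # shape (3,): what postprocess expects

# Iterating the 1D tensor yields one-element scalars, so .item() works:
print([round(s.item(), 1) for s in scores_1d])  # [0.1, 0.7, 0.2]

# Iterating the 2D tensor yields one row of three elements,
# and .item() raises RuntimeError on a multi-element tensor:
try:
    [s.item() for s in scores_2d]
except RuntimeError as e:
    print("fails:", e)
```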
Closes #4