Closed ravisurdhar closed 4 years ago
Yeah, that warning isn't the issue. Those outputs are definitely wrong but I'm not sure why without code. You should just be able to do this:
```
from transformers import pipeline

classifier = pipeline("zero-shot-classification")

sequence = "Who are you voting for in 2020?"
candidate_labels = ['2020', 'elections', 'foreign policy', 'business', 'Europe', 'politics', 'outdoor recreation']
hypothesis_template = "This text is about {}."

classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template, multi_class=True)
```
Check out this notebook for more examples.
Weird, that code runs fine for me too, and I get the expected result. I'm not sure how it's significantly different from the code in my original post (it's in the collapsible section labeled "Code" above the last sentence). The only difference seems to be that I included these lines, which I basically copy/pasted from your demo app:
```
model_ids = {'Bart MNLI': 'facebook/bart-large-mnli'}

def load_models():
    return {id: AutoModelForSequenceClassification.from_pretrained(id) for id in model_ids.values()}

models = load_models()
```
My only other thought is to make sure you're using the right tokenizer. If you pass an instantiated model object (rather than a string model identifier) to the pipeline factory, it can't infer which tokenizer to use, so you have to pass `tokenizer=tokenizer` in addition to the model. Otherwise I'll need your full code to spot the issue.
This is my entire code:
```
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model_ids = {'Bart MNLI': 'facebook/bart-large-mnli'}
device = -1

def load_models():
    return {id: AutoModelForSequenceClassification.from_pretrained(id) for id in model_ids.values()}

models = load_models()

def load_tokenizer(tok_id):
    return AutoTokenizer.from_pretrained(tok_id)

hypothesis_template = 'This text is about {{}}.'

def get_most_likely(nli_model_id, sequence, labels, hypothesis_template, multi_class=True):
    classifier = pipeline('zero-shot-classification',
                          model=models[nli_model_id],
                          tokenizer=load_tokenizer(nli_model_id),
                          device=device)
    outputs = classifier(sequence, labels, hypothesis_template, multi_class)
    return outputs['labels'], outputs['scores']

test_seq = 'Who are you voting for in 2020?'
test_labels = [x.strip() for x in 'foreign policy, Europe, elections, business, 2020, outdoor recreation, politics'.strip().split(',')]

get_most_likely('facebook/bart-large-mnli', test_seq, test_labels, hypothesis_template)
```
And the output is:
```
(['2020',
  'elections',
  'foreign policy',
  'business',
  'Europe',
  'politics',
  'outdoor recreation'],
 [0.021523168310523033,
  0.021523168310523033,
  0.021523168310523033,
  0.021523164585232735,
  0.021523164585232735,
  0.021523121744394302,
  0.021523121744394302])
```
I seem to have found the issue, though I'm having a hard time understanding why it's the culprit: in my code I have `hypothesis_template = 'This text is about {{}}.'`, which I copied from line 34 of your code. However, in your first comment in this issue you have `hypothesis_template = "This text is about {}."`, with a single set of braces. If I change my code above to use a single set of braces, I get the expected results. Any idea why I'm seeing this behavior? I wouldn't expect the `hypothesis_template` formatting to affect the results of the classifier...
Looks like you copied some unrendered markdown. `{{}}` should just be `{}` — it's doubled in that source only to escape the braces.
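To see why this silently breaks the classifier, here's a quick sketch using plain `str.format` (no transformers needed): with doubled braces there is no placeholder at all, so the label is never inserted and every candidate label produces the identical hypothesis string, which is why all the scores come out the same.

```
template_escaped = 'This text is about {{}}.'  # doubled braces: literal {}, no placeholder
template_plain = 'This text is about {}.'      # single braces: one placeholder

# The doubled braces render as literal braces and the label is ignored.
print(template_escaped.format('politics'))  # → This text is about {}.

# The single-brace template actually inserts the label.
print(template_plain.format('politics'))    # → This text is about politics.
```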
I read through the source code for the ZeroShotClassificationPipeline and now it finally makes sense! My misunderstanding was that the hypothesis template was an output, i.e. that the model would return a string in the format you specified, with the highest-probability label inserted in the `{}`. But now I understand that it's actually an input: for each label in the list, the classifier returns the probability that the hypothesis built from that label is true given the sequence you supply.
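In rough outline (a simplified sketch, not the pipeline's actual internals), the pipeline formats one hypothesis per candidate label and then scores each (premise, hypothesis) pair with the NLI model:

```
sequence = 'Who are you voting for in 2020?'
labels = ['politics', 'elections']
hypothesis_template = 'This text is about {}.'

# One NLI input pair per candidate label; the model then scores how
# strongly the premise (the sequence) entails each hypothesis.
pairs = [(sequence, hypothesis_template.format(label)) for label in labels]
for premise, hypothesis in pairs:
    print(premise, '=>', hypothesis)
```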
Thanks for helping me out! I'll go ahead and close this issue.
Hi, I'm trying to replicate the core functionality of your live demo app in a Jupyter notebook that strips out all of the Streamlit code, but I'm having trouble replicating the results.
Example input:
```
'Who are you voting for in 2020?'
```
Actual output:
```
(['2020', 'elections', 'foreign policy', 'business', 'Europe', 'politics', 'outdoor recreation'], [0.021523168310523033, 0.021523168310523033, 0.021523168310523033, 0.021523164585232735, 0.021523164585232735, 0.021523121744394302, 0.021523121744394302])
```
As you can see, the probabilities for each label are virtually identical and all extremely low, while the live demo has the probabilities for the first three labels above 95% and the others at 0.4%. It seems like the model is somehow being loaded in an untrained state, and I'm getting the following error when I try to load either the `facebook/bart-large-mnli` model or the `joeddav/bart-large-mnli-yahoo-answers` model: However, some googling led me to this issue, which makes it sound like this error is expected. Not sure why I'm getting the results I'm getting, though. Any help would be greatly appreciated!
Code
```
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model_ids = {'Bart MNLI': 'facebook/bart-large-mnli'}
device = -1

def load_models():
    return {id: AutoModelForSequenceClassification.from_pretrained(id) for id in model_ids.values()}

models = load_models()

def load_tokenizer(tok_id):
    return AutoTokenizer.from_pretrained(tok_id)

hypothesis_template = 'This text is about {{}}.'

def get_most_likely(nli_model_id, sequence, labels, hypothesis_template, multi_class=True):
    classifier = pipeline('zero-shot-classification',
                          model=models[nli_model_id],
                          tokenizer=load_tokenizer(nli_model_id),
                          device=device)
    outputs = classifier(sequence, labels, hypothesis_template, multi_class)
    return outputs['labels'], outputs['scores']

test_seq = 'Who are you voting for in 2020?'
test_labels = [x.strip() for x in 'foreign policy, Europe, elections, business, 2020, outdoor recreation, politics'.strip().split(',')]

get_most_likely('facebook/bart-large-mnli', test_seq, test_labels, hypothesis_template)
```
I'm using Python 3.6, torch 1.6.0 (in non-CUDA mode), and transformers 3.1.0 on OS X 10.15.6.