turmeric-blend closed this issue 3 years ago
Replace AutoModel with AutoModelForSequenceClassification. The former won't add the sequence classification head, i.e. it will use BartModel instead of BartForSequenceClassification, so the pipeline is trying to use just the outputs of the encoder instead of the NLI predictions in your snippet.
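To illustrate why the classification head matters, here is a rough pure-Python sketch of how three-way NLI logits can be turned into zero-shot scores. It is a simplification, not the pipeline's actual code. The label order (contradiction, neutral, entailment) is assumed to match the facebook/bart-large-mnli config, and the logit values and the "cooking" and "dancing" labels are made up for illustration. Without the head there are no such logits to work with, only hidden states.

```python
import math

# Assumed label order of the NLI head: contradiction, neutral, entailment.
CONTRADICTION, NEUTRAL, ENTAILMENT = 0, 1, 2


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_scores(nli_logits, multi_label=False):
    """Map {candidate label: [3 NLI logits]} to {candidate label: score}."""
    labels = list(nli_logits)
    if multi_label:
        # Each label is scored independently: entailment vs. contradiction.
        return {
            label: softmax(
                [nli_logits[label][CONTRADICTION], nli_logits[label][ENTAILMENT]]
            )[1]
            for label in labels
        }
    # Single-label case: softmax over the entailment logits of all labels.
    entailment = [nli_logits[label][ENTAILMENT] for label in labels]
    return dict(zip(labels, softmax(entailment)))


# Hypothetical logits, one [contradiction, neutral, entailment] triple per label.
example_logits = {
    "travelling": [-2.0, 0.5, 4.0],
    "cooking": [3.0, 0.0, -2.5],
    "dancing": [2.5, 0.2, -2.0],
}

scores = zero_shot_scores(example_logits)
multi_scores = zero_shot_scores(example_logits, multi_label=True)
print(scores)
print(multi_scores)
```

With AutoModel, the pipeline receives hidden states rather than these three logits per label, which would explain the poor scores reported in the issue.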
@joeddav that fixed it, thanks!
Have the same problem:
conda environment: Python 3.7.9
pip3 install torch==1.6
pip3 install transformers
Running
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")
Results in message:
Some weights of the model checkpoint at facebook/bart-large-mnli were not used when initializing BartForSequenceClassification: ['model.encoder.version', 'model.decoder.version']
- This IS expected if you are initializing BartForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BartForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
@turmeric-blend: How is my setup different from yours?
@gustavengstrom Actually, the warning message was still there after the fix, but the scores from running on my local machine were consistent with the online demo.
@joeddav Any idea why the warning message is still there?
Yeah that warning isn't a concern. It's just letting you know that some of the parameters checkpointed in the pretrained model were not able to be matched with the model class, but in this case it's just a couple of meta-fields (encoder/decoder version), so your weights should be matched up fine.
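As a rough illustration of the matching described above, the sketch below compares checkpoint keys against model keys in plain Python. It is not the library's loading code. The two version keys are taken from the warning message; the other key names and the helper names are illustrative assumptions.

```python
def unused_checkpoint_keys(checkpoint_keys, model_keys):
    """Keys stored in the checkpoint that the model class has no slot for."""
    known = set(model_keys)
    return [key for key in checkpoint_keys if key not in known]


def missing_model_keys(checkpoint_keys, model_keys):
    """Model parameters that the checkpoint does not provide."""
    provided = set(checkpoint_keys)
    return [key for key in model_keys if key not in provided]


# The two version keys come from the warning; the rest are illustrative.
checkpoint_keys = [
    "model.encoder.version",
    "model.decoder.version",
    "model.shared.weight",
    "classification_head.dense.weight",
]
model_keys = [
    "model.shared.weight",
    "classification_head.dense.weight",
]

unused = unused_checkpoint_keys(checkpoint_keys, model_keys)
missing = missing_model_keys(checkpoint_keys, model_keys)
print("unused:", unused)
print("missing:", missing)
```

In this sketch only the version meta-fields go unused and nothing is missing, which is the harmless case. A non-empty missing list would be the one to worry about, since those parameters would not be loaded from the checkpoint.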
Environment info
transformers version: 3.4.0
Who can help
@sshleifer
Information
Model I am using (Bert, XLNet ...): facebook/bart-large-mnli
The problem arises when using:
The tasks I am working on is:
To reproduce
Steps to reproduce the behavior:
First I tried the hosted demo online at Hugging Face, which gives me a very high score of 0.99 for travelling (as expected):
Then I ran the code on my local machine, which returns very different (poor) scores for all labels:
I got this warning message when initializing the model:
from transformers import AutoModel
model = AutoModel.from_pretrained("facebook/bart-large-mnli")
Expected behavior
The scores from the code on my local machine should be quite similar to those from the online demo.