Closed: JetRunner closed this issue 4 years ago.
I'm working on a fix now.
This bug occurs irrespective of the transformers version; I checked it on 2.8.0, 2.9.0, and 3.0.1.
The pipeline returns incorrect output only when model and tokenizer objects are used to initialize it. If you instead pass the model and tokenizer parameters as paths in the form of strings, the output is fine. The following snippet demonstrates this:
from transformers import RobertaModel, RobertaTokenizer, RobertaConfig
from transformers import pipeline
MODEL_PATH = 'roberta-base'
model = RobertaModel.from_pretrained(MODEL_PATH)
tokenizer = RobertaTokenizer.from_pretrained(MODEL_PATH)
fill_from_path = pipeline(
    'fill-mask',
    model=MODEL_PATH,
    tokenizer=MODEL_PATH
)
fill_from_model = pipeline(
    'fill-mask',
    model=model,
    tokenizer=tokenizer
)
seq = 'I found a bug in <mask>'
print(fill_from_path(seq))
print(fill_from_model(seq))
The output is the following. You can see that the first output, where we used the model path, is fine, but the second output, where we provided the model and tokenizer objects, has a problem.
[{'sequence': '<s> I found a bug in Firefox</s>', 'score': 0.051126863807439804, 'token': 30675}, {'sequence': '<s> I found a bug in Gmail</s>', 'score': 0.027283240109682083, 'token': 29004}, {'sequence': '<s> I found a bug in Photoshop</s>', 'score': 0.024683473631739616, 'token': 35197}, {'sequence': '<s> I found a bug in Java</s>', 'score': 0.021543316543102264, 'token': 24549}, {'sequence': '<s> I found a bug in Windows</s>', 'score': 0.018485287204384804, 'token': 6039}]
[{'sequence': '<s> I found a bug in real</s>', 'score': 0.9705745577812195, 'token': 588}, {'sequence': '<s> I found a bug in here</s>', 'score': 0.00013350950030144304, 'token': 259}, {'sequence': '<s> I found a bug in within</s>', 'score': 6.807789031881839e-05, 'token': 624}, {'sequence': '<s> I found a bug in San</s>', 'score': 6.468965875683352e-05, 'token': 764}, {'sequence': '<s> I found a bug in 2015</s>', 'score': 6.282260437728837e-05, 'token': 570}]
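(A note on what is likely going wrong, with a small diagnostic sketch that is not part of the original thread: RobertaModel is the bare encoder without a language-modeling head, so its first output is a hidden-state tensor of width 768 rather than vocab-size logits. The fill-mask pipeline in this version appears to softmax over whatever the model's first output is, which would also explain why every token id in the bad output above is below 768. The sketch below assumes the roberta-base checkpoint.)

from transformers import RobertaModel, RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
inputs = tokenizer('I found a bug in <mask>', return_tensors='pt')

# Bare encoder: the first output is one 768-dimensional hidden state per token.
base = RobertaModel.from_pretrained('roberta-base')
print(base(**inputs)[0].shape)   # e.g. torch.Size([1, 9, 768])

# Encoder plus LM head: the first output is vocab-size logits (50265 for
# roberta-base), which is what the fill-mask pipeline ranks tokens over.
mlm = RobertaForMaskedLM.from_pretrained('roberta-base')
print(mlm(**inputs)[0].shape)    # e.g. torch.Size([1, 9, 50265])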
@ashutosh-dwivedi-e3502 Try changing this line model = RobertaModel.from_pretrained(MODEL_PATH)
into model = AutoModelForMaskedLM.from_pretrained(MODEL_PATH)
@JuhaKiili That fixes it. Output with model = AutoModelForMaskedLM.from_pretrained(MODEL_PATH) is:
Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/Users/asdwivedi/.virtualenvs/test-demo-TklxO9OB/lib/python3.8/site-packages/transformers/modeling_auto.py:796: FutureWarning: The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.
warnings.warn(
Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[{'sequence': '<s>I found a bug in Firefox</s>', 'score': 0.05709619075059891, 'token': 30675, 'token_str': 'ĠFirefox'}, {'sequence': '<s>I found a bug in Gmail</s>', 'score': 0.03430333733558655, 'token': 29004, 'token_str': 'ĠGmail'}, {'sequence': '<s>I found a bug in WordPress</s>', 'score': 0.028388172388076782, 'token': 33398, 'token_str': 'ĠWordPress'}, {'sequence': '<s>I found a bug in Java</s>', 'score': 0.02571324072778225, 'token': 24549, 'token_str': 'ĠJava'}, {'sequence': '<s>I found a bug in Python</s>', 'score': 0.01953786611557007, 'token': 31886, 'token_str': 'ĠPython'}]
[{'sequence': '<s>I found a bug in Firefox</s>', 'score': 0.05709619075059891, 'token': 30675, 'token_str': 'ĠFirefox'}, {'sequence': '<s>I found a bug in Gmail</s>', 'score': 0.03430333733558655, 'token': 29004, 'token_str': 'ĠGmail'}, {'sequence': '<s>I found a bug in WordPress</s>', 'score': 0.028388172388076782, 'token': 33398, 'token_str': 'ĠWordPress'}, {'sequence': '<s>I found a bug in Java</s>', 'score': 0.02571324072778225, 'token': 24549, 'token_str': 'ĠJava'}, {'sequence': '<s>I found a bug in Python</s>', 'score': 0.01953786611557007, 'token': 31886, 'token_str': 'ĠPython'}]
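(For completeness, a consolidated version of the corrected snippet; this is a sketch following the code in the first comment, with the model class as the only change, and is not text from the original thread.)

from transformers import AutoModelForMaskedLM, RobertaTokenizer
from transformers import pipeline

MODEL_PATH = 'roberta-base'

# AutoModelForMaskedLM resolves to RobertaForMaskedLM here, i.e. the encoder
# plus the LM head that the fill-mask pipeline expects.
model = AutoModelForMaskedLM.from_pretrained(MODEL_PATH)
tokenizer = RobertaTokenizer.from_pretrained(MODEL_PATH)

fill_from_model = pipeline(
    'fill-mask',
    model=model,
    tokenizer=tokenizer
)

print(fill_from_model('I found a bug in <mask>'))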
🐛 Bug
Information
Model I am using (Bert, XLNet ...): CodeBERT
Language I am using the model on (English, Chinese ...): Code
The problem arises when using:
The tasks I am working on is:
To reproduce
Steps to reproduce the behavior:
This is the right code and right outputs:
Output:
But if we load the model with RobertaModel and proceed with the same pipeline, then the output makes no sense at all:
transformers version: 3.0.1
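(The code and output blocks from the original report did not survive above. As a rough reconstruction only, assuming the microsoft/codebert-base checkpoint for the CodeBERT model named in this issue and the same pattern as the snippet in the first comment, the reproduction presumably looked roughly like this; the example sentence is a placeholder, not the reporter's original input.)

from transformers import RobertaModel, RobertaTokenizer, pipeline

MODEL_PATH = 'microsoft/codebert-base'  # assumed checkpoint for CodeBERT

tokenizer = RobertaTokenizer.from_pretrained(MODEL_PATH)

# Loading by path: the pipeline builds a model with an LM head, so the
# fill-mask output looks sensible (the "right code and right outputs" above).
fill_from_path = pipeline('fill-mask', model=MODEL_PATH, tokenizer=MODEL_PATH)

# Loading a bare RobertaModel and handing it to the same pipeline reproduces
# the output that "makes no sense at all".
model = RobertaModel.from_pretrained(MODEL_PATH)
fill_from_model = pipeline('fill-mask', model=model, tokenizer=tokenizer)

seq = 'I found a bug in <mask>'  # placeholder; the original sequence was not preserved
print(fill_from_path(seq))
print(fill_from_model(seq))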