huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Weird output when using unexpected model type for pipelines #5678

Closed JetRunner closed 4 years ago

JetRunner commented 4 years ago

🐛 Bug

Information

Model I am using (Bert, XLNet ...): CodeBERT

Language I am using the model on (English, Chinese ...): Code

The problem arises when using:

The task I am working on is:

To reproduce

Steps to reproduce the behavior:

This is the correct code, with the expected output:

from transformers import RobertaTokenizer, RobertaForMaskedLM, pipeline

# Load the checkpoint with its masked-LM head, plus the matching tokenizer.
model = RobertaForMaskedLM.from_pretrained('microsoft/codebert-base-mlm')
tokenizer = RobertaTokenizer.from_pretrained('microsoft/codebert-base-mlm')

CODE = "if (x is not None) <mask> (x>1)"
fill_mask = pipeline('fill-mask', model=model, tokenizer=tokenizer)

outputs = fill_mask(CODE)
print(outputs)

Output:

[{'sequence': '<s>if (x is not None) and(x>1)</s>', 'score': 0.7236990928649902, 'token': 8, 'token_str': 'Ġand'}, {'sequence': '<s>if (x is not None) &(x>1)</s>', 'score': 0.10633797943592072, 'token': 359, 'token_str': 'Ġ&'}, {'sequence': '<s>if (x is not None)and(x>1)</s>', 'score': 0.021604137495160103, 'token': 463, 'token_str': 'and'}, {'sequence': '<s>if (x is not None) AND(x>1)</s>', 'score': 0.02122747339308262, 'token': 4248, 'token_str': 'ĠAND'}, {'sequence': '<s>if (x is not None) if(x>1)</s>', 'score': 0.016991324722766876, 'token': 114, 'token_str': 'Ġif'}]

But if we load the model with RobertaModel and proceed with the same pipeline:

from transformers import RobertaTokenizer, RobertaModel, pipeline

# Same checkpoint, but loaded as the bare encoder (RobertaModel), with no masked-LM head.
model = RobertaModel.from_pretrained('microsoft/codebert-base-mlm')
tokenizer = RobertaTokenizer.from_pretrained('microsoft/codebert-base-mlm')

CODE = "if (x is not None) <mask> (x>1)"
fill_mask = pipeline('fill-mask', model=model, tokenizer=tokenizer)

outputs = fill_mask(CODE)
print(outputs)

Then the output makes no sense at all:

[{'sequence': '<s>if (x is not None) real(x>1)</s>', 'score': 0.9961338043212891, 'token': 588, 'token_str': 'Ġreal'}, {'sequence': '<s>if (x is not None)n(x>1)</s>', 'score': 1.70519979292294e-05, 'token': 282, 'token_str': 'n'}, {'sequence': '<s>if (x is not None) security(x>1)</s>', 'score': 1.5919968063826673e-05, 'token': 573, 'token_str': 'Ġsecurity'}, {'sequence': '<s>if (x is not None) Saturday(x>1)</s>', 'score': 1.5472969607799314e-05, 'token': 378, 'token_str': 'ĠSaturday'}, {'sequence': '<s>if (x is not None) here(x>1)</s>', 'score': 1.543204598419834e-05, 'token': 259, 'token_str': 'Ġhere'}]
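A plausible explanation: RobertaModel is the bare encoder with no masked-LM head, so the pipeline ends up ranking 768-dimensional hidden states as if they were vocabulary logits (consistent with this, every token id in the bad output above is below 768). A minimal sketch of the shape mismatch, assuming a reasonably recent transformers version:

import torch
from transformers import RobertaTokenizer, RobertaModel, RobertaForMaskedLM

# Compare the first output tensor of the two model classes on the same input.
tokenizer = RobertaTokenizer.from_pretrained('microsoft/codebert-base-mlm')
inputs = tokenizer("if (x is not None) <mask> (x>1)", return_tensors='pt')

base = RobertaModel.from_pretrained('microsoft/codebert-base-mlm')       # bare encoder
mlm = RobertaForMaskedLM.from_pretrained('microsoft/codebert-base-mlm')  # encoder + LM head

with torch.no_grad():
    base_out = base(**inputs)[0]  # last hidden states: (1, seq_len, hidden_size=768)
    mlm_out = mlm(**inputs)[0]    # vocabulary logits:  (1, seq_len, vocab_size=50265)

print(base_out.shape, mlm_out.shape)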
JetRunner commented 4 years ago

I'm working on a fix now.

ashutosh-dwivedi-e3502 commented 4 years ago

This bug occurs irrespective of the transformers version; I checked 2.8.0, 2.9.0, and 3.0.1.

The pipeline returns incorrect output only when model and tokenizer instances are used to initialize it.

If you instead pass the model and tokenizer parameters as path strings, the output is fine. The following snippet demonstrates this:

from transformers import RobertaModel, RobertaTokenizer, pipeline

MODEL_PATH = 'roberta-base'

# Instances loaded explicitly: RobertaModel is the bare encoder, without the LM head.
model = RobertaModel.from_pretrained(MODEL_PATH)
tokenizer = RobertaTokenizer.from_pretrained(MODEL_PATH)

# Pipeline built from path strings: the task decides which model class gets loaded.
fill_from_path = pipeline(
    'fill-mask',
    model=MODEL_PATH,
    tokenizer=MODEL_PATH
)

# Pipeline built from the instances above.
fill_from_model = pipeline(
    'fill-mask',
    model=model,
    tokenizer=tokenizer
)

seq = 'I found a bug in <mask>'
print(fill_from_path(seq))
print(fill_from_model(seq))

The output is below. The first result, where we passed the model path, is fine; the second, where we passed the model and tokenizer instances, is wrong.

[{'sequence': '<s> I found a bug in Firefox</s>', 'score': 0.051126863807439804, 'token': 30675}, {'sequence': '<s> I found a bug in Gmail</s>', 'score': 0.027283240109682083, 'token': 29004}, {'sequence': '<s> I found a bug in Photoshop</s>', 'score': 0.024683473631739616, 'token': 35197}, {'sequence': '<s> I found a bug in Java</s>', 'score': 0.021543316543102264, 'token': 24549}, {'sequence': '<s> I found a bug in Windows</s>', 'score': 0.018485287204384804, 'token': 6039}]
[{'sequence': '<s> I found a bug in real</s>', 'score': 0.9705745577812195, 'token': 588}, {'sequence': '<s> I found a bug in here</s>', 'score': 0.00013350950030144304, 'token': 259}, {'sequence': '<s> I found a bug in within</s>', 'score': 6.807789031881839e-05, 'token': 624}, {'sequence': '<s> I found a bug in San</s>', 'score': 6.468965875683352e-05, 'token': 764}, {'sequence': '<s> I found a bug in 2015</s>', 'score': 6.282260437728837e-05, 'token': 570}]
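If that is what is going on, the two pipelines above should be holding different model classes: given a path string, the fill-mask task presumably resolves the checkpoint with its own masked-LM auto class, whereas a passed-in instance is used exactly as given. Continuing from the snippet above, a quick check (the Pipeline object exposes the loaded model as .model):

# Which model class did each pipeline end up with?
print(type(fill_from_path.model).__name__)   # expected: RobertaForMaskedLM (resolved from the path for the task)
print(type(fill_from_model.model).__name__)  # expected: RobertaModel (used as passed in, no LM head)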
julien-c commented 4 years ago

@ashutosh-dwivedi-e3502 Try changing this line, model = RobertaModel.from_pretrained(MODEL_PATH), to model = AutoModelForMaskedLM.from_pretrained(MODEL_PATH)
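Applied to the snippet above, the suggested change would look roughly like this (a sketch; RobertaForMaskedLM would work just as well, the point being to load a class that carries the masked-LM head):

from transformers import AutoModelForMaskedLM, RobertaTokenizer, pipeline

MODEL_PATH = 'roberta-base'

# Load a model class that includes the masked-LM head, as suggested above.
model = AutoModelForMaskedLM.from_pretrained(MODEL_PATH)
tokenizer = RobertaTokenizer.from_pretrained(MODEL_PATH)

fill_from_model = pipeline('fill-mask', model=model, tokenizer=tokenizer)
print(fill_from_model('I found a bug in <mask>'))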

ashutosh-dwivedi-e3502 commented 4 years ago

@julien-c That fixes it. Output with model = AutoModelForMaskedLM.from_pretrained(MODEL_PATH) is:

Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/Users/asdwivedi/.virtualenvs/test-demo-TklxO9OB/lib/python3.8/site-packages/transformers/modeling_auto.py:796: FutureWarning: The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.
  warnings.warn(
Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[{'sequence': '<s>I found a bug in Firefox</s>', 'score': 0.05709619075059891, 'token': 30675, 'token_str': 'ĠFirefox'}, {'sequence': '<s>I found a bug in Gmail</s>', 'score': 0.03430333733558655, 'token': 29004, 'token_str': 'ĠGmail'}, {'sequence': '<s>I found a bug in WordPress</s>', 'score': 0.028388172388076782, 'token': 33398, 'token_str': 'ĠWordPress'}, {'sequence': '<s>I found a bug in Java</s>', 'score': 0.02571324072778225, 'token': 24549, 'token_str': 'ĠJava'}, {'sequence': '<s>I found a bug in Python</s>', 'score': 0.01953786611557007, 'token': 31886, 'token_str': 'ĠPython'}]
[{'sequence': '<s>I found a bug in Firefox</s>', 'score': 0.05709619075059891, 'token': 30675, 'token_str': 'ĠFirefox'}, {'sequence': '<s>I found a bug in Gmail</s>', 'score': 0.03430333733558655, 'token': 29004, 'token_str': 'ĠGmail'}, {'sequence': '<s>I found a bug in WordPress</s>', 'score': 0.028388172388076782, 'token': 33398, 'token_str': 'ĠWordPress'}, {'sequence': '<s>I found a bug in Java</s>', 'score': 0.02571324072778225, 'token': 24549, 'token_str': 'ĠJava'}, {'sequence': '<s>I found a bug in Python</s>', 'score': 0.01953786611557007, 'token': 31886, 'token_str': 'ĠPython'}]