iPieter / RobBERT

A Dutch RoBERTa-based language model
https://pieter.ai/robbert/
MIT License

Download link Fairseq v2.0 model is not provided #20

Closed sdblanc closed 3 years ago

sdblanc commented 3 years ago

Lately, I have been experimenting with the masked LM head of the RobBERT model from fairseq (since the Hugging Face Transformers version wasn't available at the time), and I have been getting some unexpected results, such as:

from fairseq.models.roberta import RobertaModel

# Load the fairseq checkpoint from the local model directory.
model = RobertaModel.from_pretrained('../../models/robbert/', checkpoint_file='RobBERT-base.pt')
model.eval()  # disable dropout for deterministic predictions

text = '<mask> is de hoofdstad van België.'
print(model.fill_mask(text, topk=4))

Resulting in: [(0.23022648692131042, 'Canada'), (0.11474426090717316, 'France'), (0.08297888934612274, 'Paris'), (0.07531193643808365, 'Dat')]

Another strange example:

text = 'Ik heb zin in <mask> met frietjes.'
print(model.fill_mask(text, topk=4))

Resulting in: [(0.15896572172641754, ' brood'), (0.11806301772594452, ' chips'), (0.08460015058517456, ' pasta'), (0.071708545088768, ' spaghetti')]
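For context on where these (probability, token) pairs come from: fill_mask applies a softmax over the LM head's logits at the masked position and keeps the top-k candidates. A minimal, self-contained sketch with a toy vocabulary and made-up logits (real models have tens of thousands of tokens):

```python
import torch

# Toy vocabulary and illustrative logits at the <mask> position.
vocab = ["Brussel", "België", "Parijs", "Dat"]
logits = torch.tensor([2.0, 1.0, 0.5, 0.1])

# Softmax turns logits into probabilities; topk keeps the best candidates.
probs = torch.softmax(logits, dim=-1)
top = torch.topk(probs, k=2)
predictions = [(probs[i].item(), vocab[i]) for i in top.indices]
print(predictions)
```

The scores in the outputs above are exactly these softmax probabilities, which is why they sum to (at most) 1 across the vocabulary.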

However, when I tried these examples with the Hugging Face Transformers implementation, I got different (and better) results:

[{'sequence': '<s>Belgiëis de hoofdstad van België.</s>',
  'score': 0.2881818115711212},
 {'sequence': '<s>Brusselis de hoofdstad van België.</s>',
  'score': 0.1142464280128479},
 {'sequence': '<s>Vlaanderenis de hoofdstad van België.</s>',
  'score': 0.09562666714191437},
 {'sequence': '<s>Antwerpenis de hoofdstad van België.</s>',
  'score': 0.06401436030864716},
 {'sequence': '<s>Bis de hoofdstad van België.</s>',
  'score': 0.040388405323028564}]

and

[{'sequence': '<s>Ik heb zin in/met frietjes.</s>',
  'score': 0.26582473516464233},
 {'sequence': '<s>Ik heb zin in...met frietjes.</s>',
  'score': 0.1382495015859604},
 {'sequence': '<s>Ik heb zin in frietjesmet frietjes.</s>',
  'score': 0.1260228306055069},
 {'sequence': '<s>Ik heb zin in kipmet frietjes.</s>',
  'score': 0.043293338268995285},
 {'sequence': '<s>Ik heb zin in frietmet frietjes.</s>',
  'score': 0.03967735171318054}]

When analysing this difference in behaviour, I noticed that the download link for the fairseq model still seems to point to version 1.0 of the model: https://github.com/iPieter/BERDT/releases/download/v1.0/RobBERT-base.pt . Could this explain the difference?
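Not from the thread itself, but one quick way to check which checkpoint file you actually have is to hash it and compare against a checksum for the release asset (the helper below and the existence of a published checksum are assumptions, not part of the RobBERT release notes):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    # Stream the file in 1 MiB chunks so multi-GB checkpoints
    # do not need to fit in memory at once.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage: print(sha256_of("RobBERT-base.pt")) and compare the hex digest
# against the checksum of the release you intended to download.
```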

iPieter commented 3 years ago

Great to see you looking into RobBERT!

It looks like you are using the v1 model. The latest fairseq model is available as a release on GitHub. I will check whether any links still point to the old model; thank you for highlighting this!