cl-tohoku / bert-japanese

BERT models for Japanese text.
Apache License 2.0

BertJapaneseTokenizer can find 'cl-tohoku/bert-base-japanese-whole-word-masking' but BertModel cannot #18

Closed: wailoktam closed this issue 3 years ago

wailoktam commented 4 years ago

During preprocessing, the following line runs without any problem:

    self.tokenizer = BertJapaneseTokenizer.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')

However, during training, I get the following error:

Model name 'cl-tohoku/bert-base-japanese-whole-word-masking' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc).

from this call:

BertModel.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')

Any idea?

In both cases, I installed pytorch-transformers with pip. Thanks in advance for your help.
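For reference, the short model-name list in that error is consistent with the resolution further down: the older pytorch_transformers package only resolves its built-in bert-* checkpoints, whereas Hub-hosted names such as cl-tohoku/... need the newer transformers package. A minimal sketch of the two loading calls, assuming transformers is installed (recent versions of BertJapaneseTokenizer may also require the fugashi and ipadic packages):

    from transformers import BertJapaneseTokenizer, BertModel

    # With the transformers package (rather than pytorch-transformers), community
    # model names hosted on the Hub resolve for both the tokenizer and the model.
    tokenizer = BertJapaneseTokenizer.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')
    model = BertModel.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')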

wailoktam commented 4 years ago

I tried loading the model with BertForMaskedLM instead:

self.model = BertForMaskedLM.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')

The model can be found but I get this error:

    --> 722 result = self.forward(*input, **kwargs)
        723 for hook in itertools.chain(
        724     _global_forward_hooks.values(),

TypeError: forward() got multiple values for argument 'attention_mask'

singletongue commented 4 years ago

Could you please check if your installation of transformers is up to date?

If so, would you please clear the cache of downloaded tokenizers and models and try again? It should be stored in ~/.cache/torch/transformers.
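If it helps, here is a minimal sketch of clearing that cache directory from Python (the path is the one mentioned above; newer transformers versions may cache under ~/.cache/huggingface instead):

    import shutil
    from pathlib import Path

    # Cache location used by older transformers versions, as noted above.
    cache_dir = Path.home() / ".cache" / "torch" / "transformers"

    if cache_dir.exists():
        shutil.rmtree(cache_dir)  # deletes all cached tokenizer and model files
        print(f"Removed cache at {cache_dir}")
    else:
        print(f"No cache found at {cache_dir}")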

wailoktam commented 4 years ago

I solved it myself. It was due to moving from pytorch_transformers to transformers: the way arguments are passed to the BERT models is different between the two packages.
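For anyone hitting the same TypeError: it typically comes from mixing positional and keyword arguments when the positional order of forward() differs between library versions, so a keyword ends up colliding with a value already bound positionally. A minimal sketch of a keyword-only call that sidesteps this, assuming a reasonably recent transformers version (3.x or later, where the tokenizer can be called directly); the example sentence and variable names are only illustrative:

    import torch
    from transformers import BertJapaneseTokenizer, BertForMaskedLM

    tokenizer = BertJapaneseTokenizer.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')
    model = BertForMaskedLM.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')
    model.eval()

    # Tokenize an example sentence; return_tensors="pt" yields PyTorch tensors.
    inputs = tokenizer("今日はいい天気です。", return_tensors="pt")

    # An old-style call such as
    #   model(inputs["input_ids"], inputs["token_type_ids"], attention_mask=...)
    # can raise "forward() got multiple values for argument 'attention_mask'"
    # because the second positional slot is now attention_mask.
    # Passing everything by keyword avoids the collision:
    with torch.no_grad():
        outputs = model(input_ids=inputs["input_ids"],
                        attention_mask=inputs["attention_mask"],
                        token_type_ids=inputs["token_type_ids"])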