Question
Hi,
I fine-tuned a German uncased BERT model ("dbmdz/bert-base-german-uncased") for NER using GermEval 2014 plus some custom examples, and converted it to Hugging Face using this example. After the conversion, the tokenization and predictions differ from what the FARM Inferencer produces.
Example sentence: "Ich heiße Peter und wohne in Wilhelmshaven" ("My name is Peter and I live in Wilhelmshaven").
Prediction with FARM Inferencer
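(For reference, this is roughly how I run the FARM prediction; the model directory name is just a placeholder, not the actual path:)

```python
from farm.infer import Inferencer

# "saved_models/german-ner" is a placeholder for the directory
# containing the fine-tuned FARM model.
model = Inferencer.load("saved_models/german-ner", task_type="ner", gpu=False)

# Run NER inference on the example sentence.
basic_texts = [{"text": "Ich heiße Peter und wohne in Wilhelmshaven"}]
result = model.inference_from_dicts(dicts=basic_texts)
print(result)
```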
Prediction with the converted model and the Hugging Face pipeline
```python
nlp = pipeline('ner', model=model, tokenizer=tokenizer, grouped_entities=True)
```
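(For completeness, the full setup looks roughly like this; "converted_model" is a placeholder for the output directory of the conversion:)

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# "converted_model" is a placeholder path for the directory produced
# by the FARM-to-Transformers conversion.
model = AutoModelForTokenClassification.from_pretrained("converted_model")
tokenizer = AutoTokenizer.from_pretrained("converted_model")

nlp = pipeline('ner', model=model, tokenizer=tokenizer, grouped_entities=True)
print(nlp("Ich heiße Peter und wohne in Wilhelmshaven"))
```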
What could cause such a problem? It looks like wrong WordPiece tokenization (e.g. "Peter ##e") and, as a result, wrong classification.
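(One check that might narrow this down: comparing the raw tokenizer output, since an uncased model relies on the input being lowercased before the WordPiece split. A sketch, with "converted_model" again a placeholder path:)

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("converted_model")

# If lowercasing was lost during conversion, capitalized words like "Peter"
# may be split into unexpected WordPieces by the uncased vocabulary.
print(tokenizer.tokenize("Ich heiße Peter und wohne in Wilhelmshaven"))
print(getattr(tokenizer, "do_lower_case", None))
```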
Thanks in advance for any help!
Markus
Background: I would like to use Hugging Face instead of the FARM Inferencer because prediction on CPU is faster (from 0.5 s down to 0.05 s on my i7).