studio-ousia / luke

LUKE -- Language Understanding with Knowledge-based Embeddings
Apache License 2.0

Training using HF Transformers #156

Open NiteshMethani opened 2 years ago

NiteshMethani commented 2 years ago

Hi authors, thank you for sharing this interesting piece of work. I am trying this model on a custom NER dataset and comparing it with other BERT variants. To that end, I was wondering if you could provide instructions on how to fine-tune this model on a custom NER dataset, and what the dataset format should be.

Also, instructions on pre-training the base model (without any head) on an unlabeled corpus would be really useful. I saw some instructions around pre-training with the allennlp library, but there is some friction there. Since HF Transformers is now a fairly stable and widely popular library, I would appreciate it if you could provide instructions on using LUKE with HF.
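
For context, here is a minimal sketch of what span-based NER with the HF port of LUKE can look like, using the `LukeForEntitySpanClassification` head. The checkpoint name, the example spans, and the label ids are assumptions for illustration; the actual fine-tuning recipe and dataset format for a custom corpus would still need guidance from the authors.

```python
import torch
from transformers import LukeTokenizer, LukeForEntitySpanClassification

# Assumed checkpoint; any LUKE checkpoint on the Hub should work similarly.
checkpoint = "studio-ousia/luke-large-finetuned-conll-2003"
tokenizer = LukeTokenizer.from_pretrained(checkpoint)
model = LukeForEntitySpanClassification.from_pretrained(checkpoint)

text = "Beyoncé lives in Los Angeles"
# Candidate spans are given as character-level (start, end) offsets into the text.
entity_spans = [(0, 7), (17, 28)]

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

# Inference: pick the highest-scoring entity type for each candidate span.
predictions = outputs.logits.argmax(-1).squeeze(0).tolist()
for span, label_id in zip(entity_spans, predictions):
    print(text[span[0]:span[1]], model.config.id2label[label_id])

# For fine-tuning, the same inputs plus a `labels` tensor (one label id per
# candidate span) produce a loss that can be backpropagated as usual.
labels = torch.tensor([[1, 2]])  # hypothetical label ids for the two spans
loss = model(**inputs, labels=labels).loss
loss.backward()
```

A custom dataset would then mainly need sentences, character-level span offsets, and one label per span in a format like the above; wrapping this in a standard PyTorch/`Trainer` training loop should follow the usual HF pattern.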