eric-haibin-lin opened 5 years ago
Hi Eric,
I am trying to train a classifier based on the BERT fine-tuning tutorial. I am currently using a p3.2xlarge EC2 GPU instance with the following characteristics:
@eric-haibin-lin
Hi @rfigueror1
Sorry for the late reply. I was busy with a deadline.
Did you run the tutorial as is, or did you make some modification (e.g. sequence length) for your own dataset? What version of MXNet did you use? You can check with `pip list | grep mxnet`.
I'll also try to reproduce your setting and get back to you.
Hi Eric,
Thanks for the reply, I already solved the issue by decreasing my batch size.
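For anyone hitting the same out-of-memory error, the fix above can be sketched in plain Python (a minimal illustration only; `run_step` is a hypothetical stand-in for one training step, not part of the tutorial): halve the batch size until a step fits in memory.

```python
def find_fitting_batch_size(run_step, batch_size, min_batch_size=1):
    """Halve batch_size until run_step succeeds (or the minimum is reached)."""
    while batch_size >= min_batch_size:
        try:
            run_step(batch_size)   # one forward/backward pass at this size
            return batch_size
        except MemoryError:        # a real GPU OOM surfaces differently per framework
            batch_size //= 2
    raise MemoryError("even the minimum batch size does not fit")

# Toy stand-in: pretend anything above 8 samples exhausts memory.
def fake_step(bs):
    if bs > 8:
        raise MemoryError

print(find_fitting_batch_size(fake_step, 32))  # -> 8
```

In practice the same trade-off applies: a smaller batch uses less GPU memory at the cost of noisier gradients and longer epochs.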
Best
Hello all,
We just released an MXNet/Gluon port of BERT in GluonNLP v0.5!
We converted the pre-trained TF models, and our port generates the same output as the TF implementation. The BERT model and vocabulary can be downloaded automatically with the `get_model()` API. We also added a tutorial covering how to fine-tune with BERT step by step. We have ongoing work on training BERT with multiple GPUs and gradient accumulation, and we plan to include the recently released BERT models and apply them to other downstream tasks, too. Here is the link: http://gluon-nlp.mxnet.io/model_zoo/bert/index.html
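The gradient accumulation mentioned above can be sketched framework-free (a conceptual illustration, not the GluonNLP implementation; in Gluon you would instead delay `trainer.step()` across micro-batches): accumulate gradients over several small batches, then apply one update with their average, so a large effective batch fits in limited GPU memory.

```python
def sgd_accumulated(grad_fn, batches, lr=0.1, accum=4, w=0.0):
    """One SGD update per `accum` micro-batches, using the averaged gradient."""
    acc, n = 0.0, 0
    for batch in batches:
        acc += grad_fn(w, batch)       # accumulate instead of stepping
        n += 1
        if n == accum:
            w -= lr * acc / accum      # single update with averaged gradient
            acc, n = 0.0, 0
    return w

# Toy loss (w - target)^2 per sample: gradient is 2 * (w - mean(batch)).
grad = lambda w, b: 2 * (w - sum(b) / len(b))
batches = [[1.0], [3.0], [1.0], [3.0]] * 8   # targets average to 2.0
w = sgd_accumulated(grad, batches, lr=0.25, accum=4)
```

With `accum=4`, four micro-batches of one sample behave like one batch of four: `w` converges toward the overall target mean of 2.0.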
Thank you @jacobdevlin-google so much for releasing the pre-trained BERT model and code to the research community.
Haibin - GluonNLP team