Closed abhisheksgumadi closed 4 years ago
Ok, I confirmed it does :) The code uses nn.DataParallel which is cool.
Can you confirm whether any other change is needed for it to run with nn.DistributedDataParallel?
Actually, it was made before the HF repo :)
a) Basically yes, but you have to write your own fine-tuning code, like classify.py. I rarely maintain this repo nowadays because the HF code is the de facto standard.
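For reference, the usual fine-tuning pattern looks like the sketch below: load the pretrained encoder weights, attach a fresh classification head, and train end to end. This is a minimal illustration, not this repo's exact API; the checkpoint path and the `Classifier` name are hypothetical, and a stand-in `nn.Linear` encoder is used so the snippet runs on its own.

```python
import torch
import torch.nn as nn

# Minimal sketch (not this repo's exact API): wrap a pretrained encoder
# with a new, randomly initialized classification head.
class Classifier(nn.Module):
    def __init__(self, encoder, hidden_dim, n_classes):
        super().__init__()
        self.encoder = encoder                        # pretrained body
        self.head = nn.Linear(hidden_dim, n_classes)  # new task head

    def forward(self, x):
        return self.head(self.encoder(x))

# Stand-in encoder for illustration; in practice you would build the BERT
# model and load its weights, e.g.:
#   encoder.load_state_dict(torch.load("pretrain_checkpoint.pt"), strict=False)
encoder = nn.Linear(16, 32)
model = Classifier(encoder, hidden_dim=32, n_classes=2)
logits = model(torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 2])
```

`strict=False` is the usual choice here, since the head's parameters are not present in the pretraining checkpoint.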
b) This code uses multiple GPUs by default via nn.DataParallel.
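As a quick illustration, nn.DataParallel is just a wrapper around the model; moving to DistributedDataParallel takes more than swapping the wrapper. The sketch below uses a toy model (not this repo's code) and runs on CPU, where DataParallel simply falls through to the wrapped module.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2))

# DataParallel: single process, splits each batch across the visible GPUs.
# With no GPUs available it just calls the wrapped module directly.
dp_model = nn.DataParallel(model)
out = dp_model(torch.randn(4, 8))
print(out.shape)  # torch.Size([4, 2])

# DistributedDataParallel needs more than a wrapper swap: one process per
# GPU, torch.distributed.init_process_group(...) called before constructing
# the wrapper, and a DistributedSampler so each rank sees a distinct data
# shard. It is typically launched with something like
#   torchrun --nproc_per_node=<n_gpus> train.py
# (sketch only; not part of this repo).
```

DDP is generally faster than DataParallel because gradients are all-reduced across processes instead of the model being replicated every forward pass.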
Hello,
Thanks for the excellent work in compressing the HF code in a single repo for BERT.
Just a couple of questions:
a) Is it possible to load pretrained BERT weights and then fine-tune on top of them on my own dataset? b) Does this support multi-GPU training?
Thanks, Abhishek