timespaceuniverse closed this issue 4 years ago
@sxjscience is working on it
@lilongyue You can try the new version started in https://github.com/dmlc/gluon-nlp/pull/1225. We are implementing the pretraining part and will open a PR once it is ready.
Description
The ALBERT model is pretty small and interesting. An MXNet-based pretraining implementation should be helpful to a lot of people!
References