dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0

difference between gluonnlp 0.10.0 and gluonnlp 1.0.0 RoBERTaModel? #1563

Closed (makua-bernal closed 2 years ago)

makua-bernal commented 3 years ago

I'm working on converting a RoBERTa model to gluonnlp 0.10.0 with mxnet 1.7.0.

I managed to get it working in gluonnlp 1.0.0 with mxnet 2.0.0, where the hidden-layer activations match the source model, but in gluonnlp 0.10.0 with mxnet 1.7.0 they differ very slightly.

The discrepancy starts in the first layer, so I'm assuming it has something to do with the embeddings.

I could have made a mistake somewhere, but I'm wondering if there's a simpler explanation.
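
For reference, the kind of layer-by-layer check described above can be sketched as follows. This is a minimal illustration, not the code actually used: it assumes the activations from both models have already been exported as NumPy arrays, and the helper name `first_divergent_layer`, the tolerances, and the toy data are all hypothetical.

```python
import numpy as np


def first_divergent_layer(ref_acts, test_acts, rtol=1e-5, atol=1e-6):
    """Return (layer_index, max_abs_diff) for the first layer whose
    activations differ beyond the tolerances, or None if all layers match.

    ref_acts / test_acts: sequences of arrays, one per layer, in order.
    The tolerances are illustrative defaults, not recommended values.
    """
    for i, (ref, test) in enumerate(zip(ref_acts, test_acts)):
        ref = np.asarray(ref, dtype=np.float64)
        test = np.asarray(test, dtype=np.float64)
        if ref.shape != test.shape:
            raise ValueError(f"layer {i}: shape {ref.shape} vs {test.shape}")
        if not np.allclose(ref, test, rtol=rtol, atol=atol):
            return i, float(np.max(np.abs(ref - test)))
    return None


# Toy stand-in data (hypothetical): three "layers" of shape (batch, seq, hidden).
rng = np.random.default_rng(0)
ref_acts = [rng.standard_normal((2, 4, 8)).astype(np.float32) for _ in range(3)]
test_acts = [a.copy() for a in ref_acts]
test_acts[1] = test_acts[1] + np.float32(1e-3)  # inject a small offset in layer 1

result = first_divergent_layer(ref_acts, test_acts)
print(result)
```

Running the same comparison on the embedding output alone (before the first encoder layer) would help confirm whether the mismatch originates in the embeddings or in the first layer itself.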