Closed: davisliang closed this issue 4 years ago
Description
Doing a forward pass with BERT-base (using the same parameters) in fp16 gives very different results from fp32. This happens with GluonNLP versions after 0.6.0.
Error Message
from float32:
[[[-0.14241205  0.13353725 -0.12907065 ... -0.35967964 -0.05622258  0.36050138]
  [-0.350648    0.10419771  0.6244457  ... -0.17610289  0.48340237  0.06443504]
  [-0.24513094 -0.15731683  0.69451946 ... -0.5654467  -0.0894002  -0.18564378]
  [-0.82478666 -0.9119223  -0.65607107 ...  0.50742483 -0.19388783 -0.16587636]
  [ 0.8766523   0.03524842 -0.12331446 ...  0.2720161  -0.6369005  -0.1585012 ]]]
<NDArray 1x5x768 @gpu(0)>
from float16:
[[[-0.4473   0.03326 -0.06555 ... -0.4893  -0.1052   0.5503 ]
  [-0.9287  -0.04443  0.9863  ... -0.7188  -0.1516   0.0721 ]
  [-0.6553  -0.2798   0.6636  ... -0.526   -0.5      0.03748]
  [-0.726   -0.81    -0.05014 ...  0.2372  -0.447    0.04047]
  [-1.035   -0.578    0.5273  ... -0.4065  -0.3872   0.5005 ]]]
<NDArray 1x5x768 @gpu(0)>
To Reproduce
(If you developed your own code, please provide a short script that reproduces the error. For existing examples, please provide a link.)
Steps to reproduce
(Paste the commands you ran that produced the error.)
# float32
import gluonnlp as nlp
import mxnet as mx

model, vocab = nlp.model.get_model('bert_12_768_12',
                                   dataset_name='book_corpus_wiki_en_uncased',
                                   use_classifier=False, use_decoder=False,
                                   ctx=mx.gpu(0))
tokenizer = nlp.data.BERTTokenizer(vocab, lower=True)
transform = nlp.data.BERTSentenceTransform(tokenizer, max_seq_length=512,
                                           pair=False, pad=False)
sample = transform(['Hello world!'])
model.cast('float32')
words = mx.nd.array([sample[0]]).as_in_context(mx.gpu(0))
valid_len = mx.nd.array([sample[1]]).as_in_context(mx.gpu(0)).astype('float32')
segments = mx.nd.array([sample[2]]).as_in_context(mx.gpu(0)).astype('float32')
seq_encoding, cls_encoding = model(words, segments, valid_len)
# float16
import gluonnlp as nlp
import mxnet as mx

model, vocab = nlp.model.get_model('bert_12_768_12',
                                   dataset_name='book_corpus_wiki_en_uncased',
                                   use_classifier=False, use_decoder=False,
                                   ctx=mx.gpu(0))
tokenizer = nlp.data.BERTTokenizer(vocab, lower=True)
transform = nlp.data.BERTSentenceTransform(tokenizer, max_seq_length=512,
                                           pair=False, pad=False)
sample = transform(['Hello world!'])
model.cast('float16')
words = mx.nd.array([sample[0]]).as_in_context(mx.gpu(0))
valid_len = mx.nd.array([sample[1]]).as_in_context(mx.gpu(0)).astype('float16')
segments = mx.nd.array([sample[2]]).as_in_context(mx.gpu(0)).astype('float16')
seq_encoding, cls_encoding = model(words, segments, valid_len)
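As a sanity check, the gap between the two runs can be quantified directly. This is a minimal sketch, not part of the original report; the variable names seq_encoding_fp32 and seq_encoding_fp16 are hypothetical placeholders for the sequence encodings produced by the two scripts above:

# Compare the float32 and float16 encodings element-wise
# (seq_encoding_fp32 / seq_encoding_fp16 are assumed to hold the two outputs).
diff = (seq_encoding_fp32 - seq_encoding_fp16.astype('float32')).abs()
print('max abs diff: ', diff.max().asscalar())
print('mean abs diff:', diff.mean().asscalar())

A well-behaved fp16 forward pass typically agrees with fp32 to within a few hundredths; differences close to 1.0, as in the arrays above, point to a numerical problem rather than ordinary rounding error.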
Environment
Just your average EC2 machine with:
pip install mxnet-cu102
We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:
curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python # paste outputs here
Did you try turning on safe accumulation (MXNET_SAFE_ACCUMULATION=1)?
Thanks, Haibin, this solved the issue! I will add a comment to the RFC to enable this by default.
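For readers hitting the same problem, here is a minimal sketch of applying the fix, assuming "safe accumulation" refers to MXNet's MXNET_SAFE_ACCUMULATION environment variable (which makes fp16 reductions accumulate in fp32):

# Set the flag before the fp16 forward pass runs, either in the shell
#   export MXNET_SAFE_ACCUMULATION=1
# or from Python before importing mxnet:
import os
os.environ['MXNET_SAFE_ACCUMULATION'] = '1'
import mxnet as mx
import gluonnlp as nlp
# ... then run the float16 reproduction script above unchanged.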