InvalidArgumentError, Found Inf or NaN global norm.

google-research / bert

TensorFlow code and pre-trained models for BERT

https://arxiv.org/abs/1810.04805

Apache License 2.0


InvalidArgumentError, Found Inf or NaN global norm. #472

Open CodeXiaoLingYun opened 5 years ago

CodeXiaoLingYun commented 5 years ago

No, I am seeing the same error. I also used the same function (tf.clip_by_global_norm), but I found that the learning rate and the clipping function are not the key reasons. When I generated the vocab, I set its size to 4682, and vocab_size is also 4682 in train.py. Likewise, I do not know whether decreasing the batch size would help.
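For anyone following along, here is a minimal pure-Python sketch of what the global norm is and why a single NaN or Inf gradient value is enough to trigger this message. It is not the code from train.py and it does not call TensorFlow; `global_norm`, `clip_by_global_norm`, and `out_of_range_ids` are hypothetical helper names used only for illustration. The scaling rule (`clip_norm / max(norm, clip_norm)`) follows the documented behaviour of tf.clip_by_global_norm. The last helper assumes that token ids must lie in `[0, vocab_size)`, which is one thing that may be worth ruling out when the vocab size is set by hand.

```python
import math


def global_norm(grads):
    """Square root of the sum of squares over every gradient value."""
    return math.sqrt(sum(g * g for grad in grads for g in grad))


def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients so that their global norm is at most clip_norm.

    Raises ValueError if the global norm is NaN or Inf, which happens as
    soon as any single gradient value is NaN or Inf.
    """
    norm = global_norm(grads)
    if not math.isfinite(norm):
        raise ValueError("Found Inf or NaN global norm.")
    scale = clip_norm / max(norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads], norm


def out_of_range_ids(token_ids, vocab_size):
    """Return the ids that fall outside [0, vocab_size)."""
    return [i for i in token_ids if i < 0 or i >= vocab_size]


if __name__ == "__main__":
    clipped, norm = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)
    print("global norm:", norm)
    print("clipped gradients:", clipped)

    # Hypothetical batch of ids checked against the vocab size from the post.
    print("bad ids:", out_of_range_ids([0, 17, 4681, 4682], vocab_size=4682))
```

Because the norm is computed before any clipping, clipping cannot repair gradients that are already NaN or Inf. That would be consistent with the observation above that changing the clipping function does not make the error go away.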

In train.py

the error is

CodeXiaoLingYun commented 5 years ago

The model is a seq2seq model!!!

CodeXiaoLingYun commented 5 years ago

I found one questionable point: GPU:0. I think the error may be related to my GPU, so I am trying to add code like this: [image]. I do not know whether it will help, but I want to try.