[CI][Flaky test][nightly] flaky test in estimator nightly test

test_sentiment_rnn.py appears to be flaky in the nightly estimator tests. See the pipeline: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/master/342/pipeline

The test was modified from https://github.com/d2l-ai/d2l-en/blob/master/chapter_natural-language-processing/sentiment-analysis-rnn.md and uses the following setup:

Dataset: IMDB

Optimizer: Adam

Learning rate: 0.01

Initializer: Xavier

Platform: Ubuntu GPU (AWS p3.2xlarge)

I was able to reproduce the failure by running this test 1000 times locally. The accuracy in the failing run below is abnormal: after epoch 0 it drops to about 0.5 and stays there. Other runs seem fine, with accuracy increasing from 0.8 to 0.9 over 5 epochs.
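For anyone who wants to repeat the reproduction, the repeated runs can be scripted roughly as follows. This is only a sketch: the helper name `count_failures` is hypothetical, and it assumes the test script exits with a non-zero status when its assertion fails, which is what an uncaught AssertionError does in a plain Python script.

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run `cmd` `runs` times and return the indices of runs that exited non-zero."""
    failed = []
    for i in range(runs):
        # Capture output so a long loop does not flood the terminal;
        # an uncaught AssertionError makes the interpreter exit with status 1.
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            failed.append(i)
    return failed


# Hypothetical invocation (path and run count are assumptions):
# failures = count_failures([sys.executable, "test_sentiment_rnn.py"], runs=1000)
# print("failed runs:", failures)
```

Since each epoch takes about a minute on the GPU, a loop of this size runs for a long time; keeping the captured output of failing runs (available as `result.stdout`) would make it easier to compare them with the log below.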

[Epoch 0] Finished in 61.230s, train accuracy: 0.7166, train softmaxcrossentropyloss: 0.5390, validation accuracy: 0.8188, validation softmaxcrossentropyloss: 0.4100

[Epoch 1] Finished in 60.762s, train accuracy: 0.5248, train softmaxcrossentropyloss: 0.6945, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7103

[Epoch 2] Finished in 61.041s, train accuracy: 0.5001, train softmaxcrossentropyloss: 0.7150, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7381

[Epoch 3] Finished in 60.868s, train accuracy: 0.5078, train softmaxcrossentropyloss: 0.7176, validation accuracy: 0.5044, validation softmaxcrossentropyloss: 0.6928

[Epoch 4] Finished in 60.850s, train accuracy: 0.5050, train softmaxcrossentropyloss: 0.7097, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7254

Traceback (most recent call last):

  File "test_sentiment_rnn.py", line 287, in <module>

    test_estimator_gpu(**kwargs)

  File "test_sentiment_rnn.py", line 268, in test_estimator_gpu

    assert acc.get()[1] > 0.70

AssertionError
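One observation about the log above: for a two-class problem, a model that outputs probability 0.5 for each class has a cross-entropy loss of ln 2 ≈ 0.6931, and the losses from epoch 1 onward (0.69 to 0.74) sit close to that value while accuracy sits at 0.5. The sketch below parses the log lines and flags such epochs. The function name `collapsed_epochs` and the tolerance `acc_tol` are assumptions made for illustration; they are not part of the test.

```python
import math
import re

# Log lines copied from the failing run reported above.
LOG = """\
[Epoch 0] Finished in 61.230s, train accuracy: 0.7166, train softmaxcrossentropyloss: 0.5390, validation accuracy: 0.8188, validation softmaxcrossentropyloss: 0.4100
[Epoch 1] Finished in 60.762s, train accuracy: 0.5248, train softmaxcrossentropyloss: 0.6945, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7103
[Epoch 2] Finished in 61.041s, train accuracy: 0.5001, train softmaxcrossentropyloss: 0.7150, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7381
[Epoch 3] Finished in 60.868s, train accuracy: 0.5078, train softmaxcrossentropyloss: 0.7176, validation accuracy: 0.5044, validation softmaxcrossentropyloss: 0.6928
[Epoch 4] Finished in 60.850s, train accuracy: 0.5050, train softmaxcrossentropyloss: 0.7097, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7254
"""

PATTERN = re.compile(
    r"\[Epoch (\d+)\].*?train accuracy: ([\d.]+), "
    r"train softmaxcrossentropyloss: ([\d.]+), "
    r"validation accuracy: ([\d.]+), "
    r"validation softmaxcrossentropyloss: ([\d.]+)"
)

# Cross-entropy of a uniform prediction over two classes.
CHANCE_LOSS = math.log(2)


def collapsed_epochs(log, acc_tol=0.03):
    """Return epochs where train and validation accuracy are both within acc_tol of 0.5."""
    flagged = []
    for match in PATTERN.finditer(log):
        epoch = int(match.group(1))
        train_acc = float(match.group(2))
        val_acc = float(match.group(4))
        if abs(train_acc - 0.5) <= acc_tol and abs(val_acc - 0.5) <= acc_tol:
            flagged.append(epoch)
    return flagged


print("chance-level loss:", round(CHANCE_LOSS, 4))
print("collapsed epochs:", collapsed_epochs(LOG))
```

On this log the check flags epochs 1 through 4, which suggests the model stopped discriminating between the classes after the first epoch and never recovered. It does not by itself identify the cause.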

apache/mxnet issue #15199