apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

[CI test failure] test_fast_lars fails on windows gpu #16566

Open aaronmarkham opened 4 years ago

aaronmarkham commented 4 years ago

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwindows-gpu/detail/PR-16550/3/pipeline/

At C:\jenkins_slave\workspace\ut-python-gpu\ci\windows\test_py2_gpu.ps1:29

ptrendx commented 4 years ago

@Caenorst

haojin2 commented 4 years ago

Happening again at http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwindows-gpu/detail/PR-15906/6/pipeline

======================================================================
FAIL: test_operator_gpu.test_fast_lars
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\../unittest\common.py", line 177, in test_new
    orig_test(*args, **kwargs)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\test_operator_gpu.py", line 329, in test_fast_lars
    check_fast_lars(w_dtype, g_dtype, shapes, ctx, tol1, tol2)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\test_operator_gpu.py", line 311, in check_fast_lars
    assert_almost_equal(ref_new_lrs.asnumpy(), mx_new_lrs.asnumpy(), atol=tol2, rtol=tol2)
  File "C:\jenkins_slave\workspace\ut-python-gpu\windows_package\python\mxnet\test_utils.py", line 627, in assert_almost_equal
    raise AssertionError(msg)
AssertionError: 
Items are not equal:
Error 1.279773 exceeds tolerance rtol=1.000000e-06, atol=1.000000e-06 (mismatch 1.851852%).
Location of maximum error: (14,), a=0.01080851, b=0.01080980
 ACTUAL: array([0.00030465, 0.0004317 , 0.00063919, ..., 0.00057   , 0.00057802,
       0.00037339], dtype=float32)
 DESIRED: array([0.00030465, 0.0004317 , 0.00063919, ..., 0.00057   , 0.00057802,
       0.00037339], dtype=float32)
-------------------- >> begin captured stdout << ---------------------
*** Maximum errors for vector of size 54:  rtol=1e-06, atol=1e-06
  1: Error 1.279773  Location of error: (14,), a=0.01080851, b=0.01080980
--------------------- >> end captured stdout << ----------------------
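
For anyone reading the numbers in the assertion message: the sketch below recomputes them from the printed values. It assumes the reported "Error" is the ratio |a - b| / (atol + rtol * |b|), which is how I understand `assert_almost_equal` to score a mismatch; the helper name `max_violation` is made up for this illustration and is not part of `mxnet.test_utils`. It also assumes the mismatch percentage is simply the share of failing elements, which fits one bad element out of 54.

```python
import numpy as np

def max_violation(a, b, rtol, atol):
    """Hypothetical helper: largest |a - b| / (atol + rtol * |b|) and its index.

    A ratio above 1.0 means that element is outside the tolerance.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    ratio = np.abs(a - b) / (atol + rtol * np.abs(b))
    idx = int(np.argmax(ratio))
    return float(ratio[idx]), idx

# Values copied from "Location of maximum error" in the log above.
a_val, b_val = 0.01080851, 0.01080980
ratio, _ = max_violation([a_val], [b_val], rtol=1e-6, atol=1e-6)

# One failing element in a vector of size 54 (size taken from captured stdout).
mismatch_pct = 100.0 * 1 / 54

print("violation ratio: %.4f" % ratio)
print("mismatch: %.6f%%" % mismatch_pct)
```

The recomputed ratio comes out close to the logged 1.279773 but not identical, presumably because the log prints `a` and `b` rounded to 8 decimal places while the test compares the full float32 values. Either way the absolute difference is only about 1.3e-6, so the failure is a marginal tolerance miss rather than a grossly wrong result.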

-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=374849959 to reproduce.
--------------------- >> end captured logging << ---------------------
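
The captured logging gives the seed needed to reproduce this run. A minimal sketch of how one might set it up from Python follows, assuming nose is installed, the working directory is the repository root, and a GPU build of MXNet is available; `repro_command` is a hypothetical helper, and the test path is taken from the traceback with forward slashes.

```python
import os

def repro_command(seed, test_file, test_name):
    """Hypothetical helper: build the environment and nose command for one test."""
    # Copy the current environment and pin the seed named in the CI log.
    env = dict(os.environ, MXNET_TEST_SEED=str(seed))
    # nose selects a single test function with the file.py:function syntax.
    cmd = ["nosetests", "-v", "{}:{}".format(test_file, test_name)]
    return env, cmd

env, cmd = repro_command(
    374849959,
    "tests/python/gpu/test_operator_gpu.py",
    "test_fast_lars",
)
print("MXNET_TEST_SEED=" + env["MXNET_TEST_SEED"])
print(" ".join(cmd))

# To actually run the test (needs a GPU machine), something like:
#   import subprocess
#   subprocess.call(cmd, env=env)
```

Since the failure has so far only been reported on the Windows GPU pipeline, the same seed may well pass on other platforms.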