apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Backwards compatibility out of bounds for 1.2.1 #14234

Open marcoabreu opened 5 years ago

marcoabreu commented 5 years ago

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-backwards-compatibility-checker/detail/restricted-backwards-compatibility-checker/380/pipeline

INFO:root:=================================
INFO:root:Fetching files for MXNet version : 1.2.1 and model lenet_gluon_hybrid_export_api
[06:36:44] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v1.2.1. Attempting to upgrade...
[06:36:44] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!
Traceback (most recent call last):
  File "model_backwards_compat_inference.py", line 136, in <module>
    test_lenet_gluon_hybrid_imports_api()
  File "model_backwards_compat_inference.py", line 95, in test_lenet_gluon_hybrid_imports_api
    assert_almost_equal(old_inference_results.asnumpy(), output.asnumpy(), rtol=rtol_default, atol=atol_default)
  File "/work/mxnet/python/mxnet/test_utils.py", line 495, in assert_almost_equal
    raise AssertionError(msg)
AssertionError: 
Items are not equal:
Error 1962.383789 exceeds tolerance rtol=0.000010, atol=0.000010.  Location of maximum error:(15, 0), a=0.061246, b=0.040821
 a: array([[ 0.03364218,  0.24863665],
       [-0.03897328,  0.28473783],
       [ 0.00993963,  0.18869999],...
 b: array([[ 0.03364221,  0.24863653],
       [-0.03897329,  0.28473788],
       [ 0.00993963,  0.18870012],...
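
For readers puzzling over the "Error 1962.383789" figure: it appears to be the ratio of the absolute difference to the allowed tolerance, not the raw difference. The sketch below assumes the metric is |a - b| / (atol + rtol * |b|). That formula is an assumption that happens to match the numbers printed in the logs in this issue; it has not been checked against `test_utils.py` here. The helper name `tolerance_violation` is made up for illustration.

```python
def tolerance_violation(a, b, rtol=1e-5, atol=1e-5):
    """Ratio of |a - b| to the allowed tolerance atol + rtol * |b|.

    Assumed formula, chosen because it is consistent with the logged numbers.
    A value above 1.0 means the pair is outside the tolerance.
    """
    return abs(a - b) / (atol + rtol * abs(b))


# Values printed at "Location of maximum error" in the two logs in this issue.
first = tolerance_violation(0.061246, 0.040821)     # logged error: 1962.383789
second = tolerance_violation(-0.441335, -0.315189)  # logged error: 9591.478516

# A pair from the first printed row, which agrees to float32 precision.
close_pair = tolerance_violation(0.24863665, 0.24863653)

print(first, second, close_pair)
```

Under this assumption the two maxima come out at roughly 1962 and 9591 (small deviations from the logged values are expected because the log prints `a` and `b` rounded to six decimals), while the rows shown in the array previews stay well below 1.0. In other words, most entries match and a few entries are off by tens of percent.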

mxnet-label-bot commented 5 years ago

Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Bug

frankfliu commented 5 years ago

@mxnet-label-bot add [Bug]

piyushghai commented 5 years ago

@marcoabreu I haven't seen this fail again. Perhaps a change went in that broke it temporarily, and it now seems to be fixed. http://jenkins.mxnet-ci.amazon-ml.com/job/restricted-backwards-compatibility-checker/

Can this issue be closed?

marcoabreu commented 5 years ago

No, MXNet yields different results for the same input, so we should track down why.
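
One possible starting point for tracking this down, sketched in plain Python: list every entry that falls outside the tolerance instead of only the maximum, to see whether the mismatch is confined to a few rows. The helper `violating_entries` is hypothetical, and it assumes the same tolerance rule as the log message suggests (|a - b| compared against atol + rtol * |b|). The first three rows are the values shown in the log above; the last row is a stand-in for row 15, where only column 0 is known from the log and column 1 is a placeholder.

```python
def violating_entries(a, b, rtol=1e-5, atol=1e-5):
    """Return (row, col, a_val, b_val) for entries outside the tolerance.

    Assumes an entry passes when |a - b| <= atol + rtol * |b|.
    """
    bad = []
    for i, (row_a, row_b) in enumerate(zip(a, b)):
        for j, (x, y) in enumerate(zip(row_a, row_b)):
            if abs(x - y) > atol + rtol * abs(y):
                bad.append((i, j, x, y))
    return bad


old_results = [
    [0.03364218, 0.24863665],
    [-0.03897328, 0.28473783],
    [0.00993963, 0.18869999],
    [0.061246, 0.0],  # stand-in for row 15; column 1 is a placeholder, not from the log
]
new_results = [
    [0.03364221, 0.24863653],
    [-0.03897329, 0.28473788],
    [0.00993963, 0.18870012],
    [0.040821, 0.0],  # stand-in for row 15; column 1 is a placeholder, not from the log
]

for row, col, x, y in violating_entries(old_results, new_results):
    print(f"row {row}, col {col}: a={x}, b={y}")
```

With these inputs only the stand-in row is reported, which would point at specific samples rather than a uniform numerical drift. Running such a scan on the full outputs of a failing build would be needed to confirm that.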

vandanavk commented 5 years ago

@mxnet-label-bot add [CI]

lebeg commented 5 years ago

Another failure:

http://jenkins.mxnet-ci.amazon-ml.com/job/restricted-backwards-compatibility-checker/420/console

INFO:root:Fetching files for MXNet version : 1.2.1 and model lenet_gluon_save_params_api
Traceback (most recent call last):
  File "model_backwards_compat_inference.py", line 135, in <module>
    test_lenet_gluon_load_params_api()
  File "model_backwards_compat_inference.py", line 72, in test_lenet_gluon_load_params_api
    assert_almost_equal(old_inference_results.asnumpy(), output.asnumpy(), rtol=rtol_default, atol=atol_default)
  File "/work/mxnet/python/mxnet/test_utils.py", line 495, in assert_almost_equal
    raise AssertionError(msg)
AssertionError: 
Items are not equal:
Error 9591.478516 exceeds tolerance rtol=0.000010, atol=0.000010.  Location of maximum error:(17, 0), a=-0.441335, b=-0.315189
 a: array([[-0.40397233, -0.19248717],
       [-0.34466907, -0.15791757],
       [-0.39881065, -0.2201823 ],...
 b: array([[-0.4039724 , -0.19248715],
       [-0.3446689 , -0.15791774],
       [-0.39881057, -0.22018239],...

Maybe this should be tracked separately as it's a different failure?

lebeg commented 5 years ago

Created separate issue https://github.com/apache/incubator-mxnet/issues/14524

lebeg commented 5 years ago

This issue can probably be closed.

vdantu commented 5 years ago

@mxnet-label-bot update [flaky, gluon, ci]