apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

test_numpy_op.py:test_np_unary_bool_funcs hangs #18149

Open nickguletskii opened 4 years ago

nickguletskii commented 4 years ago

Description

It seems that test_numpy_op.py:test_np_unary_bool_funcs hangs in CI for hours until the job gets aborted. At least, I think it's test_np_unary_bool_funcs: test_np_true_divide usually runs just before it, and the logs show no output for test_np_unary_bool_funcs by the time the test suite becomes unresponsive.
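Since the hanging test is only inferred from log order, one possible way to confirm it is to have Python dump every thread's traceback once a run exceeds a time limit. The following is a minimal sketch using the standard library's faulthandler module; `run_with_hang_dump` and `slow_test_stand_in` are hypothetical names made up for illustration and are not part of the MXNet test suite, and the timeout values are arbitrary.

```python
import faulthandler
import tempfile
import time


def run_with_hang_dump(func, timeout_s, log_file):
    """Run func; if it is still running after timeout_s seconds, write the
    traceback of every thread to log_file. The process is not killed."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=log_file)
    try:
        return func()
    finally:
        # Disarm the watchdog so it cannot fire after func has returned.
        faulthandler.cancel_dump_traceback_later()


def slow_test_stand_in():
    # Hypothetical stand-in for a stalled test; NOT the real MXNet test.
    time.sleep(0.5)
    return "done"


# faulthandler writes straight to the file descriptor, so a real file is needed.
with tempfile.TemporaryFile(mode="w+") as log:
    result = run_with_hang_dump(slow_test_stand_in, timeout_s=0.1, log_file=log)
    log.seek(0)
    dump = log.read()

print(dump)
```

The dump names the function each thread was executing when the timeout expired, which would show whether the stall is really inside test_np_unary_bool_funcs or in something that runs around it.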

Occurrences

  1. http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-18142/1/pipeline/431 (Python3: GPU TVM_OP OFF)
  2. http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-17177/9/pipeline/417 (Python3: GPU)
  3. http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-17177/12/pipeline/431 (Python3: GPU TVM_OP OFF)
  4. http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-18067/3/pipeline/417 (Python3: GPU)
  5. Probably many more.

What have you tried to solve it?

1. 2.

leezu commented 4 years ago

Related https://github.com/apache/incubator-mxnet/issues/18090