Open Vikas-kum opened 5 years ago
Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended label(s): Test
@mxnet-label-bot [Test]
The profiler fix should be in this PR: https://github.com/apache/incubator-mxnet/pull/16160
But yeah, we can disable these tests until the fixes for mkldnn_quantization and gluon_performance are done. In the long run, I think we may actually have to whitelist mkldnn_quantization from the test suite.
@sad- Thanks. I tried that, but it doesn't look like the profiler tests are passing.
Looks like there is more to fix here. The new error that came up was:
MXNetError: [19:29:32] /work/mxnet/3rdparty/dmlc-core/include/dmlc/thread_group.h:227: Check failed: auto_remove_ == false (1 vs. 0) :
Stack trace:
[bt] (0) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x32) [0x7effabb7aed2]
[bt] (1) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::ThreadGroup::Thread::joinable() const+0xf4) [0x7effae507734]
[bt] (2) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::profiler::Profiler::SetContinuousProfileDump(bool, float)+0x108) [0x7effae504d28]
[bt] (3) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::profiler::Profiler::SetConfig(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, float, bool)+0x95) [0x7effae505b95]
[bt] (4) /work/mxnet/python/mxnet/../../lib/libmxnet.so(MXSetProcessProfilerConfig+0x3d6) [0x7effaecb80a6]
[bt] (5) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7efffb8bae20]
[bt] (6) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7efffb8ba88b]
[bt] (7) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(_ctypes_callproc+0x49a) [0x7efffb8b501a]
[bt] (8) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(+0x9fcb) [0x7efffb8a8fcb]
Logs for reference: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/tutorials_nighly_fix/10/pipeline
Can you please try to provide fixes for the 3 tests? Then we can enable them in the nightly run. The MKLDNN test is most likely using the wrong binary (it currently uses the GPU binary, which has no mkldnn libraries).
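A quick way to check the "wrong binary" theory is to ask the installed MXNet build which features it was compiled with. This is only a sketch: it assumes an MXNet version that ships the `mxnet.runtime` module, and the helper name `mkldnn_enabled` is made up for illustration.

```python
def mkldnn_enabled():
    """Report whether the installed MXNet build has MKLDNN enabled.

    Returns True or False when MXNet can be imported, and None when it
    cannot (not installed, no mxnet.runtime module, or the native
    library fails to load).
    """
    try:
        # Assumption: this MXNet version provides mxnet.runtime.Features.
        from mxnet.runtime import Features
    except (ImportError, OSError):
        return None
    return bool(Features().is_enabled("MKLDNN"))


if __name__ == "__main__":
    print("MKLDNN enabled:", mkldnn_enabled())
```

If this prints `False` on the nightly worker, the mkldnn_quantization tutorial is probably running against a build without MKLDNN.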
Nightly tests are failing due to some tutorial tests. We fixed some. 3 tests were disabled from this file: tests/tutorials/test_tutorials.py (test_gluon_performance, test_python_profiler, test_mkldnn_quantization).
We need to uncomment the tests after the fixes are done in the tutorials.
https://github.com/apache/incubator-mxnet/pull/16179/files
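As an alternative to commenting the tests out, they could be marked as skipped with a reason, so they still show up in the nightly report and are harder to forget. The sketch below uses only the standard `unittest` module. The class name, the reason string, the test bodies and `run_suite` are hypothetical and are not taken from test_tutorials.py.

```python
import unittest

# Hypothetical reason text; point it at the tracking issue in practice.
SKIP_REASON = "disabled until the tutorial fix lands (see tracking issue)"


class TutorialTests(unittest.TestCase):
    @unittest.skip(SKIP_REASON)
    def test_python_profiler(self):
        # Placeholder body: the real test would run the tutorial notebook.
        self.fail("should not run while skipped")

    def test_still_enabled(self):
        # Placeholder for a tutorial test that stays enabled.
        self.assertTrue(True)


def run_suite():
    """Run the example tests and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TutorialTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_suite()
    for test, reason in outcome.skipped:
        print("skipped:", test.id(), "-", reason)
```

Re-enabling a test is then a one-line change (removing the decorator), and the skip reason is visible in the test output until that happens.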