apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Error Running GluonCV Transfer Learning Tutorial on Mac OSX #16956

Open zhanghang1989 opened 4 years ago

zhanghang1989 commented 4 years ago

Description

Running the GluonCV transfer learning tutorial on Mac OS X crashes with a segmentation fault. Tutorial: https://gluon-cv.mxnet.io/build/examples_classification/transfer_learning_minc.html

Error Message

Segmentation fault: 11

Stack trace:
  [bt] (0) 1   libmxnet.so                         0x000000011b29a2b0 mxnet::Storage::Get() + 4880
  [bt] (1) 2   libsystem_platform.dylib            0x00007fff70db4f5a _sigtramp + 26
  [bt] (2) 3   ???                                 0x0000000100007ffe 0x0 + 4295000062
  [bt] (3) 4   libmxnet.so                         0x000000011b5d5751 mxnet::Storage::Get() + 3393457
  [bt] (4) 5   libmxnet.so                         0x000000011b472dd5 mxnet::Storage::Get() + 1941045
  [bt] (5) 6   libmxnet.so                         0x000000011b473dba mxnet::Storage::Get() + 1945114
  [bt] (6) 7   libmxnet.so                         0x000000011b45a99f mxnet::Storage::Get() + 1841663
  [bt] (7) 8   libmxnet.so                         0x000000011abfae61 mxnet::io::ImdecodeImpl(int, bool, void*, unsigned long, mxnet::NDArray*) + 3073
  [bt] (8) 9   libmxnet.so                         0x000000011ab586c7 std::__1::enable_if<(__is_forward_iterator<mxnet::NDArray**>::value) && (is_constructible<mxnet::NDArray*, std::__1::iterator_traits<mxnet::NDArray**>::reference>::value), void>::type std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> >::assign<mxnet::NDArray**>(mxnet::NDArray**, mxnet::NDArray**) + 30295

Segmentation fault: 11

Stack trace:
  [bt] (0) 1   libmxnet.so                         0x000000011b29a2b0 mxnet::Storage::Get() + 4880
  [bt] (1) 2   libsystem_platform.dylib            0x00007fff70db4f5a _sigtramp + 26
  [bt] (2) 3   ???                                 0x00000ff200000ff1 0x0 + 17532056506353
  [bt] (3) 4   libmxnet.so                         0x000000011b5d5751 mxnet::Storage::Get() + 3393457
  [bt] (4) 5   libmxnet.so                         0x000000011b472dd5 mxnet::Storage::Get() + 1941045
  [bt] (5) 6   libmxnet.so                         0x000000011b473dba mxnet::Storage::Get() + 1945114
  [bt] (6) 7   libmxnet.so                         0x000000011b45a99f mxnet::Storage::Get() + 1841663
  [bt] (7) 8   libmxnet.so                         0x000000011abfae61 mxnet::io::ImdecodeImpl(int, bool, void*, unsigned long, mxnet::NDArray*) + 3073
  [bt] (8) 9   libmxnet.so                         0x000000011ab586c7 std::__1::enable_if<(__is_forward_iterator<mxnet::NDArray**>::value) && (is_constructible<mxnet::NDArray*, std::__1::iterator_traits<mxnet::NDArray**>::reference>::value), void>::type std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> >::assign<mxnet::NDArray**>(mxnet::NDArray**, mxnet::NDArray**) + 30295
libc++abi.dylib: terminating

Segmentation fault: 11
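
To make the backtrace above easier to read, here is a minimal sketch (not part of the original report) that parses `[bt]` lines and keeps only frames whose symbol offset is small. The reasoning is that very large offsets, such as `mxnet::Storage::Get() + 3393457`, often mean the symbolizer fell back to the nearest exported symbol instead of the real function. The names `parse_frames` and `plausible_frames` and the cutoff `MAX_PLAUSIBLE_OFFSET` are hypothetical and chosen only for illustration.

```python
import re
from typing import List, NamedTuple

# Matches frames such as:
#   [bt] (7) 8   libmxnet.so   0x000000011abfae61 mxnet::io::ImdecodeImpl(...) + 3073
FRAME_RE = re.compile(
    r"^\s*\[bt\]\s+\((\d+)\)\s+\d+\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(.+) \+ (\d+)\s*$"
)

# Assumption: an arbitrary cutoff in bytes. Frames with a larger offset are
# treated as probably mis-symbolized. This is a heuristic, not a rule.
MAX_PLAUSIBLE_OFFSET = 4096


class Frame(NamedTuple):
    index: int
    module: str
    address: int
    symbol: str
    offset: int


def parse_frames(trace: str) -> List[Frame]:
    """Extract every `[bt]` frame from a pasted stack trace."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            index, module, address, symbol, offset = match.groups()
            frames.append(
                Frame(int(index), module, int(address, 16), symbol.strip(), int(offset))
            )
    return frames


def plausible_frames(frames: List[Frame]) -> List[Frame]:
    """Keep frames whose offset is small enough to trust the symbol name."""
    return [f for f in frames if f.offset <= MAX_PLAUSIBLE_OFFSET]


# Two frames copied from the trace above.
example = """
  [bt] (3) 4   libmxnet.so                         0x000000011b5d5751 mxnet::Storage::Get() + 3393457
  [bt] (7) 8   libmxnet.so                         0x000000011abfae61 mxnet::io::ImdecodeImpl(int, bool, void*, unsigned long, mxnet::NDArray*) + 3073
"""
for frame in plausible_frames(parse_frames(example)):
    print(frame.index, frame.module, frame.symbol)
```

Under this heuristic, the frame that survives inside `libmxnet.so` is `mxnet::io::ImdecodeImpl`, which suggests (but does not prove) that the crash happens during image decoding.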

To Reproduce

Steps to reproduce

  1. Install GluonCV: pip install gluoncv
  2. Run the tutorial: https://gluon-cv.mxnet.io/build/examples_classification/transfer_learning_minc.html
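
Because the trace points at `mxnet::io::ImdecodeImpl`, a smaller reproduction than the full tutorial may be possible. The following is a sketch only and was not part of the original report. It runs a decode call in a child interpreter so that a segmentation fault does not kill the calling session. It assumes MXNet is installed and that `mx.image.imdecode` reaches the same decoding path as the tutorial's data loader. The helper names `run_isolated` and `describe_exit` and the image path in the usage note are hypothetical.

```python
import signal
import subprocess
import sys

# Code executed in the child process. Assumption: sys.argv[1] is the path to
# any JPEG file, for example one image from the tutorial's MINC subset.
DECODE_SNIPPET = """
import sys
import mxnet as mx

with open(sys.argv[1], "rb") as f:
    img = mx.image.imdecode(f.read())
print(img.shape)
"""


def run_isolated(code, *args, timeout=120):
    """Run `code` in a fresh Python interpreter and return its exit status.

    On POSIX systems a negative value -N means the child was killed by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable message."""
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"
```

A possible invocation is `print(describe_exit(run_isolated(DECODE_SNIPPET, "/path/to/image.jpg")))`. If the decode path is indeed the culprit, the child would be expected to die from a signal rather than exit normally. A clean exit would point toward another part of the tutorial, such as the multi-worker data loader.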

What have you tried to solve it?

  1. Installing different MXNet versions

Environment

We recommend using our script for collecting the diagnostic information. Run the following command and paste the output below:

curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python

----------Python Info----------
Version      : 3.7.5
Compiler     : Clang 4.0.1 (tags/RELEASE_401/final)
Build        : ('default', 'Oct 25 2019 10:52:18')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 19.3.1
Directory    : /Users/hzaws/anaconda3/envs/autogluon/lib/python3.7/site-packages/pip
----------MXNet Info-----------
Version      : 1.6.0
Directory    : /Users/hzaws/anaconda3/envs/autogluon/lib/python3.7/site-packages/mxnet
Num GPUs     : 0
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform     : Darwin-17.7.0-x86_64-i386-64bit
system       : Darwin
node         : 88e9fe5636b1.ant.amazon.com
release      : 17.7.0
version      : Darwin Kernel Version 17.7.0: Fri Oct  4 23:08:59 PDT 2019; root:xnu-4570.71.57~1/RELEASE_X86_64
----------Hardware Info----------
machine      : x86_64
processor    : i386
b'machdep.cpu.brand_string: Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz'
b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
b'machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 HLE AVX2 SMEP BMI2 ERMS INVPCID RTM FPU_CSDS MPX RDSEED ADX SMAP CLFSOPT IPT MD_CLEAR TSXFA IBRS STIBP L1DF SSBD'
b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI'
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0339 sec, LOAD: 0.6320 sec.
Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0007 sec, LOAD: 0.5793 sec.
Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.0010 sec, LOAD: 0.0588 sec.
Timing for D2L: http://d2l.ai, DNS: 0.0010 sec, LOAD: 0.0565 sec.
Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0010 sec, LOAD: 0.0564 sec.
Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0010 sec, LOAD: 0.3162 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0327 sec, LOAD: 0.5569 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0006 sec, LOAD: 0.0813 sec.

zhanghang1989 commented 4 years ago

Related GluonCV issue https://github.com/dmlc/gluon-cv/issues/1069

zhanghang1989 commented 4 years ago

Related issue: https://github.com/apache/incubator-mxnet/issues/16051#issue-487660150

samskalicky commented 4 years ago

@apeforest assign @Jerryzcn