apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Segmentation fault:11 with Python 2.7.11 OSX #4661

Closed shadowinlife closed 5 years ago

shadowinlife commented 7 years ago

For bugs or installation issues, please provide the following information. The more information you provide, the more likely people will be able to help you.

Environment info

Operating System: OS X 10.11.6

Compiler: default compiler from the build tutorial

Package used (Python/R/Scala/Julia): Python

MXNet version installed from source: 0.9.1

MXNet commit hash (git rev-parse HEAD): 5656c8601265a437c1cb7ea18a6f1661f346c8b5

Python version and distribution: 2.7.11

Error Message:

Please paste the full error message, including stack trace.

System Integrity Protection: enabled

Crashed Thread:        8

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x000000047d746000

VM Regions Near 0x47d746000:
    VM_ALLOCATE            000000047b746000-000000047d746000 [ 32.0M] rw-/rwx SM=PRV  
--> VM_ALLOCATE            000000047d746000-000000047d747000 [    4K] rw-/rwx SM=ALI  
    STACK GUARD            0000700000000000-0000700000001000 [    4K] ---/rwx SM=NUL  stack guard for thread 1

Thread 0:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x00007fff8edf9db6 __psynch_cvwait + 10
1   libsystem_pthread.dylib         0x00007fff8f04b728 _pthread_cond_wait + 767
2   libc++.1.dylib                  0x00007fff951af68f std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 47
3   libmxnet.so                     0x0000000110e6a7d3 mxnet::engine::ThreadedEngine::WaitForVar(mxnet::engine::Var*) + 563
4   libmxnet.so                     0x0000000110ed86e7 mxnet::NDArray::SyncCopyToCPU(void*, unsigned long) const + 1111
5   libmxnet.so                     0x0000000110e3769d MXNDArraySyncCopyToCPU + 13
6   _ctypes.so                      0x00000001100bf7f7 ffi_call_unix64 + 79
7   _ctypes.so                      0x00000001100c0030 ffi_call + 840
8   _ctypes.so                      0x00000001100bb62a _ctypes_callproc + 591
9   _ctypes.so                      0x00000001100b5cad PyCFuncPtr_call + 1054
10  org.python.python               0x000000010f91c9b4 PyObject_Call + 99
11  org.python.python               0x000000010f99be30 PyEval_EvalFrameEx + 26668
12  org.python.python               0x000000010f99540e PyEval_EvalCodeEx + 1617
13  org.python.python               0x000000010f99fc9e fast_function + 117
14  org.python.python               0x000000010f99bf09 PyEval_EvalFrameEx + 26885
15  org.python.python               0x000000010f99fd31 fast_function + 264
16  org.python.python               0x000000010f99bf09 PyEval_EvalFrameEx + 26885
17  org.python.python               0x000000010f99540e PyEval_EvalCodeEx + 1617
18  org.python.python               0x000000010f994db7 PyEval_EvalCode + 48
19  org.python.python               0x000000010f9b883c run_mod + 53
20  org.python.python               0x000000010f9b88df PyRun_FileExFlags + 133
21  org.python.python               0x000000010f9b8430 PyRun_SimpleFileExFlags + 702
22  org.python.python               0x000000010f9c9d9e Py_Main + 3094
23  libdyld.dylib                   0x00007fff932965ad start + 1

Minimum reproducible example

None; just run the example in the source code.

Steps to reproduce

  1. Install mxnet with setup in the official document: http://mxnet.io/get_started/osx_setup.html#install-mxnet-for-python

  2. Download the image segmentation pre-trained model from Baidu Yun

    FCN8S_VGG16_0019

  3. Run example image_segmentation.py

    Segmentation fault: 11

piiswrong commented 7 years ago

I get this error:

[23:57:33] src/nnvm/legacy_json_util.cc:153: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[23:57:51] /Users/junyuanx/work/local/mxnet/dmlc-core/include/dmlc/./logging.h:300: [23:57:51] src/operator/./crop-inl.h:117: Check failed: param_.offset[0] <= data_shape[2]-out_shape[2] (34 vs. 33) offset[0] should be less than the residual space of height

Seems to be a padding shape issue. @tornadomeet could you update it?
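For readers following along, the failed check in the log can be restated as a small sketch. This is only an illustration of the inequality quoted in the message, not MXNet's own code; the function name and the concrete heights are hypothetical, chosen so that the residual space is 33 as in the log.

```python
def crop_offset_ok(offset, data_extent, out_extent):
    """Return True when a crop of out_extent starting at offset fits in data_extent.

    Mirrors the inequality from the log: offset <= data_extent - out_extent.
    """
    return offset <= data_extent - out_extent


# Hypothetical heights: only their difference (33) comes from the log.
data_h, out_h = 533, 500

print(crop_offset_ok(34, data_h, out_h))  # False: 34 exceeds the residual space of 33
print(crop_offset_ok(33, data_h, out_h))  # True: 33 is the largest offset that fits
```

Under this reading, any offset up to 33 would pass the check for these shapes, while the stored value of 34 does not.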

piiswrong commented 7 years ago

@howard0su

shadowinlife commented 7 years ago

Thanks. Where can I download a compiled build of MXNet 0.8.0 so that I can test this pre-trained model?

yajiedesign commented 6 years ago

This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!

huyangc commented 6 years ago

Same error here.


I fixed it by changing the crop offset of the last crop layer from 34 to 31 (according to the author's Caffe GitHub repository). I am still a little confused about the crop offset.
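A rough sketch of how such an offset change could be applied to a saved symbol file, rather than by hand. It assumes the symbol JSON holds a "nodes" list whose Crop entries keep their parameters as strings under "param" (or "attr"/"attrs" in other versions); that layout, the helper name, and the node name "score_crop" are assumptions for illustration, so check them against the actual file first.

```python
import json


def set_crop_offset(symbol_json, node_name, new_offset):
    """Return symbol JSON text with the named Crop node's offset replaced."""
    graph = json.loads(symbol_json)
    for node in graph["nodes"]:
        if node.get("op") == "Crop" and node.get("name") == node_name:
            # Assumed layout: parameters sit under one of these keys.
            for key in ("param", "attr", "attrs"):
                if key in node and "offset" in node[key]:
                    node[key]["offset"] = "(%d, %d)" % new_offset
    return json.dumps(graph)


# Hypothetical one-node graph standing in for the real symbol file.
sample = json.dumps({"nodes": [{"op": "Crop", "name": "score_crop",
                                "param": {"offset": "(34, 34)"}}]})

patched = json.loads(set_crop_offset(sample, "score_crop", (31, 31)))
print(patched["nodes"][0]["param"]["offset"])  # (31, 31)
```

Working on the JSON text keeps the change independent of the installed MXNet version; the patched text would still need to be written back to the symbol file before loading the model.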

szha commented 6 years ago

@apache/mxnet-committers: This issue has been inactive for the past 90 days. It has no label and needs triage.

For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.

vandanavk commented 5 years ago

Issue not reproducible with MXNet 1.3.1, Python 2.7 and Python 3.6. Verified on DL AMI with Ubuntu.

@shadowinlife Can this issue be closed?

cc @sandeep-krishnamurthy

zhreshold commented 5 years ago

@vandanavk The original issue was reported on "Operating System: OS X 10.11.6", but you are testing on Ubuntu.

vandanavk commented 5 years ago

Update: Also tested on macOS 10.12.6 with MXNet 1.3.1 and Python 2.7.15. Segmentation fault not observed.

Executed: python2 image_segmentation.py --input VOC2012/JPEGImages/2007_000027.jpg

@zhreshold thanks :) updated

piyushghai commented 5 years ago

@sandeep-krishnamurthy Requesting that this issue be closed, as the segmentation fault is no longer observed.

@shadowinlife Please feel free to reopen if closed in error. Thanks!