Closed davidhewitt closed 4 years ago
@davidhewitt can you follow https://mxnet.apache.org/get_started/build_from_source and verify if the same warnings apply to the libmxnet.so generated when following the guide?
When following the guide, I was able to build a libmxnet.so which did not have these warnings. This was on Ubuntu 20.04 under WSL2, and I had to disable CUDA support.
Thank you for trying that @davidhewitt. The difference between the libmxnet.so in the pip wheel and the one obtained via the build_from_source guide is that in the pip wheel a number of dependencies are statically linked, to make the wheel portable across various Linux distributions and versions thereof. It's built on a CentOS 7 host for compliance with PEP 599 (manylinux2014): https://www.python.org/dev/peps/pep-0599/
Downgrading to a slightly older version of mxnet seems to work:
$ pip install mxnet==1.6.0b20200127
The libmxnet.so file in 1.6.0 is very broken. Not sure why the python3.8 interpreter doesn't crash. It also calls Py_FinalizeEx.
https://github.com/PyO3/pyo3/issues/1044#issuecomment-660469745
@m-ou-se can you provide more details on why the file is broken? Specifically, I'm not sure if the issue at hand is really a build issue or rather a bug in the MXNet backend. Does your statement apply to the latest versions at https://dist.mxnet.io/python/cpu , for example https://repo.mxnet.io/dist/python/cpu/mxnet-2.0.0b20200720-py2.py3-none-manylinux2014_x86_64.whl
cc @eric-haibin-lin as the issue may be related to engine shutdown
Debugging leads us to this line: https://github.com/apache/incubator-mxnet/blob/a0e67353fe81ed97fc7aef2d8429a93dc035a394/src/c_api/c_api.cc#L1318.
Running import mxnet directly in python (and exiting the interpreter) exits normally.
Strangely, on Python 3.8 we get a "corrupted double-linked list" error instead.
I realize that mxnet is absolutely huge, but thought you guys might be interested in knowing about a specific edge-case which causes problems for pyo3. Of course, if you guys are able to provide a bit of insight that'd be super helpful. :) Thank you!
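The behaviour described above (a clean exit under a plain interpreter, an abort in other setups) can be checked by importing the module in a child process and looking at how that process ended. The following is a minimal sketch using only the Python standard library. The helper name import_exit_status is hypothetical and not part of mxnet or PyO3, and the module name is a parameter so the same check can be pointed at any package.

```python
import signal
import subprocess
import sys


def import_exit_status(module, python=sys.executable):
    """Import `module` in a fresh interpreter and describe how it exited.

    Hypothetical helper for illustration; not part of mxnet or PyO3.
    """
    # A child process is used so that a crash during interpreter shutdown
    # cannot take down the process doing the checking.
    proc = subprocess.run(
        [python, "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a
        # signal; the signal number is the absolute value.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = f"signal {-proc.returncode}"
        return f"killed by {name}"
    return f"exit code {proc.returncode}"
```

Calling import_exit_status("mxnet") under different interpreters (via the python argument) is one way to compare the Python versions mentioned in this thread. It only covers the plain-interpreter case, not the embedded PyO3 case.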
The crash is fixed by https://github.com/apache/incubator-mxnet/pull/18768
I'm not sure about the readelf warnings, but currently I don't see any evidence for them being harmful (and we're just using cmake to statically link in a bunch of libraries into the libmxnet.so). So I'll close this issue. Please comment and reopen if any action needs to be taken from the build perspective.
Description
I'm attempting to debug a crash observed by a user of PyO3 (https://github.com/PyO3/pyo3/issues/1044) which occurs when mxnet is imported.
Attempting to use gdb (via the rust-gdb wrapper script) suggests that mxnet.so is partially corrupted. readelf -a also emits some warnings. Both are pasted below.
Error Message
Errors seen from readelf -a path/to/libmxnet.so | grep -i warning:
Errors seen from gdb:
To Reproduce
Run readelf -a path/to/libmxnet.so | grep -i warning.
Alternatively, on request I can write a tutorial on how to install and run the linked Rust code under rust-gdb.
Environment
I'm using Ubuntu 20.04 on WSL2. According to pip, mxnet was installed via the wheel mxnet-1.6.0-py2.py3-none-any.whl.
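The readelf reproduction step above can also be scripted. The sketch below runs readelf -a on a given file and keeps only the lines that mention a warning. It assumes readelf (from binutils) is on PATH, and it merges stderr into stdout on the assumption that readelf reports its warnings on stderr. The function names are hypothetical.

```python
import subprocess
import sys


def filter_warnings(text):
    """Return the lines of `text` that contain 'warning', ignoring case."""
    return [line for line in text.splitlines() if "warning" in line.lower()]


def readelf_warnings(path):
    """Run `readelf -a` on `path` and collect warning lines from both streams."""
    proc = subprocess.run(
        ["readelf", "-a", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout before filtering
        text=True,
        errors="replace",  # tolerate non-UTF-8 bytes in the output
    )
    return filter_warnings(proc.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python readelf_warnings.py path/to/libmxnet.so
    for line in readelf_warnings(sys.argv[1]):
        print(line)
```

Running the same script against the wheel's libmxnet.so and a from-source build would give a direct comparison of the two files discussed in this issue.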