apache / mxnet


horovod cpu build segfault #18013

Open eric-haibin-lin opened 4 years ago

eric-haibin-lin commented 4 years ago

This works:

pip3 install https://repo.mxnet.io/dist/python/cpu/mxnet-2.0.0b20200306-py2.py3-none-manylinux1_x86_64.whl

pip3 uninstall horovod -y; pip3 install horovod --user --no-cache-dir
horovodrun -np 4 python3.6 test.py

test.py:

import horovod.mxnet as hvd
import mxnet as mx

hvd.init()
a = mx.nd.ones((1))
hvd.allreduce_(a)
print(a)

This fails:

pip3 install https://repo.mxnet.io/dist/python/cpu/mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl

eric-haibin-lin commented 4 years ago

@apeforest FYI

eric-haibin-lin commented 4 years ago

@TaoLv @PatricZhao FYI

TaoLv commented 4 years ago

@eric-haibin-lin @apeforest I cannot install mxnet and horovod with the commands shared in the description. Also, with pip install --pre mxnet -f https://dist.mxnet.io/python/cpu, the newest build I can get is https://repo.mxnet.io/dist/python/cpu/mxnet-2.0.0b20200312-py2.py3-none-manylinux1_x86_64.whl.

eric-haibin-lin commented 4 years ago

@TaoLv can you try wget https://repo.mxnet.io/dist/python/cpu/mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl and then pip install mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl

TaoLv commented 4 years ago

> @TaoLv can you try wget https://repo.mxnet.io/dist/python/cpu/mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl and then pip install mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl

Got:

(mxnet) [lvtao@mlt2-clx103 ~]$ pip install mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl
ERROR: mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl is not a supported wheel on this platform.

I'm using CentOS 7.2 and Python 3.6.
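
One possible cause of the "not a supported wheel on this platform" error above is the pip version rather than the OS: pip only recognizes the manylinux2014 platform tag from release 19.3 onward, and CentOS 7 with Python 3.6 often ships an older pip. The thread does not state which pip was in use, so this is an assumption, not a confirmed diagnosis. The sketch below is a minimal, stdlib-only check; the helper names (platform_tag, pip_can_install) are hypothetical and exist only for illustration.

```python
import re

# Minimum pip release that recognizes each manylinux platform tag.
MIN_PIP_FOR_TAG = {
    "manylinux1": (8, 1),
    "manylinux2010": (19, 0),
    "manylinux2014": (19, 3),
}


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '19.3.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def platform_tag(wheel_filename):
    """Extract the platform tag family, e.g. 'manylinux2014', from a wheel filename."""
    stem = wheel_filename[: -len(".whl")]
    platform = stem.split("-")[-1]  # e.g. 'manylinux2014_x86_64'
    return platform.split("_")[0]


def pip_can_install(pip_version, wheel_filename):
    """Return True/False if the tag is known, or None if this sketch cannot tell."""
    minimum = MIN_PIP_FOR_TAG.get(platform_tag(wheel_filename))
    if minimum is None:
        return None
    return parse_major_minor(pip_version) >= minimum


if __name__ == "__main__":
    wheel = "mxnet-2.0.0b20200313-py2.py3-none-manylinux2014_x86_64.whl"
    # '9.0.3' is only an example of an older pip, not a value taken from the thread.
    for pip_version in ("9.0.3", "19.3.1"):
        print(pip_version, pip_can_install(pip_version, wheel))
```

To apply this to a real environment, compare the output of pip --version against the table; if pip is older than 19.3, pip install --upgrade pip before retrying the wheel may resolve the error. This check only covers the platform tag and does not address the segfault reported in the issue description.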