dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0

Build horovod with gloo #1383

Closed leezu closed 3 years ago

leezu commented 3 years ago

Description

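Work around an installation issue with Horovod on CI that appeared after an upstream change to Horovod's build system. The build fails because `mpicxx` is not available on the CI machine: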

[2020-10-05T21:42:25.859Z]     gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -m64 -fPIC -fPIC -std=c++11 -fPIC -O3 -Wall -fassociative-math -ffast-math -ftree-vectorize -funsafe-math-optimizations -mf16c -mavx -mfma -I/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/include/python3.5m -c build/temp.linux-x86_64-3.5/test_compile/test_cpp_flags.cc -o build/temp.linux-x86_64-3.5/test_compile/test_cpp_flags.o

[2020-10-05T21:42:25.859Z]     cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++

[2020-10-05T21:42:25.859Z]     gcc -pthread -shared -L/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib -Wl,-rpath=/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib,--no-as-needed build/temp.linux-x86_64-3.5/test_compile/test_cpp_flags.o -L/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib -o build/temp.linux-x86_64-3.5/test_compile/test_cpp_flags.so

[2020-10-05T21:42:25.859Z]     gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -m64 -fPIC -fPIC -I/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/include/python3.5m -c build/temp.linux-x86_64-3.5/test_compile/test_link_flags.cc -o build/temp.linux-x86_64-3.5/test_compile/test_link_flags.o

[2020-10-05T21:42:25.859Z]     cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++

[2020-10-05T21:42:25.859Z]     gcc -pthread -shared -L/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib -Wl,-rpath=/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib,--no-as-needed -Wl,--version-script=horovod.lds build/temp.linux-x86_64-3.5/test_compile/test_link_flags.o -L/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib -o build/temp.linux-x86_64-3.5/test_compile/test_link_flags.so

[2020-10-05T21:42:25.859Z]     Traceback (most recent call last):

[2020-10-05T21:42:25.859Z]       File "/tmp/pip-install-54uenm_u/horovod/setup.py", line 341, in get_mpi_flags

[2020-10-05T21:42:25.859Z]         shlex.split(show_command), universal_newlines=True).strip()

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 316, in check_output

[2020-10-05T21:42:25.859Z]         **kwargs).stdout

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 383, in run

[2020-10-05T21:42:25.859Z]         with Popen(*popenargs, **kwargs) as process:

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 676, in __init__

[2020-10-05T21:42:25.859Z]         restore_signals, start_new_session)

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 1289, in _execute_child

[2020-10-05T21:42:25.859Z]         raise child_exception_type(errno_num, err_msg)

[2020-10-05T21:42:25.859Z]     FileNotFoundError: [Errno 2] No such file or directory: 'mpicxx'

[2020-10-05T21:42:25.859Z]     

[2020-10-05T21:42:25.859Z]     During handling of the above exception, another exception occurred:

[2020-10-05T21:42:25.859Z]     

[2020-10-05T21:42:25.859Z]     Traceback (most recent call last):

[2020-10-05T21:42:25.859Z]       File "/tmp/pip-install-54uenm_u/horovod/setup.py", line 622, in get_common_options

[2020-10-05T21:42:25.859Z]         mpi_flags = get_mpi_flags()

[2020-10-05T21:42:25.859Z]       File "/tmp/pip-install-54uenm_u/horovod/setup.py", line 354, in get_mpi_flags

[2020-10-05T21:42:25.859Z]         '%s' % (show_command, traceback.format_exc()))

[2020-10-05T21:42:25.859Z]     distutils.errors.DistutilsPlatformError: mpicxx -show failed (see error below), is MPI in $PATH?

[2020-10-05T21:42:25.859Z]     Note: If your version of MPI has a custom command to show compilation flags, please specify it with the HOROVOD_MPICXX_SHOW environment variable.

[2020-10-05T21:42:25.859Z]     

[2020-10-05T21:42:25.859Z]     Traceback (most recent call last):

[2020-10-05T21:42:25.859Z]       File "/tmp/pip-install-54uenm_u/horovod/setup.py", line 341, in get_mpi_flags

[2020-10-05T21:42:25.859Z]         shlex.split(show_command), universal_newlines=True).strip()

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 316, in check_output

[2020-10-05T21:42:25.859Z]         **kwargs).stdout

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 383, in run

[2020-10-05T21:42:25.859Z]         with Popen(*popenargs, **kwargs) as process:

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 676, in __init__

[2020-10-05T21:42:25.859Z]         restore_signals, start_new_session)

[2020-10-05T21:42:25.859Z]       File "/var/lib/jenkins/workspace/gluon-nlp-gpu-py3-master/conda/gpu/py3-master/lib/python3.5/subprocess.py", line 1289, in _execute_child

[2020-10-05T21:42:25.859Z]         raise child_exception_type(errno_num, err_msg)

[2020-10-05T21:42:25.859Z]     FileNotFoundError: [Errno 2] No such file or directory: 'mpicxx'
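
The log above shows the Horovod setup script failing because it runs `mpicxx -show` and no `mpicxx` executable is on `$PATH`. As a minimal sketch of the idea in the PR title (not the actual change in this PR, which is not shown here), the snippet below checks for `mpicxx` and, if it is missing, selects a Gloo-only build through Horovod's `HOROVOD_WITH_GLOO` and `HOROVOD_WITHOUT_MPI` build-time environment variables. The helper name `horovod_build_env` and its `search_path` parameter are hypothetical and exist only for illustration.

```python
import shutil
from typing import Dict, Optional


def horovod_build_env(search_path: Optional[str] = None) -> Dict[str, str]:
    """Pick Horovod build-time environment variables (hypothetical helper).

    If no MPI compiler wrapper is found, request a Gloo-only build so the
    setup script has no reason to call `mpicxx -show`.
    """
    # shutil.which returns None when the executable cannot be found;
    # search_path=None means "use the current PATH".
    has_mpi = shutil.which("mpicxx", path=search_path) is not None
    if has_mpi:
        # Assumption: an MPI build is preferred whenever MPI is installed.
        return {"HOROVOD_WITH_MPI": "1"}
    return {"HOROVOD_WITH_GLOO": "1", "HOROVOD_WITHOUT_MPI": "1"}
```

The returned dictionary could be merged into the environment of the `pip install horovod` step on CI, for example by passing `env={**os.environ, **horovod_build_env()}` to `subprocess.run`. Whether these variables alone are enough depends on the Horovod version being installed.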

mli commented 3 years ago

Job PR-1383/2 is complete. Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1383/2/index.html