spack / spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
https://spack.io

Installation issue: openmpi #16874

Open anderbubble opened 4 years ago

anderbubble commented 4 years ago

Steps to reproduce the issue

$ spack install openmpi@3.1.5 %gcc@8.4.0 fabrics=verbs
==> Error: ProcessError: Command exited with status 2:
    'make' '-j16'

4 errors found in build log:
  >> 5421    /usr/include/infiniband/iba/ib_types.h:42:10: fatal error: complib/cl_types.h: No such file or directory
     5422     #include <complib/cl_types.h>
     5423              ^~~~~~~~~~~~~~~~~~~~
     5424    compilation terminated.
  >> 5425    make[2]: *** [connect/btl_openib_connect_sl.lo] Error 1

Information on your system

[spring2020] [rcops@shas0137 spring2020]$ spack debug report
* **Spack:** 0.14.2-1266-139fb21
* **Python:** 3.6.8
* **Platform:** linux-rhel7-haswell

This package is being installed as part of a Spack environment. That environment can be found at

https://github.com/ResearchComputing/core-software/tree/sprint-ending-2020-05-11/spack/environments/spring2020

Notably, there is a packages.yaml file here:

https://github.com/ResearchComputing/core-software/blob/sprint-ending-2020-05-11/spack/environments/spring2020/packages.yaml

For this build I am using:

packages:
  openmpi:
    variants: fabrics=verbs

Additional information

@hppritcha

General information

anderbubble commented 4 years ago

In file included from connect/btl_openib_connect_sl.c:22:
/usr/include/infiniband/iba/ib_types.h:42:10: fatal error: complib/cl_types.h: No such file or directory
 #include <complib/cl_types.h>
          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [connect/btl_openib_connect_sl.lo] Error 1
make[2]: Leaving directory `/scratch/local/rcops/spack-stage-openmpi-3.1.5-tl67pz3hcxntn6dmoyuqo7xcfqtap4xo/spack-src/opal/mca/btl/openib'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/scratch/local/rcops/spack-stage-openmpi-3.1.5-tl67pz3hcxntn6dmoyuqo7xcfqtap4xo/spack-src/opal'
make: *** [all-recursive] Error 1

anderbubble commented 4 years ago

[spring2020] [rcops@shas0137 spring2020]$ find / -xdev -type f -name 'cl_types.h' 2>/dev/null
/usr/include/infiniband/complib/cl_types.h
[spring2020] [rcops@shas0137 spring2020]$ rpm -qf /usr/include/infiniband/complib/cl_types.h
opensm-devel-3.3.19-1.el7.x86_64

anderbubble commented 4 years ago

In opal/mca/btl/openib/connect/btl_openib_connect_sl.c:

#include <infiniband/iba/ib_types.h>

This picks up /usr/include/infiniband/iba/ib_types.h, which implies that /usr/include is on the include path.

The file it can't find is at /usr/include/infiniband/complib/cl_types.h, but /usr/include/infiniband/iba/ib_types.h includes it with the line

#include <complib/cl_types.h>

So it's expecting /usr/include/infiniband to be on the include path.

I don't know the right place to define that, though (assuming that even is the right thing).
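
The include-path reasoning above can be sketched as a small shell script. This is only an illustration: `resolve_include` is a hypothetical helper, the temporary tree stands in for /usr/include, and a real compiler's search rules are more involved than a simple first-match loop.

```shell
#!/bin/sh
# Sketch: mimic how an angle-bracket #include is looked up in a list of
# search directories. A temporary tree stands in for /usr/include.

# resolve_include HEADER DIR... prints the first DIR/HEADER that exists
# and returns non-zero if no directory contains the header.
resolve_include() {
    header=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            printf '%s\n' "$dir/$header"
            return 0
        fi
    done
    return 1
}

# Recreate the layout reported in this thread (remove "$root" when done).
root=$(mktemp -d)
mkdir -p "$root/infiniband/iba" "$root/infiniband/complib"
: > "$root/infiniband/iba/ib_types.h"
: > "$root/infiniband/complib/cl_types.h"

# With only the top-level directory searched, the first include resolves...
resolve_include infiniband/iba/ib_types.h "$root"
# ...but the include inside ib_types.h does not.
resolve_include complib/cl_types.h "$root" || echo "complib/cl_types.h: not found"
# Adding the infiniband subdirectory to the search list makes it resolve.
resolve_include complib/cl_types.h "$root" "$root/infiniband"
```

The last call corresponds to passing an extra `-I` directory to the compiler, which is the assumption behind expecting /usr/include/infiniband on the include path.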

hppritcha commented 4 years ago

@anderbubble I'm having problems reproducing this, and I know why. When I build in Spack using your options, the verbs configure output is different from yours. Mine:

-- MCA component common:verbs (m4 configuration macro)
checking for MCA component common:verbs compile mode... static
checking if want to add padding to the openib BTL control header... no
checking for fcntl.h... (cached) yes
checking sys/poll.h usability... yes
checking sys/poll.h presence... yes
checking for sys/poll.h... yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
checking for library containing ibv_open_device... -libverbs
checking if libibverbs requires libnl v1 or v3... v3
checking number of arguments to ibv_create_cq... 5
checking whether IBV_EVENT_CLIENT_REREGISTER is declared... yes
checking whether IBV_ACCESS_SO is declared... no
checking whether IBV_ATOMIC_HCA is declared... yes
checking for ibv_get_device_list... yes
checking for ibv_resize_cq... yes
checking for struct ibv_device.transport_type... yes
checking infiniband/complib/cl_types_osd.h usability... no
checking infiniband/complib/cl_types_osd.h presence... no
checking for infiniband/complib/cl_types_osd.h... no
checking if can use dynamic SL support... no
checking whether IBV_LINK_LAYER_ETHERNET is declared... yes
checking if RDMAoE support is enabled... yes
checking for infiniband/driver.h... no
checking if ConnectX XRC support is enabled... no
checking if ConnectIB XRC support is enabled... no
checking if dynamic SL is enabled... no
checking if MCA component common:verbs can compile... yes

Yours:

--- MCA component common:verbs (m4 configuration macro)
checking for MCA component common:verbs compile mode... static
checking if want to add padding to the openib BTL control header... no
checking for fcntl.h... (cached) yes
checking sys/poll.h usability... yes
checking sys/poll.h presence... yes
checking for sys/poll.h... yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
checking for library containing ibv_open_device... -libverbs
checking if libibverbs requires libnl v1 or v3... v3
checking number of arguments to ibv_create_cq... 5
checking whether IBV_EVENT_CLIENT_REREGISTER is declared... yes
checking whether IBV_ACCESS_SO is declared... no
checking whether IBV_ATOMIC_HCA is declared... yes
checking for ibv_get_device_list... yes
checking for ibv_resize_cq... yes
checking for struct ibv_device.transport_type... yes
checking infiniband/complib/cl_types_osd.h usability... yes
checking infiniband/complib/cl_types_osd.h presence... yes
checking for infiniband/complib/cl_types_osd.h... yes
checking for cl_map_init in -losmcomp... yes
checking if can use dynamic SL support... yes
checking whether IBV_LINK_LAYER_ETHERNET is declared... yes
checking if RDMAoE support is enabled... yes
checking for infiniband/driver.h... no
checking if ConnectX XRC support is enabled... no
checking if ConnectIB XRC support is enabled... no
checking if dynamic SL is enabled... yes
checking if MCA component common:verbs can compile... yes

I wonder if your configure run is picking up a mix of pieces installed by Spack and verbs support that is already installed on your system outside of Spack.
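
To make the difference between the two configure runs easier to spot, the outputs can be compared mechanically. The sketch below is an illustration only: the file names are hypothetical, and each file holds just a few lines copied from the two outputs quoted above rather than the full logs.

```shell
#!/bin/sh
# Sketch: list only the configure checks whose results differ between runs.
# In practice each file would be extracted from that build's configure output.

work=$(mktemp -d)

cat > "$work/hppritcha.txt" <<'EOF'
checking for struct ibv_device.transport_type... yes
checking for infiniband/complib/cl_types_osd.h... no
checking if can use dynamic SL support... no
checking if dynamic SL is enabled... no
EOF

cat > "$work/anderbubble.txt" <<'EOF'
checking for struct ibv_device.transport_type... yes
checking for infiniband/complib/cl_types_osd.h... yes
checking for cl_map_init in -losmcomp... yes
checking if can use dynamic SL support... yes
checking if dynamic SL is enabled... yes
EOF

# Keep only the "<" and ">" lines of the diff. grep exits non-zero when
# nothing differs, so "|| true" keeps the script going in that case.
changed=$(diff "$work/hppritcha.txt" "$work/anderbubble.txt" | grep '^[<>]' || true)
printf '%s\n' "$changed"
```

Lines prefixed with `<` come from the first file and lines prefixed with `>` from the second, so the opensm-related checks (cl_types_osd.h, -losmcomp, dynamic SL) stand out while the shared verbs checks drop away.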

Rashadx86 commented 3 years ago

I'm encountering the exact same issue, with InfiniBand drivers installed on the host OS. I can test or provide logs as needed.

smarocchi commented 3 years ago

Same problem with Spack 16.0 in a Singularity container (CentOS 7) on the ppc64le architecture:

In file included from connect/btl_openib_connect_sl.c:22:
/usr/include/infiniband/iba/ib_types.h:41:10: fatal error: complib/cl_types.h: No such file or directory
 #include <complib/cl_types.h>
          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.

But the file is present:

Singularity> find / -name cl_types.h
/usr/include/infiniband/complib/cl_types.h

Rashadx86 commented 3 years ago

@smarocchi I was able to get past it by defining verbs=/usr/include/infiniband and omitting fabrics=verbs.

smarocchi commented 3 years ago

Dear @Rashadx86, thank you! Unfortunately, I have not understood where I have to define verbs=/usr/include/infiniband while omitting fabrics=verbs.

Here is what I am trying to do:

#############################

Singularity> spack spec -lI openmpi@4.0.3 %gcc@8.4.0 +cuda +pmi +legacylaunchers schedulers=slurm fabrics=ucx,ofi,cma,mxm,hcoll verbs=/usr/include/infiniband ^slurm@20-02-4-1 +pmix ^ucx@1.7.0 +cuda cuda_arch=70 ^python@3.8.2 ^cuda@10.2.89
Input spec
--------------------------------
 -   openmpi@4.0.3%gcc@8.4.0+cuda+legacylaunchers+pmi fabrics=cma,hcoll,mxm,ofi,ucx schedulers=slurm verbs=/usr/include/infiniband
 -       ^cuda@10.2.89
 -       ^python@3.8.2
 -       ^slurm@20-02-4-1+pmix
 -       ^ucx@1.7.0+cuda cuda_arch=70

Concretized
--------------------------------
==> Error: trying to set variant "verbs" in package "openmpi", but the package has no such variant [happened during concretization of openmpi@4.0.3%gcc@8.4.0+cuda+legacylaunchers+pmi fabrics=cma,hcoll,mxm,ofi,ucx schedulers=slurm verbs=/usr/include/infiniband]

#############################

This command works when verbs appears neither in fabrics= nor as verbs=.

Thanks again for your help!

samcom12 commented 3 years ago

I'm getting the same error on ARMv8. Following for an answer.

akail commented 3 years ago

I am also running into this exact same issue.

Steps to reproduce the issue

spack install --overwrite openmpi@3.1.6 schedulers=slurm fabrics=verbs +pmi

==> Error: ProcessError: Command exited with status 2:
     5650    In file included from connect/btl_openib_connect_sl.c:22:
  >> 5651    /usr/include/infiniband/iba/ib_types.h:42:10: fatal error: complib/cl_types.h: No such file or directory
     5652       42 | #include <complib/cl_types.h>

Information on your system

* **Spack:** 0.16.0-410-eca1dd873
* **Python:** 3.6.8
* **Platform:** linux-centos8-skylake_avx512
* **Concretizer:** original

Additional Information

novakmcfd commented 1 year ago

The issue is that opensm is installed on the system, while Spack provides only rdma-core. The Open MPI configure step picks up opensm from the system while taking the rest of the InfiniBand content from Spack's rdma-core, which causes the build to fail.

I would suggest adding opensm as a Spack package and making it a dependency of openmpi when fabrics=verbs.

alberto-scolari commented 9 months ago

Hello everybody, I had the same problem and came across this issue. The workaround I found was to add dedicated cflags to the installation command:

spack install <other options...> openmpi cflags=-I/usr/include/infiniband <dependencies...>

It worked for me; I hope it helps here too, @smarocchi.
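
As a rough sketch of applying this workaround only where it is relevant, the script below prints the extra cflags when the opensm header reported in this thread is present. The `suggest_cflags` helper and the `INCLUDE_ROOT` variable are hypothetical names used for illustration, and whether the flag is appropriate for a given system is an assumption to verify.

```shell
#!/bin/sh
# Sketch: emit the cflags workaround only if the opensm header exists.

# suggest_cflags [INCLUDE_ROOT] prints "cflags=-I<root>/infiniband" when
# <root>/infiniband/complib/cl_types.h exists, and prints nothing otherwise.
suggest_cflags() {
    include_root=${1:-/usr/include}
    if [ -f "$include_root/infiniband/complib/cl_types.h" ]; then
        printf 'cflags=-I%s/infiniband\n' "$include_root"
    fi
}

# INCLUDE_ROOT is a hypothetical override; it defaults to /usr/include.
suggest_cflags "${INCLUDE_ROOT:-/usr/include}"
```

On a system laid out like the ones reported here, this prints cflags=-I/usr/include/infiniband, which is the value used in the install command above.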

simarocchi commented 9 months ago

Thanks @alberto-scolari. I have not been working with Spack recently, but I will pass this information on to my former colleagues.

Cheers, S.