open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

4.0.0 does not build with very long paths #6106

Closed ax3l closed 5 years ago

ax3l commented 5 years ago

Background information

What version of Open MPI are you using?

4.0.0

Describe how Open MPI was installed

Trying to install via spack (3557b6e14903c02bcc0a19b34c8b4c64f3d05ca3) from source with GCC/GFortran 4.9.2:

spack install -v openmpi %gcc@4.9.2

Please describe the system on which you are running

Dependencies:

$ spack spec openmpi %gcc@4.9.2
Input spec
--------------------------------
openmpi%gcc@4.9.2

Concretized
--------------------------------
openmpi@4.0.0%gcc@4.9.2~cuda+cxx_exceptions fabrics= ~java~legacylaunchers~memchecker~pmi schedulers= ~sqlite3~thread_multiple+vt arch=linux-debian9-x86_64 
    ^hwloc@1.11.9%gcc@4.9.2~cairo~cuda+libxml2+pci+shared arch=linux-debian9-x86_64 
        ^libpciaccess@0.13.5%gcc@4.9.2 arch=linux-debian9-x86_64 
            ^libtool@2.4.6%gcc@4.9.2 arch=linux-debian9-x86_64 
                ^m4@1.4.18%gcc@4.9.2 patches=3877ab548f88597ab2327a2230ee048d2d07ace1062efe81fc92e91b7f39cd00,c0a408fbffb7255fcc75e26bd8edab116fc81d216bfd18b473668b7739a4158e,fc9b61654a3ba1a8d6cd78ce087e7c96366c290bc8d2c299f09828d793b853c8 +sigsegv arch=linux-debian9-x86_64 
                    ^libsigsegv@2.11%gcc@4.9.2 arch=linux-debian9-x86_64 
            ^pkgconf@1.5.4%gcc@4.9.2 arch=linux-debian9-x86_64 
            ^util-macros@1.19.1%gcc@4.9.2 arch=linux-debian9-x86_64 
        ^libxml2@2.9.8%gcc@4.9.2~python arch=linux-debian9-x86_64 
            ^xz@5.2.4%gcc@4.9.2 arch=linux-debian9-x86_64 
            ^zlib@1.2.11%gcc@4.9.2+optimize+pic+shared arch=linux-debian9-x86_64 
        ^numactl@2.0.11%gcc@4.9.2 patches=592f30f7f5f757dfc239ad0ffd39a9a048487ad803c26b419e0f96b8cda08c1a arch=linux-debian9-x86_64 
            ^autoconf@2.69%gcc@4.9.2 arch=linux-debian9-x86_64 
                ^perl@5.26.2%gcc@4.9.2+cpanm patches=0eac10ed90aeb0459ad8851f88081d439a4e41978e586ec743069e8b059370ac +shared+threads arch=linux-debian9-x86_64 
                    ^gdbm@1.18.1%gcc@4.9.2 arch=linux-debian9-x86_64 
                        ^readline@7.0%gcc@4.9.2 arch=linux-debian9-x86_64 
                            ^ncurses@6.1%gcc@4.9.2~symlinks~termlib arch=linux-debian9-x86_64 
            ^automake@1.16.1%gcc@4.9.2 arch=linux-debian9-x86_64

Details of the problem

The build, configured with:

configure \
  --prefix=/home/axel/src/spack/opt/spack/linux-debian9-x86_64/gcc-4.9.2/openmpi-4.0.0-ayqox3e57xqu5722p2ond4yrp2qkka2f \
  --enable-shared --with-wrapper-ldflags= \
  --enable-static --without-pmi --enable-mpi-cxx \
  --with-zlib=/home/axel/src/spack/opt/spack/linux-debian9-x86_64/gcc-4.9.2/zlib-1.2.11-hakzmqzqipebyjhek2v62dvvshwxaruw \
  --without-psm --without-psm2 --without-verbs \
  --without-mxm --without-ucx --without-libfabric \
  --without-alps --without-lsf --without-tm \
  --without-slurm --without-sge --without-loadleveler \
  --disable-memchecker \
  --with-hwloc=/home/axel/src/spack/opt/spack/linux-debian9-x86_64/gcc-4.9.2/hwloc-1.11.9-qvx3bc3megmbngndb4jmul55bxkc4tvw \
  --disable-java --disable-mpi-java --without-cuda --enable-cxx-exceptions

fails in make -j 4 with:

make[2]: Entering directory '/tmp/axel/spack-stage/spack-stage-qen9pq9b/openmpi-4.0.0/ompi/mpi/fortran/mpiext-use-mpi'
  PPFC     mpi-ext-module.lo
mpi-ext-module.F90:29.6:

      include '/home/axel/src/spack/var/spack/stage/openmpi-4.0.0-ayqox3e57xqu5722p2ond4yrp2qkka2f/openmpi-4.0.0/ompi/mpiext/pcollre
      1
Error: Unclassifiable statement at (1)
Makefile:1779: recipe for target 'mpi-ext-module.lo' failed
make[2]: *** [mpi-ext-module.lo] Error 1
make[2]: Leaving directory '/tmp/axel/spack-stage/spack-stage-qen9pq9b/openmpi-4.0.0/ompi/mpi/fortran/mpiext-use-mpi'
Makefile:3521: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/tmp/axel/spack-stage/spack-stage-qen9pq9b/openmpi-4.0.0/ompi'
Makefile:1896: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1

Full build output: spack-build.out.gz

ggouaillardet commented 5 years ago

Thanks for the report!

As a workaround, you can run configure with --enable-mpi-ext=affinity,cuda ....

At first glance, the root cause is that this Fortran line is too long, and I suspect this is because you invoke /full_path/configure (and I guess spack will not change that in the very near future, which is perfectly acceptable to me).

Anyway, I will explore some options on how to avoid this issue.
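
To illustrate why the path length matters, here is a minimal sketch, not the actual build logic. It assumes the compiler applies the usual free-form Fortran limit of 132 characters per source line, which is consistent with where the include line is cut off in the log above. The header path `some_ext/some_header.h` and the helper names are hypothetical; only the source directory is taken from the report.

```python
# Default maximum line length for free-form Fortran source (assumed to apply here).
FREE_FORM_LIMIT = 132


def include_line(path: str, indent: int = 6) -> str:
    """Build a Fortran include statement the way it appears in the log."""
    return " " * indent + f"include '{path}'"


def fits(line: str, limit: int = FREE_FORM_LIMIT) -> bool:
    """Return True if the line stays within the free-form length limit."""
    return len(line) <= limit


# Source directory as shown in the build log of this report.
srcdir = (
    "/home/axel/src/spack/var/spack/stage/"
    "openmpi-4.0.0-ayqox3e57xqu5722p2ond4yrp2qkka2f/openmpi-4.0.0"
)
# Hypothetical header name, used only for illustration.
header = "ompi/mpiext/some_ext/some_header.h"

absolute = include_line(f"{srcdir}/{header}")
relative = include_line(header)

print(len(absolute), fits(absolute))  # absolute path: over the limit
print(len(relative), fits(relative))  # relative path: well within the limit
```

When the line is over the limit, the closing quote of the include string is lost, which would explain the "Unclassifiable statement" error. A shorter or relative include path avoids the problem regardless of where the source tree lives.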

ggouaillardet commented 5 years ago

@hppritcha @bwbarrett FYI, we might want to raise the severity of this issue and release 4.0.1 earlier than expected. (I am still unclear on how we should deal with the oshmem situation that was reported on the mailing list.)

jsquyres commented 5 years ago

@ggouaillardet It's pretty common for us to release an x.y.1 version pretty soon after x.y.0, because inevitably people discover things in .0 that we missed in testing. 🙁

jsquyres commented 5 years ago

@ggouaillardet I merged #6109 into master. I don't think that this was technically a regression (i.e., long paths would have caused the same problem for quite a while), but the new extensions certainly did shine a light on the issue. So we should definitely get this fix into v4.0.1.

ggouaillardet commented 5 years ago

@jsquyres This is not a regression (the issue has always been there), and we only hit it now because pcollreq is the first MPI extension that has Fortran subroutines. I made #6121 for the v4.0.x branch.

jsquyres commented 5 years ago

I just updated the title of this issue to reflect that it's not the compiler that is the issue -- it's the very-long-path that is the issue.

jsquyres commented 5 years ago

I think this is now resolved. Closing.