spack / spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
https://spack.io

NFS: spack install fails if installation directory is not world-writable #21454

Open odoublewen opened 3 years ago

odoublewen commented 3 years ago

`spack install` fails when the Spack installation directory lives on our cluster's NFS and is not world-writable. The error occurs when Spack tries to set directory permissions immediately after creating a directory.

Some NFS systems exhibit latency on directory/file creation, and initially I thought that might explain the phenomenon. But now I'm not so sure.

Installs work fine on this same system if I use a local hard drive.

Steps to reproduce the issue

$ ./spack/bin/spack -d -L install -v libuuid
...

(I picked libuuid only because it is small and has no dependencies. The error happens for ANY package.)

I discovered that a workaround is to do a `chmod -R a+w spack` -- but that shouldn't be necessary, should it?
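
One possible explanation for why the workaround helps, judging only from the function name `chgrp_if_not_world_writable` in the traceback below: the group change may be skipped entirely when the directory is already world-writable, so the failing `os.chown` call is never reached. The following is a minimal sketch of that kind of check, not Spack's actual code; the names `is_world_writable` and `chgrp_unless_world_writable` are hypothetical.

```python
import os
import stat


def is_world_writable(path):
    """Return True if the 'other' write bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def chgrp_unless_world_writable(path, gid):
    """Change the group of path unless it is world-writable.

    Returns True if os.chown was called, False if it was skipped.
    Assumption: this mirrors what the function name in the traceback
    suggests, not the real implementation.
    """
    if is_world_writable(path):
        return False
    os.chown(path, -1, gid)  # -1 leaves the owner unchanged
    return True
```

If this guess is right, `chmod -R a+w` hides the problem rather than fixing it, which would be consistent with the workaround feeling unnecessary.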

Error Message

$ ./spack/bin/spack -d -L install -v libuuid
==> [2021-02-02-22:09:02.249364] Imported install from built-in commands
==> [2021-02-02-22:09:02.262221] Reading config file /net/ifs/home/jdoe/spacktest/spack/etc/spack/defaults/config.yaml
==> [2021-02-02-22:09:02.320362] Imported install from built-in commands
==> [2021-02-02-22:09:02.322635] Reading config file /net/ifs/home/jdoe/spacktest/spack/etc/spack/defaults/repos.yaml
==> [2021-02-02-22:09:05.596419] Reading config file /net/ifs/home/jdoe/spacktest/spack/etc/spack/defaults/packages.yaml
==> [2021-02-02-22:09:05.664260] Reading config file /locus/home/osolberg/.spack/linux/compilers.yaml
==> [2021-02-02-22:09:05.718041] DATABASE LOCK TIMEOUT: 3s
==> [2021-02-02-22:09:05.718194] PACKAGE LOCK TIMEOUT: No timeout
==> [2021-02-02-22:09:05.726703] Initializing the build queue from the build requests
==> [2021-02-02-22:09:05.726809] Initializing the build queue for libuuid
==> [2021-02-02-22:09:05.728262] Processing dependencies for libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr: ('build', 'link', 'run')
==> [2021-02-02-22:09:05.756080] Removing failure marking for libuuid
==> [2021-02-02-22:09:05.764273] Pkg id libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr has the following dependents:
==> [2021-02-02-22:09:05.764745] Ensure all dependencies know all dependents across specs
==> [2021-02-02-22:09:05.765746] Acquiring a write lock on libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr with timeout 1e-09
==> [2021-02-02-22:09:05.774534] Creating stage lock spack-stage-libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr
==> [2021-02-02-22:09:05.778443] Installing libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr
==> [2021-02-02-22:09:05.778533] Searching for binary cache of libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr
==> [2021-02-02-22:09:05.778680] Reading config file /net/ifs/home/jdoe/spacktest/spack/etc/spack/defaults/mirrors.yaml
==> [2021-02-02-22:09:05.842581] Did not find linux-ubuntu18.04-sandybridge-gcc-7.5.0-libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr.spec.yaml on https://spack-llnl-mirror.s3-us-west-2.amazonaws.com/build_cache/linux-ubuntu18.04-sandybridge-gcc-7.5.0-libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr.spec.yaml
  Download failed: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)>
==> [2021-02-02-22:09:05.842814] No binary for libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr found: installing from source
==> [2021-02-02-22:09:05.905966] Flagging libuuid-1.0.3-h2smghecjkqvg4xeyrrxs4y5vhmkgjqr as failed: [Errno 22] Invalid argument: '/net/ifs/home/jdoe/spacktest/spack/opt/spack/linux-ubuntu18.04-sandybridge'
==> [2021-02-02-22:09:05.910506] Error: Failed to install libuuid due to OSError: [Errno 22] Invalid argument: '/net/ifs/home/jdoe/spacktest/spack/opt/spack/linux-ubuntu18.04-sandybridge'
Traceback (most recent call last):
  File "./spack/bin/spack", line 68, in <module>
    sys.exit(spack.main.main())
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/main.py", line 762, in main
    return _invoke_command(command, parser, args, unknown)
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/main.py", line 490, in _invoke_command
    return_val = command(parser, args)
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/cmd/install.py", line 371, in install
    install_specs(args, kwargs, zip(abstract_specs, specs))
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/cmd/install.py", line 213, in install_specs
    builder.install()
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/installer.py", line 1534, in install
    self._install_task(task)
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/installer.py", line 1117, in _install_task
    self._setup_install_dir(pkg)
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/installer.py", line 1270, in _setup_install_dir
    spack.store.layout.create_install_directory(pkg.spec)
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/spack/directory_layout.py", line 312, in create_install_directory
    mkdirp(spec.prefix, mode=perms, group=group, default_perms='parents')
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/llnl/util/filesystem.py", line 630, in mkdirp
    raise e
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/llnl/util/filesystem.py", line 623, in mkdirp
    chgrp_if_not_world_writable(intermediate_path,
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/llnl/util/filesystem.py", line 549, in chgrp_if_not_world_writable
    chgrp(path, group)
  File "/net/ifs/home/jdoe/spacktest/spack/lib/spack/llnl/util/filesystem.py", line 310, in chgrp
    os.chown(path, -1, gid)
OSError: [Errno 22] Invalid argument: '/net/ifs/home/jdoe/spacktest/spack/opt/spack/linux-ubuntu18.04-sandybridge'

But note that the install dir doesn't exist:

$ ls /net/ifs/home/jdoe/spacktest/spack/opt/spack/linux-ubuntu18.04-sandybridge
ls: cannot access '/net/ifs/home/jdoe/spacktest/spack/opt/spack/linux-ubuntu18.04-sandybridge': No such file or directory
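
To check whether the failure can be reproduced outside of Spack, a small probe could mimic the last two steps of the traceback: create a directory, then immediately call `os.chown(path, -1, gid)` on it. This is only a sketch under assumptions: the function name `probe_mkdir_then_chgrp` is hypothetical, it uses `tempfile.mkdtemp` rather than Spack's `mkdirp`, and it uses the current effective group instead of whatever group Spack is configured with.

```python
import errno
import os
import tempfile


def probe_mkdir_then_chgrp(parent, gid=None):
    """Create a directory under parent and immediately chgrp it.

    Returns None if os.chown succeeds, otherwise the symbolic errno
    name (for example 'EINVAL'). The probe directory is removed again.
    """
    if gid is None:
        gid = os.getegid()  # assumption: test with our own effective group
    path = tempfile.mkdtemp(prefix="chgrp-probe-", dir=parent)
    try:
        os.chown(path, -1, gid)  # same call as in the traceback
        return None
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        try:
            os.rmdir(path)
        except OSError:
            pass  # on a misbehaving mount the directory may already be gone
```

Running it once against a directory on the NFS mount and once against a local disk would show whether the `Invalid argument` error comes from the filesystem itself or from something specific to Spack.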

Information on your system

$ ./spack/bin/spack debug report
* **Spack:** 0.16.0-1101-7dcf3f7aed
* **Python:** 3.8.1
* **Platform:** linux-ubuntu18.04-sandybridge
* **Concretizer:** original


odoublewen commented 3 years ago

Actually, if I make the spack installation dir world-writable, most of the installs succeed, but some still fail intermittently. The stack traces vary, but the cause is always that a file doesn't exist, e.g.:

==> Installing berkeley-db-18.1.40-x7y5y36hsslijbvuuyatqbyjyjrpppsl
==> No binary for berkeley-db-18.1.40-x7y5y36hsslijbvuuyatqbyjyjrpppsl found: installing from source
==> Fetching https://spack-llnl-mirror.s3-us-west-2.amazonaws.com/_source-cache/archive/0c/0cecb2ef0c67b166de93732769abdeba0555086d51de1090df325e18ee8da9c8.tar.gz
####################################################################################################################################################################### 100.0%
==> berkeley-db: Executing phase: 'autoreconf'
==> berkeley-db: Executing phase: 'configure'
==> berkeley-db: Executing phase: 'build'
==> berkeley-db: Executing phase: 'install'
==> Error: OSError: [Errno 22] Invalid argument: '/net/ifs/home/jdoe/spack/opt/spack/linux-ubuntu18.04-sandybridge/gcc-7.5.0/berkeley-db-18.1.40-x7y5y36hsslijbvuuyatqbyjyjrpppsl/lib/libdb-18.1.la'

/net/ifs/home/jdoe/spack/lib/spack/spack/build_systems/autotools.py:551, in remove_libtool_archives:
        548            return
        549
        550        # Remove the files and create a log of what was removed
  >>    551        libtool_files = fs.find(str(self.prefix), '*.la', recursive=True)
        552        with fs.safe_remove(*libtool_files):
        553            fs.mkdirp(os.path.dirname(self._removed_la_files_log))
        554            with open(self._removed_la_files_log, mode='w') as f:

This may be due to misconfiguration or some obscure idiosyncrasies of our NFS mount. If you want to close this, I would understand. But I'm willing to help you understand more about the problem if you think it's worth pursuing.