NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

TMPDIR handling by nix-shell #395

Open michaeljones opened 9 years ago

michaeljones commented 9 years ago

Hi,

Lethalman in #nixos has asked me to report this. I've had trouble with building a Haskell Yesod project using a nix-shell on Ubuntu 14.04. I get a GHC panic error when building with:

nix-shell --pure
eval "$configurePhase"
eval "$buildPhase"

But not when building with nix-build. My setup is the standard code that you get from yesod init along with the following default.nix:

{ haskellPackages ? (import <nixpkgs> {}).haskellPackages }:

let inherit (haskellPackages);

in with haskellPackages; cabal.mkDerivation (self: {
  pname = "cal";
  version = "0.0.1";
  src = ./.;
  buildDepends = with haskellPackages; [
    yesod yesodStatic yesodTest
    yesodBin
    hjsmin persistentSqlite hspec
    ];
  buildTools = with haskellPackages; [ cabalInstall ];
})

It seems to come down to my normal shell not having TMPDIR set (I'm not sure why), so when I use nix-shell --pure it picks up XDG_RUNTIME_DIR=/run/user/1000/ and uses that as the temporary directory. Lethalman pointed to this line.

That directory was then potentially upsetting GHC as it didn't have the correct permissions or something. I'm unfortunately new to nix & haskell so I am reporting what others have told me. Happy to help where I can though.

Cheers, Michael

lucabrunox commented 9 years ago

Thanks. Can you please show the output of mount|grep /run/user? I fear Ubuntu is mounting it with noexec.

michaeljones commented 9 years ago

Here we go:

$ mount|grep /run/user
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,user=mike)

lucabrunox commented 9 years ago

Anyway, it's better we do mktemp -d instead of falling back to XDG_RUNTIME_DIR at this point. Especially since it may fill up the ram with all those .so files. cc @edolstra

edolstra commented 9 years ago

Apparently on your system /run/user is mounted with the noexec flag, which appears to violate the XDG spec ("The directory MUST be fully-featured by the standards of the operating system."). Some discussion here: http://lists.freedesktop.org/archives/systemd-devel/2014-March/017967.html

I guess we could try to detect whether XDG_RUNTIME_DIR is broken...

The reason for not using mktemp -d is that a user may want to keep the same $TMPDIR across multiple nix-shell calls, though that's debatable.
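
The detection idea mentioned above could be sketched roughly as follows. This is only an illustration, not what nix-shell does: the function name dir_allows_exec is hypothetical, and the probe simply drops a tiny script into the directory and tries to run it, which fails on a noexec mount.

```shell
# Hypothetical probe: can files created in this directory be executed?
# Returns 0 if a tiny script placed there runs, non-zero otherwise
# (noexec mount, missing directory, no write permission, ...).
dir_allows_exec() {
    dae_dir=$1
    [ -d "$dae_dir" ] || return 1
    dae_probe=$(mktemp "$dae_dir/exec-probe.XXXXXX") || return 1
    printf '#!/bin/sh\nexit 0\n' > "$dae_probe"
    chmod +x "$dae_probe"
    dae_status=0
    "$dae_probe" 2>/dev/null || dae_status=$?
    rm -f "$dae_probe"
    return "$dae_status"
}
```

A caller could run dir_allows_exec "$XDG_RUNTIME_DIR" and fall back to another directory when it returns non-zero.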

bennofs commented 9 years ago

@edolstra IMO, nix-shell's $TMPDIR should be cleared when the shell exits. Some builders use $TMPDIR to build wrappers, and I don't like them to be shared by multiple nix-shell environments. If you want files to be preserved, relying on $TMPDIR is a bad idea.

mboes commented 8 years ago

Why does nix-shell set TMPDIR to begin with? One issue @ypares, @davidar and I are running into is that /tmp/ is disk-backed, while the default TMPDIR of /run/user/$UID is memory-backed, so sometimes temporary files created by compilers are big enough that the user runs out of memory.

brodul commented 7 years ago

Just want to share my story. The company I work for does quite a lot in the shellHook. The hook failed (on CI) because I was not aware that the TMPDIR is in memory. Running TMPDIR="$(mktemp -d)" nix-shell fixed it. Maybe this should be mentioned in the manual somewhere?
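
A minimal sketch of that workaround as a reusable wrapper, which also removes the directory afterwards as suggested earlier in the thread. The function name with_private_tmpdir and the PRIVATE_TMP_BASE variable are hypothetical names for this illustration. The template is spelled out in full so that the result does not depend on whatever TMPDIR is set to when mktemp runs.

```shell
# Run a command with TMPDIR pointing at a fresh directory, then remove it.
# PRIVATE_TMP_BASE is a hypothetical knob for this sketch; it defaults to /tmp.
with_private_tmpdir() {
    wpt_base=${PRIVATE_TMP_BASE:-/tmp}
    # Full path template: independent of the caller's current TMPDIR.
    wpt_dir=$(mktemp -d "$wpt_base/nix-shell-tmp.XXXXXX") || return 1
    wpt_status=0
    env TMPDIR="$wpt_dir" "$@" || wpt_status=$?
    # Not cleaned up if the shell running this function is killed first.
    rm -rf "$wpt_dir"
    return "$wpt_status"
}
```

It would be used as with_private_tmpdir nix-shell --pure, assuming a disk-backed /tmp.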

LisannaAtHome commented 6 years ago

Ran into this today. nix-shell and nix-build shouldn't be setting TMPDIR to /run. Any process which expects to be able to create large files in a mktemp -d is going to have a bad day if it's run in a nix-build or a nix-shell.

chreekat commented 6 years ago

I was just bit by this as well. /tmp is, well, for temporary files. Why not use it?

edolstra commented 6 years ago

On most modern systems, /tmp is also a tmpfs so it doesn't really matter whether we use /tmp or /run.

chreekat commented 6 years ago

I wanted to be snarky and say, "Is NixOS 17.09 not sufficiently modern?" :) But I've found the boot.tmpOnTmpfs option and I'll try it out.

I suspect in my case, a more directed solution would be to (a) make cabal clean up after itself during large sandbox installs as it goes along, and (b) make sure its error messages point more directly to the root problem (i.e. being out of disk space) when a problem occurs.

I don't think this solves @ledettwy's problem, however.

fkorotkov commented 6 years ago

@ledettwy have you found a workaround? @brodul's workaround with TMPDIR="$(mktemp -d)" didn't work for me.

I'm trying to run a CI build with Nix and mktemp doesn't work for me either. Here is a small snippet to reproduce:

fedor-mbp $: nix-shell --packages nodejs --run "mktemp -d -t foo"
mktemp: too few X's in template ‘foo’

LisannaAtHome commented 6 years ago

@fkorotkov TMPDIR="$(mktemp -d)" makes me nervous only because I'm not sure what TMPDIR is set to when that command is being run. Try a normal TMPDIR=/tmp nix-shell

fkorotkov commented 6 years ago

@ledettwy tried it, and it seems to be passed to the environment, but mktemp still fails:

fedor-mbp master$: TMPDIR=/tmp nix-shell --packages nodejs --run "printenv | grep TMP && mktemp -d -t foo"
TMP=/tmp
TMPDIR=/tmp
mktemp: too few X's in template ‘foo’

I'm on Darwin BTW and I'm trying to run xcodebuild which uses mktemp. Maybe it's just Mac OS related.

fkorotkov commented 6 years ago

I've pinpointed the issue. In my case the problem is with the coreutils package:

fedor-mbp$: nix-shell --packages nodejs --run "which mktemp && mktemp -d -t foo"
/nix/store/s28xb9v7xf6axvf4a3av2mnczws2hsdg-coreutils-8.29/bin/mktemp
mktemp: too few X's in template ‘foo’

As you can see mktemp from coreutils-8.29 fails. If I use /usr/bin/mktemp everything works:

fedor-mbp $: nix-shell --packages nodejs --run "/usr/bin/mktemp -d -t foo"
/var/folders/_r/wmcjfw296f56jrysrxtrx7sc0000gn/T/foo.y5KFSLBM

aschmolck commented 6 years ago

@fkorotkov In case this is not obsolete already: don't pass -t at all or do mktemp -d -t fooXXXXXXXXXXX.
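
The difference comes from how the two implementations read -t: GNU coreutils treats the argument as a template that must contain X's, while the BSD/macOS mktemp treats it as a plain prefix. A sketch that sidesteps -t by passing a full path template, which both accept (the function name portable_mktemp_dir is hypothetical):

```shell
# Create a temporary directory from a full path template ending in X's.
# Avoids -t, whose meaning differs between GNU coreutils and BSD/macOS mktemp.
portable_mktemp_dir() {
    mktemp -d "${TMPDIR:-/tmp}/$1.XXXXXX"
}
```

For example, portable_mktemp_dir foo prints the path of a new directory whose name starts with foo.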

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

terlar commented 3 years ago

I think the confusing part here is that it falls back to XDG_RUNTIME_DIR if TMPDIR is not set. I have several teammates running into this, especially on Ubuntu, which allocates very little space for XDG_RUNTIME_DIR (far less than the available memory).

One of the users argued that this is by design, since the location is meant only for lock files and other runtime-related data that doesn't occupy any significant space. If that is the case, it feels wrong to fall back to this directory.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

quintindk commented 2 years ago

+1; this problem is exacerbated on WSL2 where the XDG_RUNTIME_DIR is set to /mnt/wslg/runtime-dir and permissions are broken. Could we please remove the XDG_RUNTIME_DIR in future releases and default to /tmp?
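
To make the two behaviours concrete, here is a small model of the selection logic as it is described in this thread, next to the requested alternative. It is not taken from the Nix source; the function names are hypothetical, and the final /tmp fallback in the first function is an assumption of this sketch.

```shell
# Model of the behaviour described in this thread: use TMPDIR if set,
# otherwise fall back to XDG_RUNTIME_DIR. The last-resort /tmp is assumed.
# Note that ":-" also treats an empty value as unset.
described_tmp_choice() {
    printf '%s\n' "${TMPDIR:-${XDG_RUNTIME_DIR:-/tmp}}"
}

# Requested alternative: ignore XDG_RUNTIME_DIR and default to /tmp.
proposed_tmp_choice() {
    printf '%s\n' "${TMPDIR:-/tmp}"
}
```

With TMPDIR unset, the first function returns the runtime directory while the second returns /tmp; with TMPDIR set, both return it unchanged.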

oxalica commented 1 year ago

@edolstra wrote: "On most modern systems, /tmp is also a tmpfs so it doesn't really matter whether we use /tmp or /run."

No, they are indeed different. XDG_RUNTIME_DIR should generally not be used for files other than those for communication and synchronization, e.g. sockets and FIFOs. It must also be removed after the user logs out, which is not what we want. It also seems to me that /run could be mounted as memfs instead of tmpfs, which prevents swapping entirely and causes more memory pressure.

Here are some quotes from the XDG spec:

"$XDG_RUNTIME_DIR defines the base directory relative to which user-specific non-essential runtime files and other file objects (such as sockets, named pipes, ...) should be stored."

"Files in the directory MUST not survive reboot or a full logout/login cycle."

"Applications should use this directory for communication and synchronization purposes and should not place larger files in it, since it might reside in runtime memory and cannot necessarily be swapped out to disk."

I have also encountered multiple people complaining about ENOSPC when building inside nix-shell, with no build issues outside it. So I consider this TMPDIR value surprising in general. Either unsetting it or setting TMPDIR=/tmp would be better.

jalaziz commented 1 year ago

I recently ran into this issue when using Nix on GitHub Actions. We were running out of space when running go test which didn't make sense since our project isn't that big.

After some investigation, I realized that NIX_BUILD_TOP was set to /run/user/1001 which is a mount with < 700 MB on GitHub runners.

It took me a long time to find this.

I don't know what the right answer is here, but it was quite unexpected.
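
For anyone debugging the same symptom, a quick diagnostic along these lines may shorten the search. It only prints the temp-related variables that come up in this thread; the function name show_tmp_vars is hypothetical.

```shell
# Print the temp-related variables mentioned in this thread, marking unset
# or empty ones. The variable names are fixed, so the eval is safe here.
show_tmp_vars() {
    for stv_name in TMPDIR TMP NIX_BUILD_TOP XDG_RUNTIME_DIR; do
        eval "stv_value=\${$stv_name-}"
        printf '%s=%s\n' "$stv_name" "${stv_value:-<unset>}"
    done
}
```

Running it inside the environment in question, for example via nix-shell --run after defining it there, shows at a glance whether any of them points into /run/user.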

whentze commented 12 months ago

+1, we're still hitting ENOSPC because of this. The choice of $XDG_RUNTIME_DIR for $TMPDIR is both wrong from the PoV of the XDG spec and causes a bunch of problems in practice.

whentze commented 12 months ago

FWIW, on my NixOS 23.05, there is not only a space limit, but a relatively tight inode limit too:

$ findmnt $XDG_RUNTIME_DIR
TARGET         SOURCE FSTYPE OPTIONS
/run/user/1000 tmpfs  tmpfs  rw,nosuid,nodev,relatime,size=4854176k,nr_inodes=1213544,mode=700,uid=1000,gid=100

This is easily exhausted even when not using a bunch of space, by tools that make a lot of tempfiles like NPM.
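
Both limits can be checked before a build with df. The sketch below assumes GNU coreutils df and its column layout; -P and -k are POSIX, but -i is an extension, and some filesystems report no meaningful inode count. The function names are hypothetical.

```shell
# Available space in KiB on the filesystem holding a directory.
# With -P each filesystem is reported on a single line, so line 2 is the entry.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Free inodes on that filesystem, assuming GNU df's -i column layout
# (Filesystem, Inodes, IUsed, IFree, ...). May print "-" or 0 on some filesystems.
free_inodes() {
    df -Pi "$1" | awk 'NR == 2 { print $4 }'
}
```

Calling avail_kb "$TMPDIR" and free_inodes "$TMPDIR" inside the shell gives the two numbers that matter for ENOSPC in either form.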