DeterminateSystems / nix-installer

Install Nix and flakes with the fast and reliable Determinate Nix Installer, with over 7 million installs.
https://determinate.systems
GNU Lesser General Public License v2.1

Revert has Bus Errors on Mac, rebooting on Mac results in no `nix` #400

Open chrisguida opened 1 year ago

chrisguida commented 1 year ago

I got this error after installing Nix and then rebooting. `nix` was unavailable in my shell after the reboot, so I tried to reinstall it and got this message:

Found existing plan in `/nix/receipt.json`, with the same settings, already completed, try uninstalling (`/nix/nix-installer uninstall`) and reinstalling if Nix isn't working

So I tried uninstalling it with `curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- uninstall`.

This resulted in a new error:

Proceed? ([Y]es/[n]o/[e]xplain): y
 INFO Revert: Remove directory `/nix/temp-install-dir`
 INFO Revert: Configure Nix daemon related settings with launchctl
 INFO Revert: Configure Nix
 INFO Revert: Provision Nix
 INFO Revert: Create an APFS volume `Nix Store` for Nix on `disk3` and add it to `/etc/fstab` mounting on `/nix`
sh: line 244: 17967 Bus error: 10           "$@"

Then I tried to reinstall, but that resulted in this error:

cguida@cg-mac-2 ~/w/lightning (master) [0|1]> curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- uninstall
info: downloading installer https://install.determinate.systems/nix/tag/v0.7.0/nix-installer-aarch64-darwin
`nix-installer` needs to run as `root`, attempting to escalate now via `sudo`...
Error: 
   0: Reading receipt
   1: No such file or directory (os error 2)

Location:
   src/cli/subcommand/uninstall.rs:101

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Consider reporting this error using this URL: https://github.com/DeterminateSystems/nix-installer/issues/new?title=%3Cautogenerated-issue%3E&body=%23%23+Error%0A%60%60%60%0AError%3A+%0A+++0%3A+Reading+receipt%0A+++1%3A+No+such+file+or+directory+%28os+error+2%29%0A%60%60%60%0A%0A%23%23+Metadata%0A%7Ckey%7Cvalue%7C%0A%7C--%7C--%7C%0A%7C**version**%7C0.7.0%7C%0A%7C**os**%7Cmacos%7C%0A%7C**arch**%7Caarch64%7C%0A

It appears I am now unable to install or uninstall Nix from my system.

So I clicked the link and here we are.

Error

Error: 
   0: Reading receipt
   1: No such file or directory (os error 2)

Metadata

|key|value|
|--|--|
|**version**|0.7.0|
|**os**|macos|
|**arch**|aarch64|

chrisguida commented 1 year ago

Oh hmm, I ran the install script again and now it works... hopefully there are no issues with my new install. I wonder if the fact that I'm using fish affects things?

Hoverbear commented 1 year ago

Oh darn!

So it seems we hit an error at some point while deleting the APFS volume. I've not seen a "Bus error: 10" before, but a cursory search suggests it has something to do with memory allocation. I suspect it happens when we call `diskutil` to delete the APFS volume.

I suspect the reason your install didn't work in the output under "Then I tried to reinstall, but that resulted in this error:" is that you actually ran `uninstall` again! This is part of why the first error message you got suggests using `/nix/nix-installer uninstall`.
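
As a rough illustration of the point above, the choice between uninstalling and installing can be keyed on whether the receipt exists, since the "Reading receipt" error came from running `uninstall` with no `/nix/receipt.json` present. This is a minimal sketch, not the installer's own logic; `suggest_next_step` is a hypothetical name, and the function only prints a command rather than running it.

```shell
# Sketch: print the command that matches the current state of /nix.
# `suggest_next_step` is a hypothetical helper, not part of nix-installer.
suggest_next_step() {
    nix_root=${1:-/nix}
    if [ -f "$nix_root/receipt.json" ]; then
        # A receipt exists, so there is an install to revert.
        echo "$nix_root/nix-installer uninstall"
    else
        # No receipt: `uninstall` would fail with "Reading receipt".
        echo "curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install"
    fi
}
```

Calling `suggest_next_step` with no argument checks `/nix`; passing a directory makes it easy to try against a scratch location.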

chrisguida commented 1 year ago

Ah yes, good catch! Silly me. Makes sense why the second time I tried to install it actually worked xD

Hoverbear commented 1 year ago

Have you confirmed that `nix` persists and works for you after rebooting? If so, I'm really glad we got it working! I'll close the ticket. Thanks for reporting it!

If it doesn't work, please reopen the ticket and we'll get you sorted!

chrisguida commented 1 year ago

No, still missing :/

> nix --version
fish: Unknown command: nix

Did you fix this issue somewhere? Perhaps I'm not running the latest code?

Also, I don't have a reopen button, so I can't reopen the issue.

Hoverbear commented 1 year ago

Shucks. Could you give me some more information?

Is encryption on?

% fdesetup status
FileVault is Off.

If it is, can I see the output of this?

% cat /Library/LaunchDaemons/org.nixos.darwin-store.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>RunAtLoad</key>
        <true/>
        <key>Label</key>
        <string>org.nixos.darwin-store</string>
        <key>ProgramArguments</key>
        <array>
                <string>/usr/sbin/diskutil</string>
                <string>mount</string>
                <string>-mountPoint</string>
                <string>/nix</string>
                <string>C2FC7C2B-6FB4-497C-AEA2-9993CE281CF8</string>
        </array>
</dict>
</plist>% 

What's the output of this?

% diskutil apfs list | grep -C 6 Nix 
    |   Sealed:                    No
    |   FileVault:                 No
    |
    +-> Volume disk3s7 C2FC7C2B-6FB4-497C-AEA2-9993CE281CF8
        ---------------------------------------------------
        APFS Volume Disk (Role):   disk3s7 (No specific role)
        Name:                      Nix Store (Case-insensitive)
        Mount Point:               /nix
        Capacity Consumed:         70529024 B (70.5 MB)
        Sealed:                    No
        FileVault:                 No (Encrypted at rest)

And this?

% cat /etc/fstab 

# nix-installer created volume labelled `Nix Store`
UUID=c2fc7c2b-6fb4-497c-aea2-9993ce281cf8 /nix apfs rw,noauto,nobrowse,suid,owners% 

These two:

% launchctl print system/org.nixos.nix-daemon          
system/org.nixos.nix-daemon = {
        active count = 1
        path = /Library/LaunchDaemons/org.nixos.nix-daemon.plist
        type = LaunchDaemon
        state = running
...
% launchctl print system/org.nixos.darwin-store
system/org.nixos.darwin-store = {
        active count = 0
        path = /Library/LaunchDaemons/org.nixos.darwin-store.plist
        type = LaunchDaemon
        state = not running
...
        exit timeout = 5
        runs = 2
        last exit code = 0
...

(Note the exit code on the last one, please!)
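
For convenience, the commands requested above could be gathered into a single report. The sketch below only wraps the commands already shown; `_diag` and `collect_nix_diagnostics` are hypothetical names, and a command that fails or is missing is noted in the output instead of stopping the run.

```shell
# Sketch: collect the diagnostics requested above into one report.
# `_diag` and `collect_nix_diagnostics` are hypothetical names.

# Print a header, run the command, and note a failure instead of aborting.
_diag() {
    echo "## $*"
    "$@" 2>&1 || echo "(exit status $?)"
    echo
}

collect_nix_diagnostics() {
    _diag fdesetup status
    _diag cat /Library/LaunchDaemons/org.nixos.darwin-store.plist
    _diag sh -c 'diskutil apfs list | grep -C 6 Nix'
    _diag cat /etc/fstab
    _diag launchctl print system/org.nixos.nix-daemon
    _diag launchctl print system/org.nixos.darwin-store
}
```

Running `collect_nix_diagnostics > nix-report.txt` would produce one file to paste into the issue; the file name is only an example.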

chrisguida commented 1 year ago

> fdesetup status
FileVault is On.
> cat /Library/LaunchDaemons/org.nixos.darwin-store.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>RunAtLoad</key>
    <true/>
    <key>Label</key>
    <string>org.nixos.darwin-store</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/sbin/diskutil</string>
        <string>mount</string>
        <string>-mountPoint</string>
        <string>/nix</string>
        <string>A454717D-8FBA-4C59-8382-14B0421C1F51</string>
    </array>
</dict>
</plist>⏎
> diskutil apfs list | grep -C 6 Nix 
    |   Sealed:                    No
    |   FileVault:                 No
    |
    +-> Volume disk3s7 A454717D-8FBA-4C59-8382-14B0421C1F51
        ---------------------------------------------------
        APFS Volume Disk (Role):   disk3s7 (No specific role)
        Name:                      Nix Store (Case-insensitive)
        Mount Point:               /nix
        Capacity Consumed:         1207877632 B (1.2 GB)
        Sealed:                    No
        FileVault:                 No (Encrypted at rest)
> cat /etc/fstab 

# nix-installer created volume labelled `Nix Store`
UUID=a454717d-8fba-4c59-8382-14b0421c1f51 /nix apfs rw,noauto,nobrowse,suid,owners⏎ 
> launchctl print system/org.nixos.nix-daemon   
system/org.nixos.nix-daemon = {
    active count = 1
    path = /Library/LaunchDaemons/org.nixos.nix-daemon.plist
    type = LaunchDaemon
    state = running

    program = /bin/sh
    arguments = {
        /bin/sh
        -c
        /bin/wait4path /nix/var/nix/profiles/default/bin/nix-daemon && exec /nix/var/nix/profiles/default/bin/nix-daemon
    }

    stdout path = /dev/null
    stderr path = /var/log/nix-daemon.log
    default environment = {
        PATH => /usr/bin:/bin:/usr/sbin:/sbin
    }

    environment = {
        OBJC_DISABLE_INITIALIZE_FORK_SAFETY => YES
        NIX_SSL_CERT_FILE => /nix/var/nix/profiles/default/etc/ssl/certs/ca-bundle.crt
        XPC_SERVICE_NAME => org.nixos.nix-daemon
    }

    domain = system
    minimum runtime = 10
    exit timeout = 5
    runs = 2
    pid = 19440
    immediate reason = inefficient
    forks = 3
    execs = 3
    initialized = 1
    trampolined = 1
    started suspended = 0
    proxy started suspended = 0
    last terminating signal = Terminated: 15

    spawn type = daemon (3)
    jetsam priority = 40
    jetsam memory limit (active) = (unlimited)
    jetsam memory limit (inactive) = (unlimited)
    jetsamproperties category = daemon
    submitted job. ignore execute allowed
    jetsam thread limit = 32
    cpumon = default
    resource limits = {
        maxfiles (soft) => 1048576
    }

    probabilistic guard malloc policy = {
        activation rate = 1/1000
        sample rate = 1/0
    }

    properties = keepalive | runatload | inferred program
}
> launchctl print system/org.nixos.darwin-store
system/org.nixos.darwin-store = {
    active count = 0
    path = /Library/LaunchDaemons/org.nixos.darwin-store.plist
    type = LaunchDaemon
    state = not running

    program = /usr/sbin/diskutil
    arguments = {
        /usr/sbin/diskutil
        mount
        -mountPoint
        /nix
        A454717D-8FBA-4C59-8382-14B0421C1F51
    }

    default environment = {
        PATH => /usr/bin:/bin:/usr/sbin:/sbin
    }

    environment = {
        XPC_SERVICE_NAME => org.nixos.darwin-store
    }

    domain = system
    minimum runtime = 10
    exit timeout = 5
    runs = 2
    last exit code = 0

    spawn type = daemon (3)
    jetsam priority = 40
    jetsam memory limit (active) = (unlimited)
    jetsam memory limit (inactive) = (unlimited)
    jetsamproperties category = daemon
    jetsam thread limit = 32
    cpumon = default
    probabilistic guard malloc policy = {
        activation rate = 1/1000
        sample rate = 1/0
    }

    properties = runatload | inferred program
}

chrisguida commented 1 year ago

Possibly this particular setup is messed up because I did it in bash and then switched back to fish?

But nix is unavailable now in both fish and bash, so I'm thinking it's something else.

Hoverbear commented 1 year ago

So I do see one issue: apparently your disk is encrypted and the volume we're making is not. I guess our detection code is still not working even after https://github.com/DeterminateSystems/nix-installer/pull/361, which is really frustrating.

I am not sure if that is actually a problem for your Nix working or it's unrelated, but I need to fix that.
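
The mismatch described here can be spotted by comparing the `fdesetup status` output with the `FileVault:` line of the `Nix Store` volume. The following is a minimal sketch of that comparison and not the installer's detection code; `filevault_mismatch` is a hypothetical name, and it assumes an encrypted volume reports `FileVault: Yes ...`, which does not appear anywhere in this thread.

```shell
# Sketch: succeed (return 0) when disk-level and volume-level FileVault disagree.
# `filevault_mismatch` is a hypothetical helper, not nix-installer code.
filevault_mismatch() {
    fde_status=$1     # output of `fdesetup status`
    volume_line=$2    # the "FileVault:" line for the Nix Store volume

    case $fde_status in
        *"FileVault is On"*) disk=on ;;
        *)                   disk=off ;;
    esac
    # Assumption: an encrypted volume reports "FileVault: Yes ..." on this line.
    case $volume_line in
        *FileVault:*Yes*) vol=on ;;
        *)                vol=off ;;
    esac
    [ "$disk" != "$vol" ]
}

# Values taken from the output posted above.
filevault_mismatch "FileVault is On." "FileVault:                 No (Encrypted at rest)" \
    && echo "mismatch: disk is encrypted, Nix Store volume is not"
```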

So it does look like you have the /nix path and the disk is mounted. So that means the likely culprit is the shell profiles.

I'm sorry but I need even more information! :sweat:

Do you happen to have it in your path?

% bash -c 'echo $PATH'
/Users/ephemeraladmin/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin

Is the profile populated?

% ls /nix/var/nix/profiles/default/bin
nix                     nix-channel             nix-copy-closure        nix-env                 nix-instantiate         nix-shell
nix-build               nix-collect-garbage     nix-daemon              nix-hash                nix-prefetch-url        nix-store

Is it in your /etc/bashrc?

% cat /etc/bashrc    

# Nix
if [ -e '/nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh' ]; then
    . '/nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh'
fi
# End Nix

...

chrisguida commented 1 year ago

Nope

> bash -c 'echo $PATH'
/Users/cguida/.deno/bin:/Users/cguida/.cargo/bin:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/cguida/Library/Python/3.8/bin:/Library/Apple/usr/bin

Yes, there's a thing here

> ls -al /nix/var/nix/profiles/default/bin
lrwxr-xr-x  1 root  wheel  58 Dec 31  1969 /nix/var/nix/profiles/default/bin@ -> /nix/store/8w0v2mffa10chrf1h66cbvbpw86qmh85-nix-2.13.3/bin

If I put a trailing slash, there's stuff inside as well, all pointing to nix:

cguida@cg-mac-2 ~/w/watchdescriptor (wip/implement)> ls -al /nix/var/nix/profiles/default/bin/
total 5712
drwxr-xr-x  14 root  wheel      448 Apr  3 15:00 ./
drwxr-xr-x   8 root  wheel      256 Apr  3 15:00 ../
-rwxr-xr-x   1 root  wheel  2920800 Dec 31  1969 nix*
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-build@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-channel@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-collect-garbage@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-copy-closure@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-daemon@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-env@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-hash@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-instantiate@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-prefetch-url@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-shell@ -> nix
lrwxr-xr-x   1 root  wheel        3 Apr  3 15:00 nix-store@ -> nix

Looks like there's stuff in my root bashrc. I switch between fish and bash, though, so this may be the source of my issues?

> cat /etc/bashrc

# Nix
if [ -e '/nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh' ]; then
    . '/nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh'
fi
# End Nix

# System-wide .bashrc file for interactive bash(1) shells.
if [ -z "$PS1" ]; then
   return
fi

PS1='\h:\W \u\$ '
# Make bash check its window size after a process completes
shopt -s checkwinsize

[ -r "/etc/bashrc_$TERM_PROGRAM" ] && . "/etc/bashrc_$TERM_PROGRAM"
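
The three checks in this exchange (is the profile on `PATH`, is the profile populated, does `/etc/bashrc` source the Nix hook) can be combined into one function. This is a sketch under the assumption that the default paths are the ones shown above; `check_nix_shell_setup` and its messages are hypothetical.

```shell
# Sketch: report whether Nix's default profile is visible to the current shell.
# `check_nix_shell_setup` is a hypothetical helper, not part of nix-installer.
check_nix_shell_setup() {
    profile_bin=${1:-/nix/var/nix/profiles/default/bin}
    rc_file=${2:-/etc/bashrc}
    hook='nix-daemon.sh'

    # 1. Is the profile's bin directory an entry in PATH?
    case ":$PATH:" in
        *":$profile_bin:"*) echo "PATH: contains $profile_bin" ;;
        *)                  echo "PATH: missing $profile_bin" ;;
    esac

    # 2. Is there an executable `nix` in the profile?
    if [ -x "$profile_bin/nix" ]; then
        echo "profile: nix binary present"
    else
        echo "profile: nix binary not found"
    fi

    # 3. Does the rc file mention the Nix hook script?
    if [ -r "$rc_file" ] && grep -q "$hook" "$rc_file"; then
        echo "rc: $rc_file sources $hook"
    else
        echo "rc: $rc_file does not source $hook"
    fi
}
```

With the output posted above, this would report the profile and rc checks as passing and the `PATH` check as failing, which matches the symptom that `nix` is not found.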

robstolarz commented 1 year ago

Hello,

I have encountered this issue too. It appears to occur when the volume that the installer binary resides on is deleted out from under it. This invalidates the memory mapping that the OS holds and executes the binary from, so the OS can no longer read the memory containing the program during execution. This differs from a typical deletion, where the mapping remains usable even after the file is deleted, which is why the error is so unusual.
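
The "typical deletion" half of that contrast is easy to demonstrate. The sketch below uses an open file descriptor as a stand-in for a memory mapping, on the assumption that both keep a deleted file's data alive in the same way; it does not reproduce the bus error itself, since that would require removing a volume. `read_after_unlink` is a hypothetical name.

```shell
# Sketch: a deleted file stays readable through a descriptor opened beforehand.
# This shows the ordinary case only, not the volume-removal case.
read_after_unlink() {
    f=$(mktemp) || return 1
    printf 'still here\n' > "$f"
    exec 3< "$f"    # open the file on descriptor 3
    rm -f "$f"      # remove the directory entry; the data lives on while fd 3 is open
    cat <&3         # reading through the descriptor still succeeds
    exec 3<&-       # close the descriptor; the storage can now be released
}
```

When the whole volume goes away there is nothing left to back those reads, which is the situation the comment describes.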

I've resolved it locally by copying the binary to a temporary directory before invoking it (and ensuring the rest of the environment points away from the Nix environment that's about to be obliterated). Perhaps the installer could do this automatically.
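
A minimal sketch of that workaround follows, assuming "the rest of the environment" means `PATH`; other variables are left alone, and `run_from_tmp` is a hypothetical name. Whether this avoids the bus error is not established here.

```shell
# Sketch: run a copy of an executable from a fresh temporary directory,
# with /nix entries dropped from PATH. `run_from_tmp` is a hypothetical helper.
run_from_tmp() {
    src=$1
    shift
    tmpdir=$(mktemp -d) || return 1
    cp "$src" "$tmpdir/" || return 1
    # Keep only PATH entries that are not /nix or below it.
    # Entries that merely symlink into /nix (such as ~/.nix-profile/bin) are not filtered.
    clean_path=$(printf '%s\n' "$PATH" | tr ':' '\n' | grep -Ev '^/nix(/|$)' | paste -sd: -)
    PATH=$clean_path "$tmpdir/$(basename "$src")" "$@"
}
```

It would be invoked as `run_from_tmp /nix/nix-installer uninstall`.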

Hoverbear commented 1 year ago

Sadly, I don't think that is the solution.

The initial error (https://github.com/DeterminateSystems/nix-installer/issues/400#issue-1652762538) used a curl downloaded version of the installer.

During uninstall we do move ourselves out of the /nix folder:

https://github.com/DeterminateSystems/nix-installer/blob/1ae77d11ee176c755c62b4e7a31790d10df97c78/src/cli/subcommand/uninstall.rs#L61-L97

There are some cases where a user may manage to run a revert from a nix-installer inside the Nix store, and we should probably guard against those, but I am not sure that is the issue in this specific case.

seven1m commented 1 month ago

I ran into the "bus error" described here when trying to uninstall on a Mac. ~My solution was to copy the installer somewhere where it won't get unmounted:~

cp /nix/nix-installer /tmp/nix-installer
/tmp/nix-installer uninstall

[edit:] I just did another uninstall using this technique, and I'm still getting the bus error. Disregard this advice! 🤦‍♂️