mitchellh / nixos-config

My NixOS configurations.
https://twitter.com/mitchellh/status/1346136404682625024
MIT License

bootstrap stalls and a restart login says root password is incorrect #33

Open TimJSwan89 opened 1 year ago

TimJSwan89 commented 1 year ago

[update: you may disregard this comment since the third comment has more reproducible detail]

I followed the instructions with a UTM setup on an M1 MacBook. I can run git clone and ping google, so there don't seem to be any network issues. ifconfig gave me an IP address, which I used in the NIXADDR environment variable as instructed. After running make vm/bootstrap0 successfully and making sure the project was ready for the second script, I ran make vm/bootstrap and got an ssh error:

    ssh: connect to host 192.164.64.4 port 22: Connection refused
    rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.5]

Note: I don't have clipboard sharing working, so this is retyped from what I see on screen.

I rebooted the OS and it asks nixos login: If I enter root, it asks Password: and entering root gets the response Login incorrect. I have already tried leaving the login and password fields empty.
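The address in the retyped error (192.164.64.4) is not in a private IPv4 range, while the address ifconfig reports later in this thread (192.168.64.6) is. That may simply be a retyping slip, but a quick sanity check on NIXADDR before running make can rule it out. The sketch below is a hypothetical helper, not part of nixos-config; it only pattern-matches the RFC 1918 prefixes and does not validate the octets.

```shell
#!/bin/sh
# Hypothetical helper (not from the repo): succeeds if $1 looks like an
# RFC 1918 private IPv4 address, the kind a shared/NAT VM network hands out.
# This is a loose pattern match; it does not check that each octet is 0-255.
is_private_ipv4() {
    case "$1" in
        10.*.*.*) return 0 ;;
        192.168.*.*) return 0 ;;
        172.1[6-9].*.*|172.2[0-9].*.*|172.3[01].*.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare the address from the error message with the one from ifconfig.
for addr in 192.164.64.4 192.168.64.6; do
    if is_private_ipv4 "$addr"; then
        echo "$addr: private"
    else
        echo "$addr: not private, check for a typo"
    fi
done
```

If NIXADDR fails this check, comparing it character by character against the ifconfig output inside the VM is probably the first thing to try.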

TimJSwan89 commented 1 year ago

Even though I ran sudo su and then passwd, setting the password to root in both boot sequences, I cannot log in to the VM once make vm/bootstrap has stalled, regardless of which generation I select.

TimJSwan89 commented 1 year ago

To replicate,

  1. I start UTM on an M1 MacBook with 4GB RAM and 64GB storage.
  2. I use nixos-minimal-22.05.4694.380be19fbd2-aarch64-linux.iso to boot after changing the other drive from VirtIO to NVMe.
  3. I boot the default installer entry, the one without any parenthetical note, then run sudo su followed by passwd, setting the root password to root.
  4. nix-shell -p git gnumake openssh
  5. git clone https://github.com/mitchellh/nixos-config
     cd nixos-config
  6. ifconfig
     Observe that enp0s1 has an inet address of 192.168.64.6 for this instance of the VM.
  7. export NIXADDR=192.168.64.6
     export NIXNAME=vm-aarch64-utm
  8. make vm/bootstrap0
     Enter root at the password prompt. The script runs for a while, then the VM reboots.
  9. Select shutdown after the reboot. Clear the CD/DVD drive option for the VM.
  10. Start up the VM again.
  11. Log in as root with the password root.
  12. Run sudo su followed by passwd, setting the root password to root again (not sure if this step is necessary).
  13. sudo -E nix-shell -p git gnumake openssh
  14. git clone https://github.com/mitchellh/nixos-config
      cd nixos-config
  15. Double-check with ifconfig that the IP is the same as last time, then:
      export NIXADDR=192.168.64.6
      export NIXNAME=vm-aarch64-utm
  16. make vm/bootstrap
      Enter root as the password twice, since the script prompts twice.
  17. After a few minutes, it seems to stall on restarting the following units: dhcpcd.service, sshd.service, systemd-journald.service. It may be relevant that a few interesting lines appear above that, including one updating systemd-boot from 250.9 to 253.5 and: Failed to stop -.mount: Job type stop is not applicable for unit -.mount.
  18. After a while, I get tired of waiting and press Ctrl+C to get back to the shell. journalctl --reverse | less -N shows these as the most recent logs:
    Jul 05 13:43:55 dev tailscaled[28989]: bootstrapDNS("derp4d.tailscale.com", "2a03:b0c0:3:d0::1501:b001") for "log.tailscale.io" error: Get "https://derp4d.tailscale.com/bootstrap-dns?q=log.tailscale.io": dial tcp [2a03:b0c0:3:d0::1501:b001]:443: connect: network is unreachable
    Jul 05 13:43:55 dev tailscaled[28989]: trying bootstrapDNS("derp4d.tailscale.com", "2a03:b0c0:3:d0::1501:b001") for "log.tailscale.io" ...
  19. reboot
  20. Two NixOS generations are shown, and I cannot log in to either: generation 2 seems to stall indefinitely, and generation 1 responds to root / root with Login incorrect.

TimJSwan89 commented 1 year ago

I checked journalctl per a chatbot's recommendation after step 18 but before step 19. It looks like there is an issue with the network connection to a tailscale.io host. I had to use a tool to convert a screenshot of the log into text and fix it up manually, since there is no shared clipboard.

journalctl --reverse | less -N

Jul 05 13:43:55 dev tailscaled[28989]: bootstrapDNS("derp4d.tailscale.com", "2a03:b0c0:3:d0::1501:b001") for "log.tailscale.io" error: Get "https://derp4d.tailscale.com/bootstrap-dns?q=log.tailscale.io": dial tcp [2a03:b0c0:3:d0::1501:b001]:443: connect: network is unreachable
Jul 05 13:43:55 dev tailscaled[28989]: trying bootstrapDNS("derp4d.tailscale.com", "2a03:b0c0:3:d0::1501:b001") for "log.tailscale.io" ...
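Since the log had to be retyped from a screenshot, it may help to reduce it to just the endpoints that failed. The sketch below is a hypothetical helper, not something from the repo: it reads journal text on stdin and prints each distinct address:port that failed with "network is unreachable". In the lines above the failing endpoint is an IPv6 address, which may indicate that the VM has no IPv6 route, though that is only a guess from two log lines.

```shell
#!/bin/sh
# Hypothetical helper: read journal text on stdin and print each distinct
# "address:port" whose connection attempt ended in "network is unreachable".
unreachable_targets() {
    sed -n 's/.*dial tcp \(.*\): connect: network is unreachable.*/\1/p' | sort -u
}

# Possible usage inside the VM (tailscaled is the unit name seen in the log):
#   journalctl -u tailscaled -b --no-pager | unreachable_targets
```

Filtering by unit with journalctl -u keeps the output short enough to retype by hand when there is no shared clipboard.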

supermarin commented 1 year ago

Commenting here instead of opening a new issue: the "official" VMware installation is also broken at the moment, mainly because the recommended SCSI drives show up as /dev/sda while NVMe drives show up as /dev/nvme0n1. I poked around quickly with NVMe yesterday, but ran into more problems, such as nixos-generate-config creating the wrong boot configuration and the installed OS not booting with EFI.
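Because the device name depends on the virtual disk controller, it can be worth checking which node actually exists inside the installer before running bootstrap0. This is a minimal sketch with a hypothetical helper name; it is not how the repo's Makefile picks the disk.

```shell
#!/bin/sh
# Hypothetical helper: print the first candidate path that exists, or fail.
# It tests with -e so it can be tried on ordinary files; inside the installer,
# -b (block device) would be the stricter check.
first_existing() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Which of the two device names mentioned in this thread is present?
first_existing /dev/sda /dev/nvme0n1 ||
    echo "neither /dev/sda nor /dev/nvme0n1 found" >&2
```

If the printed device does not match what the bootstrap step expects, changing the controller type in the VM settings is likely simpler than editing the scripts.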

Now that NixOS has a working GUI installer, I wonder if it would be better to record a new video and just use the GUI install. That would take care of the bootstrap0 phase, and vm/bootstrap would then run on top of a vanilla install.

mitchellh commented 1 year ago

I just bootstrapped from scratch again on both my Mac and Windows machines and it worked fine on both. I did update the README and Makefile to revert back to SATA (unrelated to this issue) and I updated some syntax for NixOS 23.05. In either case, I'm unsure whether any of that was consequential to this issue, but I can't reproduce any issues.

TimJSwan89 commented 1 year ago

> I just bootstrapped from scratch again on both my Mac and Windows machines and it worked fine on both. I did update the README and Makefile to revert back to SATA (unrelated to this issue) and I updated some syntax for NixOS 23.05. In either case, I'm unsure whether any of that was consequential to this issue, but I can't reproduce any issues.

Did you use UTM? Maybe there is something different about the NixOS ISO I used or even my Mac specs. I have a 2020 M1 with 8GB RAM. Is there a way for me to troubleshoot or test the networking problem behind the tailscaled errors?

mitchellh commented 1 year ago

> Did you use UTM? Maybe there is something different about the NixOS ISO I used or even my Mac specs. I have a 2020 M1 with 8GB RAM. Is there a way for me to troubleshoot or test the networking problem behind the tailscaled errors?

I used VMware Fusion. I haven’t tried UTM in a long while. Maybe it’s that. This was on a 2020 M1 MacBook Pro. I think there is likely some networking difference in UTM or something. I think it’s at least still valuable to know that a clean bootstrap on VMware on similar hardware is working.

supermarin commented 1 year ago

@mitchellh are you on VMware Player or Pro? I just realized that my VM didn't boot with EFI at all, and there isn't any mention of UEFI in the VM settings (GUI). I haven't tried forcing it by hand in the .vmx. Is it possible that VMware 13 Player doesn't support booting with UEFI?

mitchellh commented 1 year ago

> @mitchellh are you on VMware Player or Pro? I just realized that my VM didn't boot with EFI at all, and there isn't any mention of UEFI in the VM settings (GUI). I haven't tried forcing it by hand in the .vmx. Is it possible that VMware 13 Player doesn't support booting with UEFI?

Workstation Pro, but I think you can set some vmx settings to get UEFI. If you can figure out the Nix config to support EFI I’d be happy to document it, but I couldn’t get that working with NixOS at all.

supermarin commented 1 year ago

@mitchellh thanks! I think it's better to have one true working config, both for users and for the sake of your time maintaining this.

I just looked into the .vmx that VMware Player created, and indeed there's firmware=bios in there. Then I created a brand-new VM, checked its .vmx, and the firmware there is efi. The only conclusion I can draw is my own stupidity: I either overwrote an existing VM or missed the checkbox during initial VM creation. I went through the whole setup and can confirm that this part works as intended after your /dev/sda changes.
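The same check can be done from a terminal on the host instead of opening the file by hand. The sketch below uses a hypothetical helper name and assumes the firmware key is written either as firmware = "efi" or as firmware=bios; it prints nothing if the key is absent.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "firmware" key in a .vmx file.
# Accepts both `firmware = "efi"` and `firmware=bios` spellings.
vmx_firmware() {
    sed -n 's/^[[:space:]]*firmware[[:space:]]*=[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1"
}

# Possible usage (the path is a placeholder for your own VM's file):
#   vmx_firmware "path/to/your-vm.vmx"
```

Running it against both the old and the freshly created VM makes the bios versus efi difference easy to spot.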

The bootstrap did get stuck at the same point @TimJSwan89's referring to. This is due to the hardcoded mitchellh user and password in here.

...redacted
the following new units were started: docker.service, docker.socket, home-manager-mitchellh.service, host.mount, network-addresses-ens33.service, nix-daemon.service, run-vmblock\x2dfuse.mount, sys-devices-virtual-net-docker0.device, sys-devices-virtual-net-tailscale0.device, sys-subsystem-net-devices-docker0.device, sys-subsystem-net-devices-tailscale0.device, systemd-fsck@dev-disk-by\x2dlabel-boot.service, tailscaled.service, vmware.service
make[1]: Leaving directory '/Users/marin/nixos-config'
make vm/secrets
make[1]: Entering directory '/Users/marin/nixos-config'
# GPG keyring
rsync -av -e 'ssh -o PubkeyAuthentication=no -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' \
    --exclude='.#*' \
    --exclude='S.*' \
    --exclude='*.conf' \
    /Users/marin/.gnupg/ mitchellh@172.16.174.129:~/.gnupg
Warning: Permanently added '172.16.174.129' (ED25519) to the list of known hosts.
(mitchellh@172.16.174.129) Password:

It's totally ok if you want to keep this repo working for you; just thought it would be awesome if newcomers could see a working i3 session like in your video. Do you think having an installer user that's in wheel would be an option, or maybe changing the initial password for mitchellh to hunter2 or whatever? The other option would be to document changing users in nixos.nix.

multivac61 commented 1 year ago

In the YouTube video, which is excellent by the way, you mention that the hard disk should be SCSI. In my experience that does not result in a device called /dev/sda, and the bootstrap0 step fails.

I do not know what has changed but selecting SATA instead did the trick 😉

Nebuchadrezzar commented 1 year ago

I hit the same roadblock as @TimJSwan89, and concur with @supermarin that a default user (or prompting for a user/password) would be appreciated.

gongqian commented 1 year ago

> [update: you may disregard this comment since the third comment has more reproducible detail]
>
> I followed the instructions with a UTM setup on an M1 MacBook. I can run git clone and ping google, so there don't seem to be any network issues. ifconfig gave me an IP address, which I used in the NIXADDR environment variable as instructed. After running make vm/bootstrap0 successfully and making sure the project was ready for the second script, I ran make vm/bootstrap and got an ssh error:
>
>     ssh: connect to host 192.164.64.4 port 22: Connection refused
>     rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.5]
>
> Note: I don't have clipboard sharing working, so this is retyped from what I see on screen.
>
> I rebooted the OS and it asks nixos login: If I enter root, it asks Password: and entering root gets the response Login incorrect. I have already tried leaving the login and password fields empty.

The issue seemed to be related to Tailscale. I resolved it by commenting out the tailscale config in machine/vm-share.nix, as I don't use it at all.

services.tailscale.enable = true;  # comment this line out

You will also have to replace the hashed password and other credentials with your own, which I haven't completely done yet. With my own hashed password, I was able to get the installation going, bring NixOS up, and log in as the default "mitchellh" user.
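For anyone who would rather script that change, the sketch below prints a copy of a Nix file with the Tailscale line commented out. The helper name is hypothetical, and the sketch assumes the option appears on a single line exactly as quoted above. It writes to stdout rather than editing in place, so the original file stays untouched until the result has been reviewed. Nix comments start with #.

```shell
#!/bin/sh
# Hypothetical helper: print the given Nix file with the line
# `services.tailscale.enable = true;` commented out using a Nix `#` comment.
# Leading indentation is preserved; all other lines pass through unchanged.
disable_tailscale() {
    sed 's/^\([[:space:]]*\)\(services\.tailscale\.enable[[:space:]]*=[[:space:]]*true;.*\)$/\1# \2/' "$1"
}

# Possible usage (file path as given in the comment above; adjust to your checkout):
#   disable_tailscale machine/vm-share.nix > /tmp/vm-share.nix
```

Diffing the generated file against the original before copying it back is a cheap way to confirm that only the one line changed.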

peteringram0 commented 8 months ago

Also getting this issue: it asks for the password (after I set it in the VM) and says it is incorrect.

@gongqian Thanks for your comment. What type of hashing is used for the password if I want to replace it?
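The thread does not say which scheme the repo's value uses. NixOS hashed-password options generally take a crypt(3)-style string, and the prefix of such a string identifies the scheme. The sketch below is a hypothetical helper that names the scheme from that prefix. To generate a replacement, mkpasswd -m sha-512 is one common choice, assuming mkpasswd is available on your system.

```shell
#!/bin/sh
# Hypothetical helper: name the crypt(3) scheme of a password hash
# by looking at its leading "$id$" prefix.
hash_kind() {
    case "$1" in
        '$y$'*)                   echo yescrypt ;;
        '$6$'*)                   echo sha512crypt ;;
        '$5$'*)                   echo sha256crypt ;;
        '$2a$'*|'$2b$'*|'$2y$'*)  echo bcrypt ;;
        '$1$'*)                   echo md5crypt ;;
        *)                        echo unknown ;;
    esac
}

# Possible usage: paste the existing hash from the config in single quotes,
# so the shell does not try to expand the $ signs:
#   hash_kind '<hash copied from the config>'
```

Matching the scheme already in the config is not strictly required, but it avoids guessing which schemes the installed system accepts.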

Rudra1106 commented 4 months ago

> @mitchellh are you on VMware Player or Pro? I just realized that my VM didn't boot with EFI at all, and there isn't any mention of UEFI in the VM settings (GUI). I haven't tried forcing it by hand in the .vmx. Is it possible that VMware 13 Player doesn't support booting with UEFI?
>
> Workstation Pro, but I think you can set some vmx settings to get UEFI. If you can figure out the Nix config to support EFI I’d be happy to document it, but I couldn’t get that working with NixOS at all.

I think you should recheck this root password issue, sir. Many folks are facing the same problem.