minershive / hiveos-pxe-diskless

Network boot for diskless rigs

Latest release can't create file system #25

Closed snoby closed 10 months ago

snoby commented 10 months ago

In the new 6.5.4 release, creating the root file system fails. I used set -x to trace the commands in the bash script:

[07/01/2024 18:53:01][DEPLOY_PXE] Create Root Filesystem........................++ grep -iE '^country:'
++ awk '{print tolower($2)}'
+++ curl -s ifconfig.me
++ tr -d '\n'
++ whois 71.56.xxx.16
+ cc=us
+ [[ ! -z us ]]
+ url=http://us.archive.ubuntu.com/ubuntu/
+ debootstrap --arch=amd64 --include=sudo,curl,wget,systemd,initramfs-tools,net-tools,pv,tar,xz-utils focal /pxeserver/build/ubuntu20/_fs http://us.archive.ubuntu.com/ubuntu/
+ [[ 0 -ne 0 ]]
+ touch /pxeserver/build/ubuntu20/_fs/boot/debootstrap
+ echo_ok
+ echo -e '\033[0;32m[OK]\033[0m'
[OK]
+ set +x
[07/01/2024 18:55:01][DEPLOY_PXE] Mount needed folders (dev|proc|run|sys).......[OK]
[07/01/2024 18:55:01][DEPLOY_PXE] Add repo source.list..........................[OK]
[07/01/2024 18:55:01][DEPLOY_PXE] Upgrade FS....................................+ chroot /pxeserver/build/ubuntu20/_fs apt -y update
+ [[ 0 -ne 0 ]]
++ chroot /pxeserver/build/ubuntu20/_fs apt list --upgradable
++ grep '^hive' -c
+ pkg=0
+ chroot /pxeserver/build/ubuntu20/_fs apt -y upgrade
+ [[ 100 -ne 0 ]]
+ echo_fail
+ echo -e '\033[0;31m[FAIL]\033[0m'
[FAIL]
+ exit 1
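
The trace above shows `apt -y upgrade` inside the chroot returning status 100, which the script reports only as [FAIL]. A minimal sketch of a wrapper that keeps a command's output and exit status for later inspection follows. The name `run_step` and the log path are hypothetical and not part of deploy_pxe.

```shell
# Hypothetical helper (not part of deploy_pxe): run a command, save its
# combined stdout/stderr to a log file, and report a non-zero exit status.
run_step() {
    local log="$1"
    shift
    local rc=0
    "$@" >"$log" 2>&1 || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "[FAIL] '$*' exited with status $rc, see $log" >&2
    fi
    return "$rc"
}

# Example, using the chroot path from the trace:
# run_step /tmp/upgrade.log chroot /pxeserver/build/ubuntu20/_fs apt -y upgrade
```

Keeping the full output in a file makes it possible to see which package step failed without re-running the build under `set -x`.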

So then I tried to run the commands myself:

root@remote1:/pxeserver# chroot /pxeserver/build/ubuntu20/_fs apt -y upgrade
Reading package lists... Done
Building dependency tree... Done
Calculating upgrade... Done
The following NEW packages will be installed:
  distro-info python-apt-common python3-apt
The following packages will be upgraded:
  apt apt-utils base-files bash bsdutils busybox-initramfs ca-certificates cpio curl dbus distro-info-data dpkg e2fsprogs fdisk gcc-10-base gir1.2-glib-2.0 gpgv gzip initramfs-tools initramfs-tools-bin initramfs-tools-core isc-dhcp-client
  isc-dhcp-common klibc-utils kmod less libapparmor1 libapt-pkg6.0 libasn1-8-heimdal libblkid1 libbrotli1 libc-bin libc6 libcap2 libcap2-bin libcom-err2 libcryptsetup12 libcurl4 libdbus-1-3 libdns-export1109 libelf1 libexpat1 libext2fs2 libfdisk1
  libfribidi0 libgcc-s1 libgcrypt20 libgirepository-1.0-1 libglib2.0-0 libglib2.0-data libgmp10 libgnutls30 libgssapi-krb5-2 libgssapi3-heimdal libhcrypto4-heimdal libheimbase1-heimdal libheimntlm0-heimdal libhogweed5 libhx509-5-heimdal libicu66
  libip4tc2 libisc-export1105 libjson-c4 libk5crypto3 libkeyutils1 libklibc libkmod2 libkrb5-26-heimdal libkrb5-3 libkrb5support0 libldap-2.4-2 libldap-common liblz4-1 liblzma5 libmount1 libncurses6 libncursesw6 libnetplan0 libnettle7 libnghttp2-14
  libnss-systemd libp11-kit0 libpam-cap libpam-modules libpam-modules-bin libpam-runtime libpam-systemd libpam0g libpcre2-8-0 libpcre3 libprocps8 libpython3.8-minimal libpython3.8-stdlib libroken18-heimdal libsasl2-2 libsasl2-modules-db libseccomp2
  libsepol1 libsmartcols1 libsqlite3-0 libss2 libssh-4 libssl1.1 libstdc++6 libsystemd0 libtinfo6 libudev1 libuuid1 libwind0-heimdal libxml2 libxtables12 libzstd1 linux-base locales login logsave lz4 mount ncurses-base ncurses-bin netplan.io
  networkd-dispatcher openssl passwd perl-base procps python3-pkg-resources python3-yaml python3.8 python3.8-minimal rsyslog sudo systemd systemd-sysv systemd-timesyncd tar tzdata ubuntu-advantage-tools ubuntu-keyring ubuntu-minimal udev util-linux
  vim-common vim-tiny wget xxd xz-utils zlib1g
148 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 151 kB/54.9 MB of archives.
After this operation, 3111 kB of additional disk space will be used.
Get:1 http://mirrors.ubuntu.com/mirrors.txt Mirrorlist [3408 B]
Get:2 http://mirrors.accretive-networks.net/ubuntu focal-updates/main amd64 dbus amd64 1.12.16-2ubuntu2.3 [151 kB]
Fetched 155 kB in 1s (143 kB/s)
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),

So I think the locale is not being set in the chroot:

root@remote1:~# chroot /pxeserver/build/ubuntu20/_fs/ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

So I went into the chroot, set LANG to en_US.UTF-8 in /etc/locale.gen, then generated the files with locale-gen.
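
For reference, on Debian and Ubuntu `/etc/locale.gen` lists one locale per line, and `locale-gen` compiles the lines that are not commented out. A minimal sketch of enabling an entry from the host follows. The helper name `enable_locale` is hypothetical, and the rootfs path in the example is taken from the logs above.

```shell
# Hypothetical helper: enable one entry in a locale.gen-style file by
# uncommenting it, or append the entry when the file does not list it.
enable_locale() {
    local file="$1" entry="$2" tmp rc
    # Already enabled: nothing to do.
    grep -qxF "$entry" "$file" && return 0
    tmp=$(mktemp) || return 1
    # Compare each line, minus a leading "#", against the entry exactly.
    if awk -v e="$entry" '
        { line = $0; sub(/^#[ \t]*/, "", line) }
        line == e { print e; found = 1; next }
        { print }
        END { if (!found) print e }
    ' "$file" >"$tmp"; then
        cat "$tmp" >"$file"
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Example:
# enable_locale /pxeserver/build/ubuntu20/_fs/etc/locale.gen "en_US.UTF-8 UTF-8"
# chroot /pxeserver/build/ubuntu20/_fs locale-gen
```

The entry is matched as a literal string rather than a regular expression, so the dots in the locale name cannot match unrelated lines.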

Even after all that, upgrading the FS failed.

root@remote1:/pxeserver# ./deploy_pxe ubuntu20 --upgrade
dpkg-deb: error: failed to read archive 'kernel/linux-image-*.deb': No such file or directory

[08/01/2024 17:52:23][DEPLOY_PXE] Mount needed folders (dev|proc|run|sys).......[OK]
[08/01/2024 17:52:23][DEPLOY_PXE] Upgrade FS....................................[FAIL]
[08/01/2024 17:54:30][DEPLOY_PXE] Upgrade FS....................................+ chroot /pxeserver/build/ubuntu20/_fs apt -y update
+ [[ 0 -ne 0 ]]
++ chroot /pxeserver/build/ubuntu20/_fs apt list --upgradable
++ grep '^hive' -c
+ pkg=0
+ chroot /pxeserver/build/ubuntu20/_fs apt -y upgrade
+ [[ 0 -ne 0 ]]
+ [[ 0 -gt 0 ]]
+ [[ 1 -ne 0 ]]
+ echo_fail
+ echo -e '\033[0;31m[FAIL]\033[0m'
[FAIL]
+ exit 1

villos commented 10 months ago

Fixed in 6.5.5. Run hive-upgrade.sh or do a clean install.

snoby commented 10 months ago

What host version do you expect? Because this seems very broken.


root@remote1:/pxeserver# ./deploy_pxe ubuntu20 --build
dpkg-deb: error: failed to read archive 'kernel/linux-image-*.deb': No such file or directory

[10/01/2024 21:03:44][DEPLOY_PXE] Create Root Filesystem........................[OK]
[10/01/2024 21:06:26][DEPLOY_PXE] Mount needed folders (dev|proc|run|sys).......[OK]
[10/01/2024 21:06:26][DEPLOY_PXE] Add repo source.list..........................[OK]
[10/01/2024 21:06:26][DEPLOY_PXE] Upgrade FS....................................[OK]
[10/01/2024 21:09:09][DEPLOY_PXE] Install additional packages...................[OK]
[10/01/2024 21:09:25][DEPLOY_PXE] Compile locales...............................[OK]
[10/01/2024 21:09:27][DEPLOY_PXE] Install rtl_nic firmwares ....................[OK]
[10/01/2024 21:09:27][DEPLOY_PXE] Configure FS..................................[OK]
[10/01/2024 21:09:28][DEPLOY_PXE] Install Hiveon package........................[OK]
[10/01/2024 21:10:48][DEPLOY_PXE] Install linux kernel..........................cp: cannot stat 'kernel/*.deb': No such file or directory
[OK]
[10/01/2024 21:10:48][DEPLOY_PXE] Copy initramfs config.........................[OK]
[10/01/2024 21:10:48][DEPLOY_PXE] Create initramfs image........................W: missing /lib/modules/5.15.0-91-generic
W: Ensure all necessary drivers are built into the linux image!
depmod: ERROR: could not open directory /lib/modules/5.15.0-91-generic: No such file or directory
depmod: FATAL: could not search modules: No such file or directory
cat: /var/tmp/mkinitramfs_W6W7uG/lib/modules/5.15.0-91-generic/modules.builtin: No such file or directory
depmod: WARNING: could not open modules.order at /var/tmp/mkinitramfs_W6W7uG/lib/modules/5.15.0-91-generic: No such file or directory
depmod: WARNING: could not open modules.builtin at /var/tmp/mkinitramfs_W6W7uG/lib/modules/5.15.0-91-generic: No such file or directory
[OK]
[10/01/2024 21:11:03][DEPLOY_PXE] Create symlink................................cp: cannot stat '/pxeserver/build/ubuntu20/_fs/boot/vmlinuz-': No such file or directory
[OK]
[10/01/2024 21:11:03][DEPLOY_PXE] Clean FS......................................[OK]
[10/01/2024 21:11:03][DEPLOY_PXE] Umount needed folders (dev|proc|run|sys)......[OK]
[10/01/2024 21:11:03][DEPLOY_PXE] Directory size: 1558M.........................[OK]
[10/01/2024 21:11:03][DEPLOY_PXE] Saving to build/ubuntu20/ubuntu20.tar.xz .....[OK]
[10/01/2024 21:17:10][DEPLOY_PXE] Create symlink ...............................[OK]
root@remote1:/pxeserver# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.3 LTS"
root@remote1:/pxeserver# ./deploy_pxe ubuntu20 --initrd
dpkg-deb: error: failed to read archive 'kernel/linux-image-*.deb': No such file or directory

[10/01/2024 22:02:10][DEPLOY_PXE] Copy initramfs config.........................[OK]
[10/01/2024 22:02:10][DEPLOY_PXE] Create initramfs image........................W: missing /lib/modules/5.15.0-91-generic
W: Ensure all necessary drivers are built into the linux image!
depmod: ERROR: could not open directory /lib/modules/5.15.0-91-generic: No such file or directory
depmod: FATAL: could not search modules: No such file or directory
cat: /var/tmp/mkinitramfs_kRPNg0/lib/modules/5.15.0-91-generic/modules.builtin: No such file or directory
depmod: WARNING: could not open modules.order at /var/tmp/mkinitramfs_kRPNg0/lib/modules/5.15.0-91-generic: No such file or directory
depmod: WARNING: could not open modules.builtin at /var/tmp/mkinitramfs_kRPNg0/lib/modules/5.15.0-91-generic: No such file or directory
[OK]
[10/01/2024 22:02:25][DEPLOY_PXE] Create symlink................................cp: cannot stat '/pxeserver/build/ubuntu20/_fs/boot/vmlinuz-': No such file or directory
[OK]
root@remote1:/pxeserver# ls
build  build-focal.log  configs  deploy_pxe  hive-config  hiveramfs  hive-upgrade.sh  pxe-config.sh  server.conf  tftp  VER
root@remote1:/pxeserver# ./hive-upgrade.sh
Local version:  6.5.5
Remote version: 6.5.5
You package of Hiveos PXE server is up to date.
root@remote1:/pxeserver#
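
In the build log above, every step prints [OK] even though `cp`, `depmod`, and `mkinitramfs` report missing kernel files. A minimal sketch of a post-build check follows. The name `check_kernel_files` is hypothetical, and the `boot/vmlinuz-<version>` and `lib/modules/<version>` paths are inferred from the messages in the log, not from the deploy_pxe source.

```shell
# Hypothetical check (not part of deploy_pxe): confirm that a rootfs holds
# the kernel image and modules directory for a given kernel version.
check_kernel_files() {
    local rootfs="$1" kver="$2" missing=0
    if [ ! -f "$rootfs/boot/vmlinuz-$kver" ]; then
        echo "missing kernel image: $rootfs/boot/vmlinuz-$kver" >&2
        missing=1
    fi
    if [ ! -d "$rootfs/lib/modules/$kver" ]; then
        echo "missing modules dir: $rootfs/lib/modules/$kver" >&2
        missing=1
    fi
    return "$missing"
}

# Example; the version string is the one named in the depmod messages and
# may not be the kernel the build was meant to install:
# check_kernel_files /pxeserver/build/ubuntu20/_fs 5.15.0-91-generic
```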

villos commented 10 months ago

dpkg-deb: error: failed to read archive 'kernel/linux-image-*.deb': No such file or directory

You don't have the kernel deb package. It must be downloaded automatically by the upgrade/setup script. Check it.
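
The quoted `dpkg-deb` error names the literal pattern `kernel/linux-image-*.deb`, which is what the shell passes along when a glob matches no file. A minimal sketch of testing for the package before using it follows. The helper name `find_kernel_deb` is hypothetical, and the default `kernel` directory is taken from the error message.

```shell
# Hypothetical helper: print the first kernel package found in a directory,
# or fail with a clear message when the glob matches nothing.
find_kernel_deb() {
    local dir="${1:-kernel}" f
    for f in "$dir"/linux-image-*.deb; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "no linux-image-*.deb found in '$dir'" >&2
    return 1
}

# Example: query the package only once it is known to exist.
# deb=$(find_kernel_deb kernel) && dpkg-deb -f "$deb" Package
```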

snoby commented 10 months ago

> dpkg-deb: error: failed to read archive 'kernel/linux-image-*.deb': No such file or directory
>
> You don't have the kernel deb package. It must be downloaded automatically by the upgrade/setup script. Check it.

root@remote1:/pxeserver# ./hive-upgrade.sh
Local version:  6.5.5
Remote version: 6.5.5
You package of Hiveos PXE server is up to date.

I completely deleted the install and ran the setup from scratch:

./pxe-setup.sh

Destination directory: /pxeserver
Press ENTER to continue with this destination or type a new one

> Download PXE-server package
--2024-01-11 00:15:43--  https://github.com/minershive/hiveos-pxe-diskless/archive/master.zip
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/minershive/hiveos-pxe-diskless/zip/refs/heads/master [following]
--2024-01-11 00:15:43--  https://codeload.github.com/minershive/hiveos-pxe-diskless/zip/refs/heads/master
Resolving codeload.github.com (codeload.github.com)... 140.82.114.10
Connecting to codeload.github.com (codeload.github.com)|140.82.114.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘master.zip’

master.zip                                   [                <=>                                                                ]  21.90M  6.54MB/s    in 3.4s

2024-01-11 00:15:46 (6.54 MB/s) - ‘master.zip’ saved [22968447]

> Extract PXE-server package.Please wait

> Copy PXE-server package to destination directory.
Install  build-essential packages.
Please wait...
Done

Workers config
Type FARM_HASH: redeacted46676a1c
New FARM_HASH: a7e6redacted
Hive server URL: http://api.hiveos.farm
Press ENTER to continue with this URL or type a new one

++++++++++++++++++
Server config
Hive repo URL: http://download.hiveos.farm/repo/binary/
Press ENTER to continue with this URL or type a new one

Current server IP-address: 10.0.0.167
Press ENTER to continue with this IP-address or type a new one

TMPFS size: 3000 MB
Press ENTER to continue with this TMPFS size or type a new one (in MB)

Default dist: ubuntu20

Config complete
++++++++++++++++++
net.core.somaxconn = 65535
> Restart DNSMASQ server. OK
> Restart Nginx server. OK
> Restart Atftp server. OK
Netboot directory for x86_64-efi created. Configure your DHCP server to point to /pxeserver/tftp/efi/x86_64-efi/core.efi

Server ready to work

root@remote1:/home/snoby# cd /pxeserver/
root@remote1:/pxeserver# ls
configs  deploy_pxe  hive-config  hiveramfs  hive-upgrade.sh  pxe-config.sh  server.conf  tftp  VER
root@remote1:/pxeserver# ./hive-upgrade.sh
Local version:  6.5.5
Remote version: 6.5.5
You package of Hiveos PXE server is up to date.
root@remote1:/pxeserver#

What step did I miss?

root@remote1:/pxeserver# ./deploy_pxe --help
dpkg-deb: error: failed to read archive 'kernel/linux-image-*.deb': No such file or directory
Usage:
  deploy_pxe ubuntu20 --build           create latest Ubuntu 20.04 image
  deploy_pxe ubuntu20 --selfupgrade     just upgrade Hive package and repack rootfs image
  deploy_pxe ubuntu20 --upgrade         upgrade all and repack rootfs image
  deploy_pxe ubuntu20 --chroot          chroot into rootfs (for manual actions)
  deploy_pxe ubuntu20 --initrd          rebuild initramfs image
  deploy_pxe ubuntu20 --remove          delete rootfs folder

Nvidia drivers:
  deploy_pxe nvidia --list              list available driver versions
  deploy_pxe nvidia --build <VER>       build driver specific version ( e.g. 515 or 515.105 or 515.105.01)

AMD OpenCL:
  deploy_pxe opencl --list              list available driver versions
  deploy_pxe opencl --build <VER>       build specific version (for now 5.4 only)

root@remote1:/pxeserver# ./deploy_pxe ubuntu20 --build
dpkg-deb: error: failed to read archive 'kernel/linux-image-*.deb': No such file or directory

villos commented 10 months ago

Don't run the local pxe-setup; use the remote script:

wget https://raw.githubusercontent.com/minershive/hiveos-pxe-diskless/master/pxe-setup.sh && sudo bash pxe-setup.sh