danboid / ALEZ

Arch Linux Easy ZFS installer
GNU General Public License v3.0
146 stars 25 forks

ALEZ fails all install-variants on HP EliteBook 2730p | 2740p | 2760p #17

Closed claudiusraphaelpaeth closed 5 years ago

claudiusraphaelpaeth commented 5 years ago

I have tested ALEZ in the following configurations

1) out of Antergos Live (18.11|18.12)
2) out of ALEZ-arch[…].iso, as given in README

… on the following systems (all completely cleaned, including a zap of GPT and MBR and additionally overwriting the first and last regions (as either set up by the installer or custom configured beforehand) with 0s after any failed attempt - deemed necessary, because ALEZ vomited and froze on each take without it):

a) HP EliteBook 2730p
a1 – CPU Intel® Core 2 Duo CPU L9400 – RAM 4GiB (2*2) DDR-II – iGPU Intel 4500 MHD

b) HP EliteBook 2740p
b1 – CPU "Arrandale" Intel Core i5 520M vPro – RAM 4GiB (2*2) DDR-III – iGPU "Ironlake"
b2 – CPU "Arrandale" Intel Core i5 540M vPro – RAM 8GB (2*4) DDR-III – iGPU "Ironlake"

c) HP EliteBook 2760p
c1 – CPU "SandyBridge" Intel Core i7 2620M vPro – RAM 16 GiB (2*8) DDR-III – iGPU "HD 3000"

– AMT (disabled)
– BIOS | UEFI
– HDD – a1|b1|b2 – INT µSATA GB 80 Intel SSD 320 1.8" | 160 GB Toshiba 1.8"
– HDD INT SATA
– HDD EXT USB 2 IDE 2.5" + 3.5" GB : 40|80|120
– HDD EXT USB 3 with the following drives
– EXT SATA (300|600) 2.5" + 3.5" GB : 40|80|160|320|1.5TB|2TB|3TB
– EXT eSATA (via 2740p Docking Station + via PC-Card) same drives as listed before in EXT SATA
– EXT FireWire 400 to SATA with the drives listed on EXT SATA
– WLAN Intel a|b|g|n
– WWAN a1 [Qualcomm Gobi 1000] – b1|b2 [Qualcomm Gobi 2000] | c1 [Ericsson 5521 HSPA+]
– BT Broadcom
– LAN GigaBit Intel
– SmartCard
– PC-Card 54
– SDHC

WAN runs at ~ 3 MiB/s

… after some diddly daddly unstructured probing via Antergos Live [18.11 | 18.12], I set up 3 general configurations of the above capabilities:
– A – just what is needed (meaning WWAN | WLAN | FireWire | PC-Card | eSATA | etc … disabled)
– B – everything (aside from AMT and TXT) enabled
– C – just what is needed, but with C1 eSATA via Dock | C2 eSATA via PC-Card

Every given combination of ALEZ options was tested.

What (reproducibly) happened was:

I – ALEZ crashed totally (including a freeze of the TTY) when booted via UEFI and selecting UEFI setup with GRUB on the internal SATA HDD|SSD, right at the moment it checks the existing partitioning, if the disk was not completely cleaned beforehand (as described above, by a total zap of GPT+MBR and zeroing the first and last regions). In other words: killing a ZFS pool created by ZoL only works that way. Even when sweeping by overwriting with a FreeBSD or OpenIndiana complete-disk setup, the pools created via ZFSonLinux somehow survive as phantoms. I have absolutely no idea why and where exactly they hide, because that differs. It could be specific to the EliteBooks, as they sometimes, even with AMT | DriveLock | Intel Anti Theft | TXT | SecureBoot etc. disabled, magically make a jump back in time as if none of the actions had been taken.

II – ALEZ only starts downloading, in terms of actually preparing the install of the final system, if booted via BIOS and selecting either UEFI+GRUB with a maximum of 8063MiB for the ESP|EF00 at sector 2048, or BIOS+GRUB (automatically 1 MiB at sectors 2048 to 4095 for the EF02 bios_grub partition, followed by the rest of the HDD, as hardcoded in ALEZ). Downloading at around 3MiB/s and installing the system takes between 30 and 45 minutes (which is a heck of a long time, if I may say so). The installation is always unbootable because GRUB isn't effectively written: nothing is there after reboot, as if GRUB was installed to an overlay of the real ESP|BIOS_GRUB.

All other variants (systemd instead of grub, booted via either UEFI or BIOS, etc.; it makes no difference) fail somewhere between selecting the auto-partition target | selecting a prepared partition and starting the download of the final system (it breaks right at the beginning).

What all variants have in common is that EXT-USB drives are shown and can be selected for auto-partitioning, but the run only gets up to the point where, in UEFI setup, the partition (part2) for the system would have been or actually is asked for (depending on the type of setup: auto-partition vs. custom prepared).

ALEZ breaks immediately if there is any media containing a zpool present.
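As background on the "phantom" pools and on why leftover pools matter here (an explanation offered with some caution, not a diagnosis of these machines): ZFS keeps four copies of its label on every member device, two at the start and two at the end, so anything that rewrites only the beginning of a disk can leave the trailing copies in place. The sketch below only prints the commands one might use to clear a disk. `print_wipe_plan` is a hypothetical helper, not part of ALEZ, and the `-part1` suffix assumes the pool lived on the first partition; the printed commands are destructive if actually run.

```shell
# Dry run: print, but do not execute, commands that clear ZFS labels,
# other filesystem signatures and the partition tables from a disk.
# print_wipe_plan is a hypothetical helper, not part of ALEZ.
print_wipe_plan() {
    disk="$1"
    if [ -z "$disk" ]; then
        echo "usage: print_wipe_plan DISK" >&2
        return 1
    fi
    # The labels sit on the pool member itself, assumed here to be partition 1.
    echo "zpool labelclear -f ${disk}-part1"
    # Remove any remaining filesystem/RAID signatures from the whole disk.
    echo "wipefs --all ${disk}"
    # Destroy both the GPT and the MBR data structures.
    echo "sgdisk --zap-all ${disk}"
}

# The disk name is a placeholder, not a real device.
print_wipe_plan /dev/disk/by-id/ata-EXAMPLE
```

Reviewing the printed plan before pasting any of it into a root shell keeps a typo in the disk name from becoming data loss.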

Where ALEZ actually runs the whole install, the moment mkcpio (I hope I remember this right) processes whatever it does, ALEZ either breaks or, if it passes that point, mumbles about cannot this, cannot that, blabliblu … yep.

At some point (while trying to follow the output) I noticed that ALEZ attempted to check for a fsck.zfs and wondered wtf was going on; whereas I wonder, does such a tool exist?

ALEZ hiccups on downloading the system if used with a 5 GHz Ch. 36 WLAN 802.11n WPA2-PSK; also if used with a or b or through a second router on LAN, which seems related, as in these cases the transfer doesn't reach the rates that are possible (11|22|100 Mbit), but that is most likely due to default settings of the network management.

If you got to this point, you might wonder why the f i am blubbering all that out?

Answer is: I did these tests over about 10 days, alongside my work, as an alternative to completely shutting down my brain whenever my focus got lost. Please keep that in mind.

I named the things that i noted while in that pause-state-off-of-the-real-work and bundled it to the conclusions that jumped into my dried-out eyeballs, so stark it might give you an idea of where to enable some try-and-catch-style-figurines in your concept.

If you would like to inspect further, I am willing to test specific constellations, with feedback, log files, whatever you need. Just create a complete case description and deliver the patched versions with clear instructions, or even totally automated. For that I can set aside one of the 2740p and the 2760p for experiments for about two to three weeks, but in a reduced setup without flipping all possible switches (which I did only because the disks I used had been marked for deletion, so any real-data overwrite was welcome). I am fine with internal HDD|SSD SATA 80|160|320 or EXTUSB SATA 40|80, which should be enough to run some specific variations, right?

By the way, all those systems work perfectly fine with an Antergos ZFS install, customized after installation for use with zedenv and/or beadm; also with Debian Jessie|Stretch || Ubuntu Xenial|Bionic|Cosmic || Manjaro || CentOS || Fedora. Older versions need some to a lot of tweaking (Python, pyenv, specific apt, etc.) but all worked in the end, with BIOS GPT and UEFI GPT as well as both together, as whole-disk installs or on custom partitions, including those prepared with OpenIndiana and FreeBSD 11.1 and 11.2. What came to mind is that since ZoL included userobj_accounting feature-flag it has to be explicitly disabled on creation in e.g.: ubuntu cosmic and accompanied by posix-acl, else none of the systems worked with either zedenv or beadm. Maybe that's a point to consider?

And yes, you might have come to the conclusion that i am fine with all-things-debian, but clearly arch is to me like a shudder, somehow, don't even know why.

However, hope it helps transcending your minds so you feel the need to nail yourself on crossed woodstabs and smile eloquently or let your body drift through waves of snow finding yourself on a cuddly piece of flat earth in the northern hemisphere.

Hope it helps, somehow!

claudiusraphaelpaeth commented 5 years ago

Oops, accidentally closed; ahem, not mcpio, but mkinitcpio ... SRY!

danboid commented 5 years ago

Wow! I can't say I've read a bug report like that before!

If you can install Antergos on these machines, then ALEZ should also work, so it would be good if we can work out what your issue is here.

I have never tried installing Arch with or without ZFS onto a USB drive. I would recommend against doing that. I have also never tried running ALEZ from Antergos live but I'd imagine that should work as long as the correct zfs packages are installed and the zfs/spl modules are loaded.

"What came to mind is that since ZoL included userobj_accounting feature-flag it has to be explicitly disabled on creation in e.g.: ubuntu cosmic and accompanied by posix-acl"

ALEZ already does both of these (disables userobj_accounting and enables posix-acl) so I don't think that is your issue. Sorry I don't have any useful suggestions for you right now. Maybe @johnramsden might have a suggestion for you?
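For readers unfamiliar with the two settings, here is a rough sketch of how they are usually expressed at pool creation time. This is not ALEZ's actual code: `print_create_cmd`, the pool name and the device path are placeholders chosen for illustration, and the function only prints the command instead of running it.

```shell
# Dry run: print a zpool create command carrying the two settings discussed.
# print_create_cmd is a hypothetical helper; nothing is executed.
print_create_cmd() {
    pool="$1"
    vdev="$2"
    # -o sets pool properties/feature flags, -O sets root dataset properties.
    echo "zpool create" \
        "-o feature@userobj_accounting=disabled" \
        "-O acltype=posixacl" \
        "${pool} ${vdev}"
}

# Pool name and device are made up for the example.
print_create_cmd zroot /dev/disk/by-id/ata-EXAMPLE-part2
```

On an existing pool, `zpool get feature@userobj_accounting` and `zfs get acltype` can be used to check what was actually set.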

danboid commented 5 years ago

Are you trying to install to the internal SATA drive or are you trying to install to a USB drive?

danboid commented 5 years ago

I suspect what's going on here is that you are choosing the wrong partition at one point, so we might need to improve some of the dialog text / instructions.

claudiusraphaelpaeth commented 5 years ago

@danboid

"Are you trying to install to the internal SATA drive or are you trying to install to a USB drive?"

I know (in hindsight) it is confusing to read my 'bug report' … :)

Although I said I tested all variations, one could come to the conclusion that I didn't, as I merely listed the options that are given based on the software [ ALEZ ] and the hardware [ 4 x HP EliteBook 27x0p ]. But I meant it when I wrote that I tested all variations:

I used ALEZ and ALEZ-arch*.iso exactly as I've described it, meaning:
– all listed SATA drives HDD + SSD internally : [ GB | 40 | 80 | 120 | 160 | 320 ]
– externally via eSATA, so from the perspective of the system effectively internally : [ GB | 40 | 80 | 120 | 160 | 320 ] + [ TB 1.5 | 2 | 3 ]

And also externally via USB [ 2 | 3 | 3.1 ]

claudiusraphaelpaeth commented 5 years ago

"I suspect whats going on here is you are choosing the wrong partition at one point so we might need to improve some of the dialog text / instructions."

Actually, for those who do read, I think your already implemented notes are more than enough, like the hint that when using auto-partitioning the install target partition is the one ending in …-part2, followed, when asked to select the ESP (EFI System Partition), by the hint that it is the one ending in …-part1.

Thinking about your question, it might be useful in general to give a short description of what will happen when selecting this or that. – Hmm, I think this could be a good point in time to take a first look at the actual code. If something comes to mind I will open a new issue and reference this note.

claudiusraphaelpaeth commented 5 years ago

@danboid Although my 'testing' was unfocused i do actually have a good portion of insight into ZFS and its surrounding use-cases and software helping make use of it; but indeed as i commented right before, now the next step is to take a look at what code ALEZ actually consists of, to be of help.

Just to clarify, a main reason I did not dive into it until now, aside from the INFORMAL note, was (as I initially stated) also: – The fact that I am having a hard time grasping how and why Arch does what it does in a slightly different way than most Linux distributions. For example, without consulting the (very informative and, for experienced users, well written) Arch Wiki – I am still not able to do a simple chroot from memory into an Arch Linux from another Linux distribution and finalize the task of renewing GRUB; maybe because of the use of mkinitcpio, or in general the way the initramfs is handled, which seems a bit weird to me personally. – However, I need to dig into these things too over the next weeks, so just looking at Arch Linux from the perspective of a not-used-to-Linux-in-any-way user will for sure help me to get it.

But back to topic …

Please see my report as the possibility to harden the ALEZ experience by having a participant (me) who is willing to do fine-grained, but also rough initial testing of existing and coming code. – I nearly always have at least one more EliteBook at hand that I use for necessary recreational soft-breaks during creative digital work, where I have no problem with just killing the system completely and resetting my experience for that moment to that of a newborn, if I may say so.

That is why it is not me who actually needs to get ALEZ working; rather, I am offering you a pair of hands and a brain to help you refine ALEZ. If you would like to make use of it, you're welcome. I always do such fiddling on the side, so if you think it can help, feel free to ask for any task.

johnramsden commented 5 years ago

So you have another zpool present? That might be the issue.

The pool ID is currently found with https://github.com/danboid/ALEZ/blob/73f608c7d810b5469d4ccf15c3b86ddf86bf6e9e/alez.sh#L393

With multiple pools this could grab the wrong one.

Could you see if changing that line to the following helps:

zpool import -d /dev/disk/by-id -R "${installdir}" "${zroot}" 
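
To see why picking a pool automatically can go wrong with several pools attached, it can help to list every importable pool with its ID first. The sketch below is only an illustration and not the code behind the linked line: `list_importable_pools` is a hypothetical helper that parses text shaped like `zpool import` output, and the sample pools and IDs are made up.

```shell
# Print "name id" for every pool found in `zpool import`-style text on stdin.
# list_importable_pools is a hypothetical helper, not part of ALEZ.
list_importable_pools() {
    awk '$1 == "pool:" { name = $2 }
         $1 == "id:"   { print name, $2 }'
}

# Sample text shaped like `zpool import` output; pools and IDs are invented.
list_importable_pools <<'EOF'
   pool: zroot
     id: 1111111111111111111
  state: ONLINE
   pool: backup
     id: 2222222222222222222
  state: ONLINE
EOF
```

On a live system the same function could be fed with `zpool import | list_importable_pools`; if it prints more than one line, naming the pool explicitly, as in the command suggested above, avoids grabbing the wrong one.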

danboid commented 5 years ago

@claudiusraphaelpaeth

I have just uploaded a new release of the ALEZ installation ISO. Could you please try again with the new ISO and see if you have any more luck? Make sure you choose the systemd bootloader if you are installing on a UEFI machine.

@johnramsden

I've FINALLY got round to testing ALEZ on a UEFI machine but unfortunately the UEFI/GRUB mode doesn't work for me. That will be because the script currently modifies the GRUB config file the same way for both BIOS and UEFI but it would seem GRUB UEFI needs its own config. Using the systemd bootloader option works fine.

I have added instructions on how to create a new installation ISO to the repo.

danboid commented 5 years ago

Oh dear! It's not just uefi grub that's broken, grub doesn't install under bios mode either now. Also, the stable kernel got updated almost as soon as I uploaded the new iso so you may have to install the LTS if you're not testing tomorrow. I will have to add logging to help with troubleshooting.

danboid commented 5 years ago

I've not looked into it properly yet but I suspect grub isn't installing because it is being passed a disk ID instead of a device name?
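If that suspicion is right, one way to hand GRUB a plain device node is to resolve the `/dev/disk/by-id` symlink first. This is a minimal sketch under that assumption, not the fix that was actually committed: `resolve_disk` is a hypothetical helper, and the demonstration uses a throwaway symlink in a temporary directory instead of a real disk.

```shell
# resolve_disk is a hypothetical helper: follow a /dev/disk/by-id style
# symlink and print the path it finally points to.
resolve_disk() {
    readlink -f "$1"
}

# Demonstration with a throwaway symlink standing in for a by-id entry.
tmp="$(mktemp -d)"
touch "${tmp}/sda"
ln -s "${tmp}/sda" "${tmp}/ata-EXAMPLE"
resolve_disk "${tmp}/ata-EXAMPLE"   # prints the path of the fake "sda"
rm -rf "${tmp}"
```

With a real disk the call would look like `resolve_disk /dev/disk/by-id/<some-id>`, and the result could then be passed on to `grub-install`.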

danboid commented 5 years ago

For some reason the install_grub on line 463 never gets called (I've not worked out why yet), plus it's inserting an extra '/dev', and then I think install_grub_efi() (on line 183) is missing a device to install to.

johnramsden commented 5 years ago

Strange, last time I tried it they both worked. I'll take a look next time I get the chance.

danboid commented 5 years ago

I have not tried doing another UEFI GRUB install since my latest commit but BIOS GRUB installs are working again now.

danboid commented 5 years ago

UEFI GRUB install might work now but I've not tested it yet.

danboid commented 5 years ago

There are still issues with the GRUB BIOS install. The current script works for me doing a BIOS/GRUB install under Arch qemu but it fails under Ubuntu 18.04 qemu with a grub-probe canonical path error so the old hack I thought we didn't need any more might be making a comeback :/

danboid commented 5 years ago

I have the ALEZ GRUB BIOS installer working under older versions of virt-manager now (1.x; Ubuntu 18.04 uses 1.5).

When creating a new ALEZ KVM VM under virt-manager, choose 'Customise configuration before install' then go to the CPU settings and enable 'Copy host CPU configuration' and click 'Apply' before clicking 'Begin Installation'.

I don't really understand why or how VM CPU config could cause issues for GRUB but it does! I should probably add this to the README and I need to test it on real hardware again now to check it works there.

danboid commented 5 years ago

Tonight I successfully used the latest ALEZ iso to install Arch ZFS under UEFI with GRUB, which means I can finally say I have now fully tested and verified ALEZ on real hardware in all three of its main configs.

With commit #72 earlier, ALEZ arrived for real!

danboid commented 5 years ago

I have updated the ALEZ readme to simplify the (previously quite rambling) usage instructions and added a note about GRUB failing to install under virt-manager/qemu if you have the wrong CPU settings. I've also added a link to ALEZ on https://wiki.archlinux.org/index.php/Installing_Arch_Linux_on_ZFS as it seems to be ready for wider testing now. Thanks for your help John!

johnramsden commented 5 years ago

I wonder if it should be so "front and center" on the wiki, Maybe something like:

See ZFS for installing the ZFS packages. If installing Arch Linux onto ZFS from the archiso, it would be easier to use the archzfs repository. There is also an interactive installer, ALEZ, which is an easy way to get a ZFS system up and running if you do not require much customization.

Thoughts?

danboid commented 5 years ago

I'd be happy enough with that. I think two things matter here: that it gets mentioned somewhere near the beginning of the ZFS installation page, so that those who just want a basic ZFS config with minimum effort see it before getting scared off by the manual instructions, and that we're upfront about ALEZ's (mostly intentional) limitations.

I'm quite happy for you to edit it to something like what you wrote so long as it meets my recommendations and I'd say that does.

danboid commented 5 years ago

I'm going to close this now due to the lack of response but feel free to open another ticket if the latest iso and script still don't work for you @claudiusraphaelpaeth

danboid commented 5 years ago

ALEZ 1.0 has been released now with several fixes since this ticket was opened/closed.