Closed: fredleb closed this issue 4 years ago
please try the following and report back:
Thanks for your answer.
I haven't tried yet but I'm pretty sure it won't work like that: I left out a couple of details thinking it would make things simpler but...
Now this is the real thing: the root filesystem is inside an lvm container...
I have basically 2 problems to solve here:
But maybe I am missing a point so I will try your solution later today.
Oh yeah and to make things simple: this is a headless remote server of course... so I need dropbear to give the password to "keys" (but that's not an issue)...
Now this is the real thing: the root filesystem is inside an lvm
oh, man, you are cheating with your symptoms :-)
ok, then your setup is conceptually similar to "sysroot-on-overlay" https://github.com/jdub/openwrt-systemd/tree/master/systemd/files/etc/systemd/system
so next thing to try is to provide your own sysroot.mount
and root-keys.mount
with inter-dependency on lvm2-pvscan.service
or whatever gets your array ready
https://git.archlinux.org/svntogit/packages.git/tree/trunk/sd-lvm2_install?h=packages/lvm2
remember: mount units may be configured either via unit files or via /etc/fstab
https://www.freedesktop.org/software/systemd/man/systemd.mount.html
finally, try this in vm first
Phase 1: success !
It took me a while to figure out a couple of things but I got the first phase working.
So what works:
My [initramfs]/etc/crypttab (really /etc/mkinitcpio-systemd-tool/config/crypttab) looks like this:
root-keys.mount:
sysroot.mount:
Note: because these are the 2 file systems that I need to mount to boot, and they are mounted via explicit mount units, my [initramfs]/etc/fstab (really /etc/mkinitcpio-systemd-tool/config/fstab) is now empty.
And taaadaaa, it boots.
In the next comment, I will write what does not work and a probably unrelated failure that I can see.
I see a problem after I enter the password
switch to console, journalctl -b
?
Nothing in it... weird...
looks like it is still a timing problem: where is the lvm array dependency in the units?
Yeap... I'm looking into how to solve that now...
this lvm dependency issue is already solved here:
Hmmm... several problems with that path...
Is it evil to have a service unit polling ?
BTRFS tools allow me to know if a mountable FS is missing any device... I could use that to avoid hard coding anything by simply polling it until it says it found all devices...
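The polling idea above could be sketched roughly like this. It is only a sketch: the helper name wait_until is made up for illustration, and the device path in the usage comment is an assumption. It relies on "btrfs device ready <device>" returning a non-zero status while member devices are still missing.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND about once per second until it succeeds or roughly
# TIMEOUT_SECONDS have elapsed.
# Returns 0 as soon as COMMAND succeeds, 1 on timeout.
wait_until() {
    _wu_timeout=$1
    shift
    _wu_elapsed=0
    until "$@"; do
        if [ "$_wu_elapsed" -ge "$_wu_timeout" ]; then
            return 1
        fi
        sleep 1
        _wu_elapsed=$((_wu_elapsed + 1))
    done
    return 0
}

# Hypothetical use from the script behind a oneshot service
# (the device path is an assumption, not taken from the real setup):
#   wait_until 120 btrfs device ready /dev/mapper/root00vg-root
```

Nothing is hard coded except the timeout, which keeps a missing disk from hanging the boot forever.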
Is it evil to have a service unit polling ?
no, we already use similar hacks:
but @fredleb Frederic, remember: you have one more lvm client: @Anty0 :-)
Yeah, right.
Now I am lost again. I have modified my sysroot.mount to:
sysroot.mount
And added a service wait-for-sysroot.service that I hope will wait until all crypto stuff is done and then make sure all LVM stuff is activated:
wait-for-sysroot.service
Notes:
The above combination gives:
with all the stuff that is commented out,
it is hard to match the boot graph to the actual config state at the time,
so I guess you will want to experiment more and produce set of pairs config -> graph
:-)
meanwhile, general observations:
probably bug : [Install] RequiredBy=some.mount is ignored
what do you mean, exactly (1, 2a, 2b)?
remember: you can look / navigate inside /boot/initramfs-linux.img
RequiredBy=sysroot.mount <- this is ignored by mkinitcpio
every XxxxBy entry is a reverse dependency, which can be expressed as an override on the opposite side; use it to "fix the bug"
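As a rough illustration of that "override on the opposite side": RequiredBy=sysroot.mount in the service's [Install] section corresponds to Requires=wait-for-sysroot.service on the mount unit itself, which can be written as a drop-in. This is a sketch under assumptions: the helper name write_requires_dropin and the file name requires.conf are made up, and whether the drop-in directory ends up inside the initramfs still has to be checked after the build.

```shell
#!/bin/sh
# write_requires_dropin UNIT_DIR TARGET_UNIT REQUIRED_UNIT
# Writes UNIT_DIR/TARGET_UNIT.d/requires.conf so that TARGET_UNIT requires
# REQUIRED_UNIT. This is the forward form of "RequiredBy=TARGET_UNIT"
# declared in REQUIRED_UNIT's [Install] section.
write_requires_dropin() {
    mkdir -p "$1/$2.d" || return 1
    cat > "$1/$2.d/requires.conf" <<EOF
[Unit]
Requires=$3
EOF
}

# Example (the directory is an assumption):
#   write_requires_dropin /etc/systemd/system sysroot.mount wait-for-sysroot.service
```

Requires= alone does not order the two units; ordering still comes from Before=/After= on either side.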
The above combination gives:
if you literally mean (all the units with commented out stuff -> graph in the previous comment) then the config and the graph do match as expected :-)
Now I am lost again.
please don't be! :-) it is apparent by now that more patience will be required
you may also want to play with a stronger dependency via BindsTo, similar to: initrd-sysroot-mount.service
I can confirm that wait-for-sysroot.service containing an "Install/RequiredBy=sysroot.mount" creates a dir "sysroot.mount.requires" with a proper symlink to the service when enabled, but the mkinitcpio run is NOT copying the dir and link to the initramfs.
I will check whether I can fix it. Or is it intended that way ?
please:
I created wait-for-sysroot.service in [real]/etc/systemd/system with:
[Install]
RequiredBy=sysroot.mount
Enabling it, I see [real]/etc/systemd/system/sysroot.mount.requires as expected, with a symlink to my service in it. After a daemon-reload (not sure I need that) and mkinitcpio -p linux, the directory and the link are missing in [initramfs]/etc/systemd/system (the .wants are there as expected). I checked in [initramfs]/usr/lib too and there is nothing (that would have been weird). No /run in initramfs: I checked from inside the real system...
Or did you mean something else ?
I see. I am making a unit test for that https://github.com/random-archer/mkinitcpio-systemd-tool/tree/master/tool/image
just noticed: ExecStart=pvscan --cache -aay should fail, it needs an absolute path
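A quick way to catch this kind of mistake before rebuilding the image is to scan a unit file for Exec lines whose command is not an absolute path. The sketch below is only a conservative lint: the helper name relative_exec_lines is made up, and as far as I know later systemd releases resolve bare command names against a fixed search path, so a hit is not necessarily fatal on every version.

```shell
#!/bin/sh
# relative_exec_lines UNIT_FILE
# Prints every Exec*= line whose command does not start with "/", after
# skipping optional prefix characters such as "-" or "@".
# Exit status is grep's: 0 if such a line was found, 1 if there is none.
relative_exec_lines() {
    grep -E '^Exec[A-Za-z]*=[-@+!:]*[^-@+!:/[:space:]]' "$1"
}

# Example (the path is an assumption):
#   relative_exec_lines /etc/systemd/system/wait-for-sysroot.service
```

The absolute path to put in the unit can be looked up on the real system with "command -v pvscan".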
@fredleb please review: #43
I have tried with forward and reverse dependency to sysroot.mount and in both cases I get the exact opposite of what I want... I strongly suspect systemd itself now...
Is there any way to force systemd:
Is there any difference between the systemd executable for initramfs and for the real system ?
I'm gonna get a beer now.
I strongly suspect systemd itself now...
that is by design, next stage is melancholy - see systemd as tragedy ;-)
I get the exact opposite of what I want...
so the solution then is simple: configure units to the exact opposite to what you have now :-)
to trace the exact path of the unit it executes
enable logging, collect a humongous log dump, then grep through it
systemd-analyze dot with filtering, though it cannot do much: see analyze.c
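For the "collect and grep" part, one possible shape is below. It is a sketch assuming the boot was done with systemd.log_level=debug on the kernel command line and the journal was saved to a plain text file; the helper name unit_state_changes and the file name boot.log are made up. The "Changed dead -> plugged" wording it matches is the one systemd prints in debug logs for unit state transitions.

```shell
#!/bin/sh
# unit_state_changes LOG_FILE UNIT_NAME
# Prints only the lines of LOG_FILE that record a state change of UNIT_NAME,
# i.e. lines containing "UNIT_NAME: Changed ".
# The match is a fixed string, so dots and backslashes in the unit name
# are taken literally.
unit_state_changes() {
    grep -F -- "$2: Changed " "$1"
}

# Example (file name is an assumption):
#   journalctl -b --no-pager > boot.log
#   unit_state_changes boot.log sysroot.mount
```

Running it once per unit of interest gives a short timeline per unit instead of one humongous dump.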
force systemd to keep the generated units
you mean, like transient mounts? there is a systemd dbus api which can receive triggered signals with unit state snapshot, but I do not know any user-ready tools that expose that
to have some kind of dry run
sure, just run in QEMU or VirtualBox
it is hard to debug initramfs-linux.img in a "simple container" like docker or systemd-nspawn, since there is no cgroups virtualization for udevd and you need real hardware emulation/virtualization
difference between the systemd executable for initramfs and for the real system
no: extract and compare
initramfs-linux.img:/init vs /usr/lib/systemd/systemd (1570680 bytes, Mar 27 14:44)
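The "extract and compare" step can be scripted roughly as below. This is a sketch: the helper name same_binary is made up, and the paths in the usage comment assume the image was unpacked into /tmp/initrd with lsinitcpio -x. If /init in the image is a symlink, it is safer to compare the extracted usr/lib/systemd/systemd directly.

```shell
#!/bin/sh
# same_binary FILE_A FILE_B
# Succeeds (exit 0) only when both paths are regular files and their
# contents are byte-for-byte identical.
same_binary() {
    [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

# Example (the extraction directory is an assumption):
#   same_binary /tmp/initrd/usr/lib/systemd/systemd /usr/lib/systemd/systemd \
#       && echo "same executable"
```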
I'm gonna get a beer now.
me too!
in summary:
basically, the available tools (systemd.log_level, systemd-analyze) are fine when you go from a working state to a broken one, so you can compare before and after
they are useless when you start in a broken state and are swamped with a deluge of mindless details
what worked in the past was to fire up a vm with simple config and then keep building it up until it breaks
remember: there is nothing wrong with systemd, it is just a tragedy
So I tried a couple of things and now that my virtual machine is stuck and won't boot, I think that I will need to have a look at the systemd code. Somehow, I have the feeling that mount units disregard Before and Requires constraints as soon as the device in What becomes available. If that's the case, I guess it's a bug ? I am giving up for today...
I have the feeling that mount units disregard
over the last 5 years that changed a couple of times in systemd, I do not bother "to understand" anymore, I just look for workarounds
take a look at the story around #10, #12, initrd-sysroot-mount.service to see if that rings a bell with what you feel
here is systemd-analyze from a small working test system, based on recommended activation, where both dot --order and dot --require sort of make sense individually, but dot <total> not so much
one reason could be that -.mount and sysroot.mount are, in fact, the same entity, just on different sides of switch-root
initrd-sysroot-mount.service for use by 98dracut-systemd
FYI: now #48 brings basic qemu support for testing
FYI: another thing to try: https://github.com/systemd/systemd/blob/master/NEWS
now my crypttab is no longer copied into the initramfs for some reason... What are the minimal conditions to have it pulled in ?
the minimal conditions
things are relocated now:
Back. So I restarted from pretty much scratch and now I see that enabling initrd-cryptsetup.path is not actually pulling initrd-cryptsetup.service at all...
I manually added a symlink to initrd-cryptsetup.service in etc/systemd/system/sysinit.target.wants/ but that does not change anything.
with latest release?
we have unit-test for that :-)
Yeah latest release. I actually reinstalled all arch packages...
actually pulling initrd-cryptsetup.service
you search like this, right?
mkdir -p /tmp/initrd
cd /tmp/initrd
lsinitcpio -x /boot/initramfs-linux.img
ls -las /tmp/initrd/usr/lib/systemd/system/initrd-cryptsetup.service
cat /tmp/initrd/usr/lib/systemd/system/initrd-cryptsetup.service
ls -las /tmp/initrd/etc/systemd/system/sysinit.target.wants/initrd-cryptsetup.path
symlink to initrd-cryptsetup.service in etc/systemd/system/sysinit.target.wants/
this should not be done, see src/initrd-cryptsetup.path
# note:
# this is a twin unit for initrd-cryptsetup.service
# enable only initrd-cryptsetup.path, initrd-cryptsetup.service is activated on demand
Nearly:
lsinitcpio -l /boot/initramfs-linux.img | grep "crypt"
Could it be caused by the fact that I am in a chrooted env ?
I boot my broken system via the archiso, and then arch-chroot into the decrypted and mounted system...
But that was working 10 days ago...
lsinitcpio -l /boot/initramfs-linux.img | grep "crypt"
do you see problems during mkinitcpio -v ... > build.log ?
Could it be caused by the fact that I am in a chrooted env ?
try with systemd-nspawn instead: wiki/System-Recovery
No problem in the log. I think I was the problem again... I ran mkinitcpio from systemd-nspawn without success and then, out of despair, I reinstalled mkinitcpio-systemd-tool, rebuilt, and here they are. So I guess I had meddled with the /usr/lib/systemd/system stuff. Pffff...... sorry to waste your time.. and mine... I'm gonna take a break now but I hope to solve this LVM stuff this weekend. Thanks for your help !
I think I was the problem again...
seems like low beer level, yep, definitely :-)
Here is a detailed log of the problem...
I have trimmed it and replaced the \x2d with - for readability.
Clearly what I see here is that the 3 identical uuids are confusing systemd... So the problem is not with your tool but with systemd and btrfs. I have no idea how to solve that. I would imagine that systemd would be the one responsible for handling that and not udev. Where would you recommend I go from here ? Should I open a ticket with systemd or ask a question on superuser.com ?
it says Apr 02 - are you sure you are looking at the current problem?
Yeah. It's a virtual machine, I did not bother with loads of config because it is only there to reproduce this problem on something faster than the real machine.
assuming you had it working in the past, this symptom could be:
perhaps wrong level uuid in crypttab?
/usr/bin/mount /dev/disk/by-uuid/52a7433c-3914-422c-8d71-c016badb8c81 /sysroot -t btrfs -o noatime,x-systemd.device-timeout=9999h
this makes it look like a "higher level block device uuid"
BTRFS: device fsid 52a7433c-3914-422c-8d71-c016badb8c81
hmmm...
It never really worked properly.
I mean it boots, but there are errors that don't crash the booting on the virtual machine but do on the real one. The real one has slow mechanical HDDs whereas the virtual ones are all on the same ssd of my workstation.
What works:
But the first time the shared uuid triggers the mount, at least one of the 3 partitions is missing...
UUID is correct.
the uuid is definitely "higher level", and udev/systemd definitely must handle that, only how? :-)
these guys look healthy:
Apr 02 11:24:10 chaos systemd[1]: dev-mapper-root00vg-root.device: Changed dead -> plugged
Apr 02 11:24:10 chaos systemd[1]: dev-mapper-root01vg-root.device: Changed dead -> plugged
Apr 02 11:24:10 chaos systemd[1]: dev-mapper-root02vg-root.device: Changed dead -> plugged
perhaps go back to explicit unit dependency [all-3-of: dev-mapper-rootXX] -> sysroot ?
basically, it looks like one more level of indirection is missing in fstab
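The "[all-3-of: dev-mapper-rootXX] -> sysroot" idea could be written as a drop-in for sysroot.mount that requires, and orders itself after, each device unit. The sketch below only generates the drop-in text; the helper name device_deps_dropin and the output file name are made up. The unit names in the log were trimmed for readability, so the real names contain \x2d escapes; "systemd-escape -p --suffix=device /dev/mapper/root00vg-root" prints the exact name to use.

```shell
#!/bin/sh
# device_deps_dropin UNIT...
# Prints a [Unit] drop-in body that makes the owning unit require every
# unit given as an argument and start only after each of them.
device_deps_dropin() {
    printf '[Unit]\n'
    for _ddd_unit in "$@"; do
        printf 'Requires=%s\nAfter=%s\n' "$_ddd_unit" "$_ddd_unit"
    done
}

# Example (directory and file name are assumptions; use the escaped unit
# names reported by systemd-escape instead of the readable ones):
#   device_deps_dropin dev-mapper-root00vg-root.device \
#                      dev-mapper-root01vg-root.device \
#                      dev-mapper-root02vg-root.device \
#       > /etc/systemd/system/sysroot.mount.d/devices.conf
```

With all three devices listed, the mount should no longer be triggered by the first partition that shows the shared uuid.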
I have the following setup:
Before this project became FHS compliant, I got that setup working with an ugly hack (a call to mount) in initrd-shell.sh.
That was evil.
Now I am trying to get it to work with systemd.
My idea is that I can achieve the same thing using a systemd service that looks like this:
The problem, however, is that this fails to build with the following error:
==> ERROR: unit not found: dev-mapper-keys.device
My question is:
Thanks