Closed kkoehle closed 4 years ago
@kkoehle,
I tried rear from the git repo on a sles12 without any issue (on KVM not PowerVM)
I notice that you don't have POST_RECOVERY_SCRIPT in your ReaR local.conf file.
Here is the part I usually add to my SLES12 servers.
## SLES12
#BACKUP_OPTIONS="nfsvers=4,nolock"
REQUIRED_PROGS=( "${REQUIRED_PROGS[@]}" snapper chattr lsattr )
COPY_AS_IS=( "${COPY_AS_IS[@]}" /usr/lib/snapper/installation-helper /etc/snapper/config-templates/default )
for subvol in $(findmnt -n -r -t btrfs | cut -d ' ' -f 1 | grep -v '^/$' | egrep -v 'snapshots|crash') ; do
BACKUP_PROG_INCLUDE=( "${BACKUP_PROG_INCLUDE[@]}" "$subvol" )
done
POST_RECOVERY_SCRIPT=( 'if snapper --no-dbus -r $TARGET_FS_ROOT get-config | grep -q "^QGROUP.*[0-9]/[0-9]" ; then snapper --no-dbus -r $TARGET_FS_ROOT set-config QGROUP= ; snapper --no-dbus -r $TARGET_FS_ROOT setup-quota && echo snapper setup-quota done || echo snapper setup-quota failed ; else echo snapper setup-quota not used ; fi' )
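To preview which btrfs mount points the loop above would add to BACKUP_PROG_INCLUDE, the same filter can be run on its own. This is only a minimal sketch: `filter_subvols` is a hypothetical helper name, and `grep -E` is used in place of `egrep`, which newer GNU grep versions report as obsolescent.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ReaR): reads `findmnt -n -r -t btrfs`
# style output on stdin, where the first space-separated column is the
# mount point, and prints the mount points the config loop would keep.
filter_subvols() {
    cut -d ' ' -f 1 | grep -v '^/$' | grep -Ev 'snapshots|crash'
}

# Usage on a live system:
#   findmnt -n -r -t btrfs | filter_subvols
```

The root mount point, snapshot directories and crash dump directories are dropped; everything else is what the backup would additionally include.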
@schabrolles
I added the POST_RECOVERY_SCRIPT and nothing changed. I don't imagine KVM vs PowerVM determines whether a disk shows as bootable. Does your SLES12 install use LVM and btrfs for the boot disk? This is what SMS shows:
PowerPC Firmware
Version FW860.11 ### (SV860_063)
SMS (c) Copyright IBM Corp. 2000,2016 All rights reserved.
-------------------------------------------------------------------------------
Current Boot Sequence
1. Device is not bootable or removed.
2. None
3. None
4. None
5. None
What can I do to make PowerVM think it is bootable? Why is there no error message saying the bootloader didn't work? Is there anything I should check for?
@kkoehle,
Agreed, I don't think my KVM test explains the difference. I think there is something wrong with grub2, but I never had this in my ReaR tests on Power.
You can try to run rear -d recover to get a debug log and check whether something goes wrong during the grub installation:
usr/share/rear/finalize/Linux-ppc64le/620_install_grub2.sh
I won't be available from today to end of next week.
@jsmeix do you think it could be related to #2093?
@kkoehle
use rear -d -D recover (i.e. with -D) to get a full debug log,
cf. "Debugging issues with Relax-and-Recover" at
https://en.opensuse.org/SDB:Disaster_Recovery
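Once the debug log exists, it can be narrowed down to the bootloader step by searching for the script name mentioned above. A minimal sketch: `show_grub_install_log` is a hypothetical helper name, and the log path in the usage comment is the usual ReaR location, which is an assumption to adjust if your log is elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ReaR): print every log line that mentions
# the grub2 install script, plus a number of following lines for context.
#   $1 = path to the ReaR log file
#   $2 = number of context lines after each match (default 40)
show_grub_install_log() {
    local logfile="$1"
    local context="${2:-40}"
    # -n prefixes line numbers, -A prints lines after each match
    grep -n -A "$context" '620_install_grub2\.sh' "$logfile"
}

# Usage inside the recovery system (path is an assumption):
#   show_grub_install_log "/var/log/rear/rear-$HOSTNAME.log"
```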
@schabrolles
https://github.com/rear/rear/issues/2093 looks totally unrelated because there 1:1 restore for all fs and disks works fine.
But what happens in this issue here is not a recovery on 100% compatible hardware because in https://github.com/rear/rear/files/2993248/rear.txt there is (excerpts):
Switching to manual disk layout configuration
Original disk /dev/mapper/36005076802818158a400000000000495 does not exist (with same size) in the target system
...
User confirmed disk layout file
Partition primary on /dev/mapper/36005076802818158a4000000000004c0: size reduced to fit on disk.
Doing SLES12-SP1 (and later) btrfs subvolumes setup because the default subvolume path contains '@/.snapshots/'
No code has been generated to recreate pv:/dev/mapper/36005076802818158a4000000000004a2 (lvmdev).
To recreate it manually add code to /var/lib/rear/layout/diskrestore.sh or abort.
Manually add code that recreates pv:/dev/mapper/36005076802818158a4000000000004a2 (lvmdev)
1) View /var/lib/rear/layout/diskrestore.sh
2) Edit /var/lib/rear/layout/diskrestore.sh
3) Go to Relax-and-Recover shell
4) Continue 'rear recover'
5) Abort 'rear recover'
(default '4' timeout 300 seconds)
1
No code has been generated to recreate pv:/dev/mapper/36005076802818158a40000000000049f (lvmdev).
To recreate it manually add code to /var/lib/rear/layout/diskrestore.sh or abort.
Manually add code that recreates pv:/dev/mapper/36005076802818158a40000000000049f (lvmdev)
1) View /var/lib/rear/layout/diskrestore.sh
2) Edit /var/lib/rear/layout/diskrestore.sh
3) Go to Relax-and-Recover shell
4) Continue 'rear recover'
5) Abort 'rear recover'
(default '4' timeout 300 seconds)
4
No code has been generated to recreate pv:/dev/mapper/36005076802818158a4000000000004a3 (lvmdev).
To recreate it manually add code to /var/lib/rear/layout/diskrestore.sh or abort.
Manually add code that recreates pv:/dev/mapper/36005076802818158a4000000000004a3 (lvmdev)
1) View /var/lib/rear/layout/diskrestore.sh
2) Edit /var/lib/rear/layout/diskrestore.sh
3) Go to Relax-and-Recover shell
4) Continue 'rear recover'
5) Abort 'rear recover'
(default '4' timeout 300 seconds)
4
Confirm or edit the disk recreation script
1) Confirm disk recreation script and continue 'rear recover'
2) Edit disk recreation script (/var/lib/rear/layout/diskrestore.sh)
3) View disk recreation script (/var/lib/rear/layout/diskrestore.sh)
4) View original disk space usage (/var/lib/rear/layout/config/df.txt)
5) Use Relax-and-Recover shell and return back to here
6) Abort 'rear recover'
(default '1' timeout 300 seconds)
2
***** At this point I change the line:
lvm lvcreate -L 64424509440b -n root system <<<y
****** to:
lvm lvcreate -L 55g -n root system <<<y
***** I do this because I am restoring a disk that was 100 GB to a 60 GB disk, and the 64424509440b is too big (especially since swap still needs 2 GB)
So the recovery happens on a smaller disk, and that is a migration that will for sure not just work.
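The size in the original lvcreate line can be checked with a little arithmetic. A minimal sketch: only 64424509440 comes from the log above; reading the "60 GB" target disk as 60 GiB and swap as 2 GiB are assumptions, and `bytes_to_gib` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a byte count to whole GiB (integer division).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

root_lv_bytes=64424509440   # from the original lvcreate line in the log
disk_gib=60                 # assumption: the "60 GB" target disk is 60 GiB
swap_gib=2                  # assumption: swap of about 2 GiB, per the comment

root_gib=$(bytes_to_gib "$root_lv_bytes")
echo "root LV requested:        ${root_gib} GiB"
echo "upper bound without swap: $(( disk_gib - swap_gib )) GiB"
```

Under these assumptions the requested root LV is as large as the whole target disk, so it cannot fit next to the PReP partition and swap. That is consistent with shrinking it to 55g in the edited script.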
I was wondering why that
Partition primary on /dev/mapper/36005076802818158a4000000000004c0: size reduced to fit on disk.
happened because that is a far too late "desperate" message from the create_partitions() function when it already works with wrong data in disklayout.conf.
I would have expected to see messages from layout/prepare/default/420_autoresize_last_partitions.sh because of
Device mapper!36005076802818158a400000000000495 does not exist (manual configuration needed)
Switching to manual disk layout configuration
we are in MIGRATION_MODE where in particular scripts like
layout/prepare/default/420_autoresize_last_partitions.sh
should run (regardless that this script alone does not help here
because in case of LVM manual adaptations are needed anyway),
cf. the "Resizing partitions in MIGRATION_MODE during 'rear recover'"
section in default.conf:
https://github.com/rear/rear/blob/master/usr/share/rear/conf/default.conf#L370
Here is the log file: I couldn't see anything that stood out. rear-hana-n3.log
I think I know why layout/prepare/default/420_autoresize_last_partitions.sh doesn't actually do something in this particular case: https://github.com/rear/rear/files/2998736/rear-hana-n3.log contains
+ source /usr/share/rear/layout/prepare/default/420_autoresize_last_partitions.sh
...
++ cp /var/lib/rear/layout/disklayout.conf /var/lib/rear/layout/disklayout.conf.resized_last_partition
...
+++ grep '^disk ' /var/lib/rear/layout/disklayout.conf
++ mv /var/lib/rear/layout/disklayout.conf.resized_last_partition /var/lib/rear/layout/disklayout.conf
i.e. grep '^disk ' /var/lib/rear/layout/disklayout.conf does not find anything.
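One way to confirm this in the recovery system is to count which entry types disklayout.conf actually contains. A minimal sketch: `layout_entry_types` is a hypothetical helper name, and the idea that the SAN disk is recorded under a keyword other than `disk` (for example as a multipath device) is an assumption to verify against the real file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ReaR): count the entries in a
# disklayout.conf style file by their first word (the entry type),
# skipping comment lines and blank lines.
#   $1 = path to the layout file
layout_entry_types() {
    awk '!/^#/ && NF { count[$1]++ }
         END { for (t in count) print count[t], t }' "$1" | sort -k2
}

# Usage in the recovery system:
#   layout_entry_types /var/lib/rear/layout/disklayout.conf
```

If no `disk` type shows up in the output, a script that only looks at `^disk ` lines has nothing to resize.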
@kkoehle we also need at least your var/lib/rear/layout/disklayout.conf plus some more additional files in the /var/lib/rear/ directory and in its sub-directories in the ReaR recovery system, cf. "Debugging issues with Relax-and-Recover" at https://en.opensuse.org/SDB:Disaster_Recovery
Be careful when attaching files here to not make possibly confidential internal information public here. I.e. you may have to obfuscate some values in those files before you upload those files here. On the other hand you should not obfuscate too much so that it would become impossible for us to see what actually goes on on your particular system.
Additionally describe as exactly as you can how your replacement system where you run "rear recover" differs from your original system where you had run "rear mkbackup/mkrescue".
@schabrolles @jsmeix, I found the problem: the boot flag never gets set on the correct partition:
hana-n4:~ # parted /dev/mapper/36005076802818158a4000000000004c0
(parted) print
Number Start End Size Type File system Flags
1 1049kB 8389kB 7340kB primary prep, type=41
2 8389kB 64.4GB 64.4GB primary lvm, type=8e
(parted) set 1 boot on
1 1049kB 8389kB 7340kB primary boot, prep, type=41
2 8389kB 64.4GB 64.4GB primary lvm, type=8e
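The interactive check above can also be scripted. A minimal sketch that reads `parted ... print` output in the layout shown above and reports whether the PReP partition carries the boot flag; `prep_has_boot_flag` is a hypothetical helper name, and partition number 1 in the usage comment is specific to this system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `parted DEVICE print` output on stdin and
# succeeds (exit 0) if a partition line lists both "prep" and "boot".
prep_has_boot_flag() {
    awk '$1 ~ /^[0-9]+$/ && /prep/ && /boot/ { found = 1 }
         END { exit !found }'
}

# Usage (this changes the partition table flags, so check the device first):
#   disk=/dev/mapper/36005076802818158a4000000000004c0
#   parted -s "$disk" print | prep_has_boot_flag || parted -s "$disk" set 1 boot on
```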
According to https://github.com/rear/rear/issues/2094#issuecomment-487184999 this is a bootloader issue in MIGRATION_MODE so I think the general issue is https://github.com/rear/rear/issues/1437
Relax-and-Recover (ReaR) Issue Template
Fill in the following items before submitting a new issue (quick response is not guaranteed with free support):
OS version: SUSE Linux Enterprise Server 12 SP3
ReaR configuration files:
Hardware: IBM PowerVM LPAR s824
System architecture: PPC64LE
Firmware: BIOS and GRUB2
Storage: SAN NPIV IBM v7000
Description of the issue: Rear recover finishes without error, but LPAR will not boot.
Workaround, if any: none.
Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
rear.txt