xcat2 / xcat-core

Code repo for xCAT core packages
Eclipse Public License 1.0

[FVT]rh7.2 installation on X86_64 machine with 4 disks failed with "storage configuration failed: failed to find a suitable stage1 device" #821

Closed caomengmeng closed 8 years ago

caomengmeng commented 8 years ago

ENV: MN:

[root@c910f04x21 ~]# cat/etc/*release
-bash: cat/etc/*release: No such file or directory
[root@c910f04x21 ~]# cat /etc/*release
Red Hat Enterprise Linux Server release 6.7 (Santiago)
Red Hat Enterprise Linux Server release 6.7 (Santiago)
[root@c910f04x21 ~]# rpm -qa | grep -i xCAT
xCAT-genesis-scripts-x86_64-2.12-snap201603161531.noarch
syslinux-xcat-3.86-2.noarch
xCAT-genesis-base-x86_64-2.9-snap201504212134.noarch
perl-xCAT-2.12-snap201603161530.noarch
xCAT-server-2.12-snap201603161531.noarch
xCAT-buildkit-2.12-snap201603161531.noarch
elilo-xcat-3.14-4.noarch
conserver-xcat-8.1.16-10.x86_64
grub2-xcat-2.02-0.16.el7.snap201506090204.noarch
xCAT-client-2.12-snap201603161530.noarch
xCAT-2.12-snap201603161531.x86_64
ipmitool-xcat-1.8.11-3.x86_64

Description: Installing rh7.2 on the CN fails, although sles12.1 installs successfully. Tracking the installation with rcons CN -f shows that it stops at the screen below:

Generating updated storage configuration
storage configuration failed: failed to find a suitable stage1 device
================================================================================
================================================================================
Installation

 1) [x] Language settings                 2) [x] Timezone settings
        (English (United States))                (US/Eastern
 3) [x] Installation source               4) [x] Software selection
        (http://c910f04x21:80/install/r          (Custom software selected)
        hels7.2/x86_64)                   6) [x] Kdump
 5) [!] Installation Destination                 (Kdump is enabled)
        (Error checking storage configu   8) [ ] User creation
        ration)                                  (No user will be created)
 7) [x] Network configuration
        (Wired (eno1) connected)
  Please make your choice from above ['q' to quit | 'b' to begin installation |
  'r' to refresh]:
immarvin commented 8 years ago

please attach the error message in the syslog

caomengmeng commented 8 years ago

Hope these are helpful:
https://access.redhat.com/solutions/1369253
https://bugzilla.redhat.com/show_bug.cgi?id=1168118

immarvin commented 8 years ago

LTC ticket: https://bugzilla.linux.ibm.com/show_bug.cgi?id=139600

daniceexi commented 8 years ago

I will try to recreate the issue first and then look into a possible fix.

immarvin commented 8 years ago

hi, there is still no update from LTC.

Can someone in FVT help recreate the problem so that we can try the possible workaround? @tingtli

caomengmeng commented 8 years ago

Yes, I will provide a test environment to reproduce this issue.

caomengmeng commented 8 years ago

Passed.

[root@c910f04x21 ~]# lsdef c910f04x20 | grep status
    status=booted
    statustime=04-29-2016 02:56:43
    updatestatus=failed
    updatestatustime=03-25-2016 04:09:27
[root@c910f04x21 ~]# lsxcatd -a
Version 2.12 (git commit b908019a59bcb53fbb758ae99e1f0cb4a13d648d, built Thu Apr 21 09:30:59 EDT 2016)
This is a Management Node
dbengine=SQLite
daniceexi commented 8 years ago

@caomengmeng What did we do to fix this issue? Or can we simply no longer recreate it?

immarvin commented 8 years ago

Hi, this problem is also observed with rhels7.2 installation on Power8LE NV, and the "bootloader --boot-drive=sdx" setting suggested by Red Hat works.

daniceexi commented 8 years ago

@immarvin Please add this as a known issue in 2.12 release notes and move it to 2.12.1.

whowutwut commented 8 years ago

Can we get clearer steps for the workaround provided by @immarvin, so that customers will know what to do if they hit this issue? Should the line be added to the .tmpl file? Is there a patch we should provide? Where in the kickstart file should it be added? How would the user know which disk to select if they have two disks? Sometimes it's /dev/sda and /dev/sdb, but on our S822LC boxes it seems to be /dev/sdk and /dev/sdj.

I don't see an entry in the release notes yet. Please create a placeholder or create the entry.
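As a rough illustration of the disk-naming question above, the sketch below lists the sd* names a node actually exposes, so that names such as sdj or sdk are visible before a boot drive is chosen. The function name list_candidate_disks is hypothetical and not part of xCAT. The sketch only reads the sysfs "removable" attribute and does not decide which disk is the right boot drive.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): print the names of
# non-removable sd* disks, one per line.
# The optional argument is the sysfs block directory (default /sys/block).
list_candidate_disks() {
    sysblock="${1:-/sys/block}"
    for dev in "$sysblock"/sd*; do
        [ -e "$dev" ] || continue          # the glob matched nothing
        # sysfs reports "removable" as 0 or 1 for each block device
        if [ "$(cat "$dev/removable" 2>/dev/null)" = "0" ]; then
            basename "$dev"
        fi
    done
}
```

Running list_candidate_disks with no argument on the node prints whatever sd* names the kernel assigned. Choosing among them is still up to the %pre logic or the administrator.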

immarvin commented 8 years ago

Hi @whowutwut, we might need to provide a patch for the kickstart and pre script for RHEL 7.2. Red Hat gave us some suggestions:

hi,
This problem is also observed in rhels7.2 installation on Power8LE, and "bootloader --boot-drive=sdb" works. thanks

There is still a question for us: "bootloader --boot-drive=sdb" is specified in the kickstart file before installation. Is it possible to specify "bootloader --boot-drive=xxx" in the %pre section during installation? In our project, the installation disk is determined by a script in the %pre section.

------- Comment From dlehman@redhat.com 2016-05-11 13:16:10 EDT-------
Just write out the bootloader command to a file from %pre, eg:

echo "bootloader --boot-drive=sdb" > /tmp/bootloader-cmd

And then include it in your kickstart like this:

%include /tmp/bootloader-cmd

That should work for you.
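A minimal sketch of how that suggestion might look inside a %pre script, written as a plain shell function. The function name write_bootloader_cmd, the input check, and the /dev/ prefix handling are assumptions for illustration and are not xCAT's actual pre script. Only the "bootloader --boot-drive=" line and the /tmp/bootloader-cmd path come from the suggestion above.

```shell
#!/bin/sh
# Hypothetical helper for a kickstart %pre script (not xCAT's actual code).
# Writes the bootloader command for the chosen disk to a file that the
# kickstart can then pull in with:  %include /tmp/bootloader-cmd
write_bootloader_cmd() {
    disk="$1"                          # e.g. sdb, chosen by earlier %pre logic
    out="${2:-/tmp/bootloader-cmd}"    # default path from the suggestion above

    # Refuse an empty name so the file never holds "--boot-drive=" alone.
    if [ -z "$disk" ]; then
        echo "write_bootloader_cmd: no disk given" >&2
        return 1
    fi

    # Accept either "sdb" or "/dev/sdb" by stripping a leading /dev/.
    disk="${disk#/dev/}"

    echo "bootloader --boot-drive=$disk" > "$out"
}
```

In a %pre section this would be called once, after the disk has been selected, with the selected disk name as the first argument. The kickstart body then needs only the %include line shown above.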

The difficulty is that the issue cannot be recreated now, and I need to verify the fix in an environment that shows the problem.

daniceexi commented 8 years ago

@immarvin we can create a pull request first to see whether it's safe for 2.12

immarvin commented 8 years ago

The PR https://github.com/xcat2/xcat-core/pull/1118