xcat2 / xcat-core

Code repo for xCAT core packages
Eclipse Public License 1.0

Inconsistent disk selection on servers with multiple disks #5196

Closed whowutwut closed 6 years ago

whowutwut commented 6 years ago

@cxhong and I are loading software on Supermicro Big Data servers (in this case p9 Boston). A generic rinstall of the node with a RHEL 7.5 GA image seems to pick a different disk to install on from one server to the next.

I noticed this when standing in front of the server....

The first server picks the 5th disk ... because I see the active light flashing ...

boston32 (more than 2... maybe 6 physical disks)

image

The second server picks the 1st disk... because I see the active light flashing ....

boston30 (2 physical disks)

image

@xuweibj Do you know what is going on here?

zet809 commented 6 years ago

Can we log on these 2 hosts to collect some information about the hard disks?

  1. If a WWN is available for the hard disks, we use the WWN to sort the disks.
  2. If no WWN is available, we use the PCIe path.
whowutwut commented 6 years ago

Yes, boston32 is the first one and boston30 is the 2nd one. Under stratton

xuweibj commented 6 years ago

Hi @whowutwut, for boston32, if no OS has been installed on any disk, we choose the disk with the smallest WWN value. (All the disks use the same driver.)

sda
0x5000c500947f6dd7
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:0/0:2:0:0/block/sda
sdb
0x5000c500947f7323
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:1/0:2:1:0/block/sdb
sdc
0x5000c500947f6dfb
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:2/0:2:2:0/block/sdc
sdd
0x5000c500947f787b
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:3/0:2:3:0/block/sdd

sde    --------------------*
0x5000c500947f562b
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:4/0:2:4:0/block/sde

sdf
0x5000c500947f7a17
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:5/0:2:5:0/block/sdf
sdg
0x5000c500947f5b5b
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:6/0:2:6:0/block/sdg
sdh
0x5000c500947f5717
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:7/0:2:7:0/block/sdh
sdi
0x5000c500947f79cb
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:8/0:2:8:0/block/sdi
sdj
0x5000c500947f6edb
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:9/0:2:9:0/block/sdj

From the info above, we will choose sde as the install disk.
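As a quick check of that conclusion, here is a minimal Python sketch that compares the WWN values listed above as integers. It is not the getinstdisk code itself (which is not shown in this thread); the function name `pick_smallest_wwn` is made up for illustration.

```python
# WWN values copied from the boston32 listing above.
BOSTON32_WWNS = {
    "sda": "0x5000c500947f6dd7",
    "sdb": "0x5000c500947f7323",
    "sdc": "0x5000c500947f6dfb",
    "sdd": "0x5000c500947f787b",
    "sde": "0x5000c500947f562b",
    "sdf": "0x5000c500947f7a17",
    "sdg": "0x5000c500947f5b5b",
    "sdh": "0x5000c500947f5717",
    "sdi": "0x5000c500947f79cb",
    "sdj": "0x5000c500947f6edb",
}


def pick_smallest_wwn(wwns):
    """Return the device name whose WWN is numerically smallest.

    Hypothetical helper for illustration only, not part of xCAT.
    """
    return min(wwns, key=lambda dev: int(wwns[dev], 16))


print(pick_smallest_wwn(BOSTON32_WWNS))  # prints "sde"
```

The WWN is compared as a number, so the result depends only on the identifier assigned to each drive, not on the slot or connector the drive sits in.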

But boston30 has just 2 disks. Is this the 2nd one?

[root@boston30 ~]# ls /dev/sd
sda   sda1  sda2  sda3  sda4  sda5  sdb
whowutwut commented 6 years ago

@xuweibj I did not investigate this much yesterday and was too quick to write up the issue, but I think I counted the drives wrong. I was counting top -> bottom, left -> right, but I think it goes bottom -> top, left -> right.

So the 5th drive is chosen on the 1st server, and the 1st drive is chosen on the 2nd server. The question here is: why isn't the 1st drive chosen on the 1st server?

We should have 10 drives in the 1st server (but I have to check physically..)

Looking at this documentation under STAT drives, https://www.ibm.com/support/knowledgecenter/POWER9/p9eip/p9eip22p_drive_install_details.htm could this be the reason, that the mini-SAS drive connection we chose is the B one.. first drive in the 2nd connector?

image

xuweibj commented 6 years ago

@whowutwut

The logic in getinstdisk is to choose the disk with the smallest WWN.

For boston30, sda has the smallest WWN, so sda is chosen.

sda
0x5000c500947f6e47
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:0/0:2:0:0/block/sda
sdb
0x5000c500947f736f
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:1/0:2:1:0/block/sdb

For boston32, sde has the smallest WWN, so it is chosen, and sde is the 5th disk.

zet809 commented 6 years ago

So, let's close this issue?

whowutwut commented 5 years ago

@lychen214 Eric, do you have any idea about this behavior above? When selecting a physical disk on a Boston server, we get inconsistent results using the WWN value...

lychen214 commented 5 years ago

@whowutwut Per the comment quoted below, the tool picks the disk with the smallest WWN, and it looks like it worked as expected on boston30 and boston32. But I have no idea how the WWN value is determined from the OS perspective. Did you see the tool use a disk that does not have the smallest WWN for the installation? Please help clarify. Thanks.

The logic in getinstdisk is to choose the disk with the smallest WWN.

For boston30, sda has the smallest WWN, so sda is chosen.

sda
0x5000c500947f6e47
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:0/0:2:0:0/block/sda
sdb
0x5000c500947f736f
/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:1/0:2:1:0/block/sdb

For boston32, sde has the smallest WWN, so it is chosen, and sde is the 5th disk.

whowutwut commented 5 years ago

If we are selecting correctly based on the smallest WWN, then on one boston node the smallest WWN resulted in disk 0 of mini-SAS connector 1, while on another boston node the smallest WWN resulted in disk 0 of mini-SAS connector 2.

That's what I don't understand: why one boston node installed onto the bottom left drive (green circle), while another installed onto the 2nd drive in the second column of the server (red circle).

image

Tag @xuweibj too

whowutwut commented 5 years ago

@xuweibj above you said:

For boston32, if no OS has been installed on any disk, we choose the disk with the smallest WWN value. (All the disks use the same driver.)

What happens if an OS has already been installed on any of the disks?

We are seeing a similar issue with the Habanero boxes, which have 12 drives. An OS was previously installed on sda by an unknown provisioning tool, but xCAT installs the OS onto sdb. This is causing problems because there are now 2 bootable OSes and the node boots the wrong one.

xuweibj commented 5 years ago

@whowutwut If only one disk has an OS installed on it, we choose that disk. If more than one does, we choose the one with the smallest WWN value.
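The rule described in this thread can be sketched in Python as follows. This is only an illustration, not the actual getinstdisk script: the function name and the dict layout are made up, the `has_os` flags in the example are hypothetical, and it assumes that "more than one" means the smallest WWN among the disks that already hold an OS.

```python
def choose_install_disk(disks):
    """Pick an install disk following the rule described in this thread.

    disks: list of dicts with keys 'name', 'wwn' (hex string), 'has_os' (bool).
    Hypothetical helper for illustration only, not part of xCAT.
    """
    # Prefer disks that already hold an OS; otherwise consider every disk.
    with_os = [d for d in disks if d["has_os"]]
    candidates = with_os if with_os else disks
    # Among the candidates, take the numerically smallest WWN.
    # (Assumption: with several OS disks, only those disks are compared.)
    return min(candidates, key=lambda d: int(d["wwn"], 16))["name"]


# boston30 WWNs from earlier in the thread; the has_os flags are made up.
disks = [
    {"name": "sda", "wwn": "0x5000c500947f6e47", "has_os": False},
    {"name": "sdb", "wwn": "0x5000c500947f736f", "has_os": True},
]
print(choose_install_disk(disks))  # prints "sdb"
```

With no OS detected on any disk, the same function falls back to the smallest WWN overall, which would be sda in this example.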

whowutwut commented 5 years ago

@xuweibj Even if switching RHEL & Ubuntu?

xuweibj commented 5 years ago

Yes. Whatever the current OS is, we just check whether a disk has an OS installed on it.

lychen214 commented 5 years ago

@whowutwut Would it still start from disk 0 of mini-SAS connector 2 if you use clean hard drives?

zet809 commented 5 years ago

A possible reason for not choosing the disk that has an OS installed is that its filesystem cannot be mounted in the initrd. In getinstdisk, we try to mount each partition and then check whether there is a vmlinu* file in it. If there is, the hard disk that holds the partition is treated as the install disk.
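To make that check concrete, here is a rough Python sketch of the idea. It is not the getinstdisk code: the function name is made up, a failed mount is represented by `None`, and looking under `boot/` in addition to the partition root is an assumption.

```python
import glob
import os


def partition_has_kernel(mount_point):
    """Return True if a vmlinu* file is found on a mounted partition.

    mount_point is the directory where the partition was mounted, or None
    if the mount failed (for example, a filesystem the initrd cannot mount).
    Hypothetical helper for illustration only, not part of xCAT.
    """
    if mount_point is None:
        # The mount failed, so the kernel check never runs and the disk
        # looks as if it had no OS installed.
        return False
    patterns = [
        os.path.join(mount_point, "vmlinu*"),          # e.g. a separate /boot partition
        os.path.join(mount_point, "boot", "vmlinu*"),  # assumption: /boot on the root partition
    ]
    return any(glob.glob(pattern) for pattern in patterns)
```

Under this sketch, a disk whose partitions cannot be mounted never counts as an OS disk, so the selection falls through to the WWN comparison.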

robin2008 commented 5 years ago

Is there any technical basis for choosing the disk with the smallest WWN value as the 1st disk? I checked many Boston/Briggs servers in our environment, and sda is not always the disk with the smallest WWN.

For example

lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdd               8:48   0   1.8T  0 disk
sdb               8:16   0   1.8T  0 disk
sde               8:64   0   118G  0 disk
|-sde2            8:66   0   512M  0 part /boot
|-sde5            8:69   0 113.5G  0 part
| `-system-root 253:0    0 113.5G  0 lvm  /
|-sde3            8:67   0     4G  0 part [SWAP]
|-sde1            8:65   0     8M  0 part
`-sde4            8:68   0     1K  0 part
sdc               8:32   0   1.8T  0 disk
sda               8:0    0   1.8T  0 disk
`-sda1            8:1    0   1.8T  0 part /data