SUSE / DeepSea

A collection of Salt files for deploying, managing and automating Ceph.
GNU General Public License v3.0

Proposal generator does not detect SSD behind a RAID controller #819

Open Martin-Weiss opened 6 years ago

Martin-Weiss commented 6 years ago

We found that some server hardware ships only with RAID controllers that do NOT support JBOD mode (e.g. HP DL380). When these servers are used as OSD nodes, the drives behind these RAID controllers need to be configured as RAID 0 logical drives so that they can be used for OSD data.

Unfortunately, the RAID controller then reports all of them as "rotational 1" to the operating system. As a result, the proposal runner cannot detect that some of the RAID 0 logical drives are SSDs, which leads to a setup where the rocksDB/WAL are not offloaded to the SSDs.
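For reference, the flag in question is exposed by the kernel in sysfs as /sys/block/<device>/queue/rotational. The following is a minimal Python sketch of reading it, not the code DeepSea uses; the function name and the sys_block parameter are hypothetical and exist only to keep the example testable.

```python
from pathlib import Path


def is_rotational(device: str, sys_block: str = "/sys/block") -> bool:
    """Return True if the kernel flags the block device as rotational.

    Behind a RAID controller without JBOD mode this can be True even
    for an SSD, which is the problem described in this issue.
    """
    flag = Path(sys_block) / device / "queue" / "rotational"
    return flag.read_text().strip() == "1"
```

Calling is_rotational("sda") on an affected node would be expected to return True for every RAID 0 logical drive, regardless of the physical medium behind it.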

I am not sure how we could fix or enhance this in the proposal runner, but it would be great if we could find a solution, as manually manipulating the proposals for the disks is a big challenge.

Maybe we can add some parameters that allow specifying which LUN number behind a specific HBA should be used for journal / rocksDB / WAL?

jschmid1 commented 6 years ago

@Martin-Weiss We indeed do check for raidctrls one level deeper in cephdisks.py

salt -C 'I@roles:storage' cephdisks.list

This output is used in the proposal runner. Internally, cephdisks uses hwinfo to determine the HDD's properties, but as you correctly noted, the kernel lies about its rotational status when the disk is shadowed by a RAID controller. You might be using a RAID controller we are not checking for. Please confirm that with:

lspci -vv | grep -i raid

This will tell you what kind of controller you are using.

smartctl -i /dev/<device> -d <raid_ctrl_name>,<bus_id> gives you more insight about the disk. cephdisks.list tries to parse that. Due to the lack of hardware we weren't able to test each and every setup possible and it's quite likely that we missed something.
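As an illustration of the kind of parsing involved, here is a minimal Python sketch that derives the rotational status from the "Rotation Rate" line of smartctl -i output. It is not the actual cephdisks.py implementation; the function name is hypothetical, and the sketch assumes the field reads either "Solid State Device" or a value in rpm.

```python
import re
from typing import Optional

# Matches the "Rotation Rate:" line anywhere in the smartctl -i output.
_ROTATION_RE = re.compile(r"^Rotation Rate:\s*(.+)$", re.MULTILINE)


def rotational_from_smartctl(output: str) -> Optional[bool]:
    """Return False for an SSD, True for a spinning disk, None if unknown."""
    match = _ROTATION_RE.search(output)
    if match is None:
        return None  # field missing, e.g. older drives or unsupported controller
    value = match.group(1).strip()
    if value == "Solid State Device":
        return False
    if re.fullmatch(r"\d+ rpm", value):
        return True
    return None  # unrecognised value; leave the decision to the caller
```

Returning None instead of guessing lets the caller fall back to whatever the kernel reports when smartctl gives no usable answer.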

If you can post the output of the mentioned commands, I might be able to provide a fix.

jschmid1 commented 6 years ago

no update here.

Martin-Weiss commented 6 years ago

server1:/sys # smartctl -i /dev/sda -d megaraid,29
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.4.92-6.30-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: LENOVO-X
Product: AL14SEB180E X
Revision: TL44
User Capacity: 1,800,360,124,416 bytes [1.80 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Lowest aligned LBA: 0
Rotation Rate: 10500 rpm
Form Factor: 2.5 inches
Logical Unit id: 0x500003986833c075
Serial number: 18L0A0F0FAYD
Device type: disk
Transport protocol: SAS
Local Time is: Tue Apr 10 14:59:28 2018 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled

server1:/sys # smartctl -i /dev/sda -d megaraid,30
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.4.92-6.30-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/sda [megaraid_disk_30] [SAT]: Device open changed type from 'megaraid,30' to 'sat+megaraid,30'
=== START OF INFORMATION SECTION ===
Device Model: MTFDDAK240TCB-1AR1ZA 01GV844 01GV847LEN
Serial Number: 19D0D069
LU WWN Device Id: 5 00a075 119d0d069
Firmware Version: MD37
User Capacity: 240,057,409,536 bytes [240 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 (unknown minor revision code: 0x006d)
SATA Version is: SATA >3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Apr 10 14:59:30 2018 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Martin-Weiss commented 6 years ago

FYI - customer has configured RAID 0 on each disk that is presented to the OS for OSDs.

Martin-Weiss commented 6 years ago

It seems that the whole process will not work, because smartctl does not reflect the RAID 0 "on top of that physical disk".

Martin-Weiss commented 6 years ago

The only way we can find at the moment is to offload wal/rocksDB based on disk size - but that seems to be impossible with the proposal runner...
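To make the size-based idea concrete, here is a minimal Python sketch that splits devices into wal/db candidates and data devices by capacity. It is not part of the proposal runner; the function name, the device names, and the 500 GB threshold are all hypothetical, and only the two capacities are taken from the smartctl output in this thread.

```python
from typing import Dict, List, Tuple


def split_by_size(sizes: Dict[str, int], max_db_bytes: int) -> Tuple[List[str], List[str]]:
    """Split devices into (wal/db candidates, data devices) by capacity.

    Devices at or below max_db_bytes are treated as wal/db candidates,
    everything larger as an OSD data device.
    """
    db_devices = sorted(dev for dev, size in sizes.items() if size <= max_db_bytes)
    data_devices = sorted(dev for dev, size in sizes.items() if size > max_db_bytes)
    return db_devices, data_devices


# Hypothetical device names; capacities as reported by smartctl above.
sizes = {"sda": 1_800_360_124_416, "sdb": 240_057_409_536}
db_devices, data_devices = split_by_size(sizes, max_db_bytes=500 * 10**9)
```

This only works when the SSDs and the spinning disks differ clearly in capacity, as they do in this setup.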

jschmid1 commented 6 years ago

This seems to be topical again. We might consider implementing the 'identify by size' feature @jan--f

jan--f commented 6 years ago

We have identify by size. So far it's simply not been a requirement to consider rotational drives as wal/db options.