Open Martin-Weiss opened 6 years ago
@Martin-Weiss
We do indeed check for RAID controllers one level deeper, in cephdisks.py:
salt -C 'I@roles:storage' cephdisks.list
This output is used in the proposal runner. Internally, cephdisks uses hwinfo to determine the HDD's properties, but as you correctly noted, the kernel lies about the disk's rotational
status when the disk is shadowed by a RAID controller. You might be using a RAID controller we are not checking for. Please confirm that with:
lspci -vv | grep -i raid
This will tell you what kind of controller you are using.
smartctl -i /dev/<device> -d <raid_ctrl_name>,<bus_id>
gives you more insight into the disk. cephdisks.list
tries to parse that output. Because we lack the hardware, we weren't able to test every possible setup, and it's quite likely that we missed something.
If you can post the output of the mentioned commands, I might be able to provide a fix.
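For comparison, the kernel's own view of a disk can be read directly from sysfs. This is a minimal sketch, assuming the standard `/sys/block/<device>/queue/rotational` attribute; the helper name `kernel_rotational` is made up for illustration and is not part of cephdisks.py.

```python
from pathlib import Path


def kernel_rotational(device, sys_block="/sys/block"):
    """Return True if the kernel flags the device as rotational, False if not.

    Returns None when the sysfs attribute is missing, unreadable, or unexpected.
    """
    flag = Path(sys_block) / device / "queue" / "rotational"
    try:
        value = flag.read_text().strip()
    except OSError:
        return None
    if value not in ("0", "1"):
        return None
    return value == "1"


if __name__ == "__main__":
    # Example device names only; adjust to the disks on the node.
    for name in ("sda", "sdb"):
        print(name, kernel_rotational(name))
```

Behind a RAID controller this may return True even for an SSD, which is why the value cannot be trusted on its own.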
no update here.
server1:/sys # smartctl -i /dev/sda -d megaraid,29
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.4.92-6.30-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: LENOVO-X
Product: AL14SEB180E X
Revision: TL44
User Capacity: 1,800,360,124,416 bytes [1.80 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Lowest aligned LBA: 0
Rotation Rate: 10500 rpm
Form Factor: 2.5 inches
Logical Unit id: 0x500003986833c075
Serial number: 18L0A0F0FAYD
Device type: disk
Transport protocol: SAS
Local Time is: Tue Apr 10 14:59:28 2018 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled

server1:/sys # smartctl -i /dev/sda -d megaraid,30
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.4.92-6.30-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/sda [megaraid_disk_30] [SAT]: Device open changed type from 'megaraid,30' to 'sat+megaraid,30'
=== START OF INFORMATION SECTION ===
Device Model: MTFDDAK240TCB-1AR1ZA 01GV844 01GV847LEN
Serial Number: 19D0D069
LU WWN Device Id: 5 00a075 119d0d069
Firmware Version: MD37
User Capacity: 240,057,409,536 bytes [240 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 (unknown minor revision code: 0x006d)
SATA Version is: SATA >3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Apr 10 14:59:30 2018 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
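The two outputs above differ in their `Rotation Rate` line, which is the piece a parser would care about. This is a rough sketch of extracting it, assuming only the two value forms seen here (`<n> rpm` and `Solid State Device`); the function name `classify_drive` is hypothetical and this is not how cephdisks.list is actually implemented.

```python
import re

# Matches the two "Rotation Rate" forms that appear in the smartctl output above.
ROTATION_RE = re.compile(r"Rotation Rate:\s*(Solid State Device|(\d+)\s*rpm)")


def classify_drive(smartctl_output):
    """Return "hdd", "ssd", or None based on the Rotation Rate line."""
    match = ROTATION_RE.search(smartctl_output)
    if match is None:
        return None  # no Rotation Rate line, or a form not handled here
    if match.group(2) is not None:
        return "hdd"  # a numeric rpm value was reported
    return "ssd"


if __name__ == "__main__":
    print(classify_drive("Rotation Rate: 10500 rpm"))            # hdd
    print(classify_drive("Rotation Rate: Solid State Device"))   # ssd
```

Note that this classifies the physical disk addressed by `-d megaraid,<id>`. It does not say which logical drive the OS sees for that disk.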
FYI - customer has configured RAID 0 on each disk that is presented to the OS for OSDs.
It seems that the whole process will not work, because smartctl does not reflect the RAID 0 "on top of that physical disk".
The only way we can find at the moment is to offload wal/rocksDB based on disk size - but that seems to be impossible with the proposal runner...
This seems to be topical again. We might consider implementing the 'identify by size' feature @jan--f
We have identify by size. So far it's simply not been a requirement to consider rotational drives as wal/db options.
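To illustrate the idea of selecting wal/db devices by size rather than by the rotational flag, here is a minimal sketch. The function `split_by_size`, the device names, and the 500 GB threshold are all assumptions for the example; this is not the proposal runner's actual logic. The sizes are the capacities reported by smartctl earlier in this thread.

```python
def split_by_size(disks, max_db_bytes):
    """Split {device: size_in_bytes} into (data_devices, db_devices).

    Devices at or below max_db_bytes are treated as wal/db candidates,
    regardless of what the kernel reports as their rotational status.
    """
    data, db = [], []
    for device, size in sorted(disks.items()):
        if size <= max_db_bytes:
            db.append(device)
        else:
            data.append(device)
    return data, db


if __name__ == "__main__":
    # Hypothetical device names; sizes taken from the smartctl output above.
    disks = {
        "sdb": 1_800_360_124_416,  # 1.80 TB SAS drive
        "sdc": 240_057_409_536,    # 240 GB SSD
    }
    data, db = split_by_size(disks, max_db_bytes=500 * 10**9)
    print("data:", data)  # data: ['sdb']
    print("db:", db)      # db: ['sdc']
```

A size threshold only works when the SSDs and the data drives differ clearly in capacity, as they do in this setup.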
We have had to accept that some server hardware only comes with RAID controllers that do NOT support JBOD mode (e.g. HP DL380). If these servers are used as OSD nodes, the drives behind these RAID controllers need to be configured as RAID 0 logical drives so that they can be used for OSD data.
Unfortunately, the RAID controller then reports all of them as "rotational 1" to the operating system. As a result, the proposal runner cannot detect that some of the RAID 0 logical drives are SSDs, and we end up with a setup where rocksDB/WAL are not offloaded to the SSDs.
I am not sure how we could fix or enhance this in the proposal runner, but it would be great if we could find a solution, as manually manipulating the proposals for the disks is a big challenge.
Maybe we can add some parameters that let us specify which LUN number behind a specific HBA should be used for journal / rocksDB / WAL?