You can find more examples of me using ssacli to configure volumes here:
You just need to click the Load more... button:
We are able to connect only 9 SSD disks of 1.6TB capacity today (4 servers). If you need all of them with 1.6TB capacity, it will be possible to connect the remaining ones in the next 2 weeks, or, if you want, we can connect a 3.84TB drive on each of the remaining servers.
The cost for a single additional disk is 20 euro per 1.6TB SSD drive.
Asked about the 4TB price, and about 3TB (if they have them).
4TB will cost twice that (40 euro per drive); will go ahead.
ssacli installation:
echo "deb http://downloads.linux.hpe.com/SDR/repo/mcp jammy/current non-free" | sudo tee /etc/apt/sources.list.d/hp-mcp.list
wget -qO- http://downloads.linux.hpe.com/SDR/hpPublicKey1024.pub | sudo tee -a /etc/apt/trusted.gpg.d/hp-mcp.asc
wget -qO- http://downloads.linux.hpe.com/SDR/hpPublicKey2048.pub | sudo tee -a /etc/apt/trusted.gpg.d/hp-mcp.asc
wget -qO- http://downloads.linux.hpe.com/SDR/hpPublicKey2048_key1.pub | sudo tee -a /etc/apt/trusted.gpg.d/hp-mcp.asc
wget -qO- http://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub | sudo tee -a /etc/apt/trusted.gpg.d/hp-mcp.asc
sudo apt update
sudo apt install ssacli
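Once installed, a quick way to confirm the tool actually sees the controller (my addition, not from the original log):
sudo ssacli ctrl all show status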
Disks are installed:
❯ ansible nimbus-mainnet-metal -i ansible/inventory/test -a 'sudo ssacli ctrl slot=0 pd allunassigned show'
linux-06.ih-eu-mda1.nimbus.mainnet | CHANGED | rc=0 >>
Smart Array P420i in Slot 0 (Embedded)
Unassigned
physicaldrive 1I:1:9 (port 1I:box 1:bay 9, SAS SSD, 3.8 TB, OK)
linux-07.ih-eu-mda1.nimbus.mainnet | CHANGED | rc=0 >>
Smart Array P420i in Slot 0 (Embedded)
Unassigned
physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS SSD, 3.8 TB, OK)
linux-02.ih-eu-mda1.nimbus.mainnet | CHANGED | rc=0 >>
Smart Array P420i in Slot 0 (Embedded)
Unassigned
physicaldrive 2I:2:6 (port 2I:box 2:bay 6, SAS SSD, 3.8 TB, OK)
linux-04.ih-eu-mda1.nimbus.mainnet | FAILED | rc=1 >>
Error: The controller identified by "slot=0" was not detected.
non-zero return code
linux-01.ih-eu-mda1.nimbus.mainnet | CHANGED | rc=0 >>
Smart Array P420i in Slot 0 (Embedded)
Unassigned
physicaldrive 2I:2:6 (port 2I:box 2:bay 6, SAS SSD, 3.8 TB, OK)
linux-03.ih-eu-mda1.nimbus.mainnet | CHANGED | rc=0 >>
Smart Array P420i in Slot 0 (Embedded)
Unassigned
physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS SSD, 3.8 TB, OK)
linux-05.ih-eu-mda1.nimbus.mainnet | CHANGED | rc=0 >>
Smart Array P420i in Slot 0 (Embedded)
Unassigned
physicaldrive 1I:1:9 (port 1I:box 1:bay 9, SAS SSD, 3.8 TB, OK)
linux-04 has a different slot:
❯ sudo ssacli ctrl slot=1 pd allunassigned show
Smart Array P222 in Slot 1
Unassigned
physicaldrive 2I:1:3 (port 2I:box 1:bay 3, SAS SSD, 3.8 TB, OK)
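If the slot number isn't known up front, listing all controllers shows it (a generic check, not from the original log):
sudo ssacli ctrl all show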
IH had an issue with the disks. They fixed it on linux-01 and I was able to set them up.
This was done with approximately these actions:
sudo ssacli ctrl slot=0 pd all show status
sudo ssacli ctrl slot=0 create type=ld drives=DRIVE # (change here, e.g. 2I:2:6)
sudo ssacli ctrl slot=0 ld all show status
[local] ansible-playbook -i ansible/inventory/test ansible/bootstrap.yml --limit=linux-01.ih-eu-mda1.nimbus.mainnet -Dv -t role::bootstrap:volumes
docker-compose -f docker-compose.exporter.yml -f docker-compose.yml stop
sudo systemctl stop syslog
sudo rsync -Pa /mnt/sdc/geth-mainnet /mnt/sdd/geth-mainnet
[ansible/group_vars/ih-eu-mda1.yml] change /docker
[local] ansible-playbook -i ansible/inventory/test ansible/bootstrap.yml --limit=linux-01.ih-eu-mda1.nimbus.mainnet -Dv -t role::bootstrap:volumes
docker-compose -f docker-compose.exporter.yml -f docker-compose.yml start
[grafana] check geth graphs - syncing
sudo systemctl stop beacon-node-mainnet-*
sudo rsync -Pa sdb/beacon-node-mainnet-* sdb/era sdd/sdb/
sudo ssacli ctrl slot=0 ld all show status
sudo ssacli ctrl slot=0 ld 2 delete
sudo ssacli ctrl slot=0 ld 3 delete
sudo ssacli ctrl slot=0 pd all show status
sudo ssacli ctrl slot=0 create type=ld drives=DRIVE1,DRIVE2 raid=0 # (e.g. 1I:2:1,1I:2:2)
[local] ansible-playbook -i ansible/inventory/test ansible/bootstrap.yml --limit=linux-01.ih-eu-mda1.nimbus.mainnet -Dv -t role::bootstrap:volumes
sudo systemctl start beacon-node-mainnet-*
[grafana] check nimbus graphs - syncing
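A sanity check worth running after the volumes role finishes (my addition, not in the log above; mount points assumed from our layout):
lsblk -o NAME,SIZE,TYPE,MOUNTPOINTS   # the new RAID0 device should be visible
df -h /data /docker                   # and mounted with the expected size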
Need to fine-tune a bit: umount before running ansible.

BTW, if docs or support asks about using a different tool, here is the timeline:
hpacucli (versions 9.10 - 9.40, 2012 - 2014, probably before too)
hpssacli (versions 2.0 - 2.40, 2014 - 2016)
ssacli (versions 3.10 - 6.30, 2017 - now)

Done, disks are set up with these commands:
[local] ansible linux-05.ih-eu-mda1.nimbus.mainnet,linux-06.ih-eu-mda1.nimbus.mainnet,linux-07.ih-eu-mda1.nimbus.mainnet -a 'sudo systemctl stop consul'
sudo ssacli ctrl slot=0 pd all show status; sudo ssacli ctrl slot=0 ld all show status
sudo ssacli ctrl slot=0 create type=ld drives=DRIVE # (change here, e.g. 2I:2:6)
sudo ssacli ctrl slot=0 ld all show status
[local] ansible-playbook -i ansible/inventory/test ansible/bootstrap.yml --limit=HOSTNAME -Dv -t role::bootstrap:volumes
docker-compose -f /docker/geth-mainnet/docker-compose.exporter.yml -f /docker/geth-mainnet/docker-compose.yml stop
sudo systemctl stop syslog beacon-node-mainnet-*
sudo rsync --stats -hPa --info=progress2,name0 /docker/geth-mainnet /docker/log /data/beacon* /data/era /mnt/sdd/
sudo umount /mnt/sdb /data /docker /mnt/sdc
sudo ssacli ctrl slot=0 ld all show status
sudo ssacli ctrl slot=0 ld 2 delete
sudo ssacli ctrl slot=0 ld 3 delete
sudo ssacli ctrl slot=0 pd all show status
sudo ssacli ctrl slot=0 create type=ld raid=0 drives=DRIVE1,DRIVE2 # (e.g. 1I:2:1,1I:2:2)
[local] ansible-playbook -i ansible/inventory/test ansible/bootstrap.yml --limit=HOSTNAME -Dv -t role::bootstrap:volumes
sudo rsync --stats -hPa --info=progress2,name0 /docker/beacon* /docker/era /data/
docker-compose -f /docker/geth-mainnet/docker-compose.exporter.yml -f /docker/geth-mainnet/docker-compose.yml start
sudo systemctl start beacon-node-mainnet-* syslog
[grafana] check geth graphs - syncing
[grafana] check nimbus graphs - syncing
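Geth's sync state can also be queried directly over JSON-RPC instead of only eyeballing Grafana (a hypothetical check, assuming the RPC port is exposed locally on 8545):
curl -s -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' \
  http://localhost:8545   # returns false once fully synced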
Something is missing before the second ansible-playbook run, causing the issues. I didn't research it much and just put the umount command above.
One finding: linux-02.ih-eu-mda1.nimbus.mainnet has some weird disk attached that the others don't:
❯ lsblk /dev/sdd
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sdd 8:48 0 256M 1 disk
`-sdd1 8:49 0 251M 1 part
❯ sudo fdisk -l /dev/sdd
Disk /dev/sdd: 256 MiB, 268435456 bytes, 524288 sectors
Disk model: LUN 00 Media 0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000046
Device Boot Start End Sectors Size Id Type
/dev/sdd1 63 514079 514017 251M c W95 FAT32 (LBA)
❯ ls -l /dev/disk/by-id /dev/disk/by-path/ | grep sdd
lrwxrwxrwx 1 root root 9 Feb 22 19:05 usb-HP_iLO_LUN_00_Media_0_000002660A01-0:0 -> ../../sdd
lrwxrwxrwx 1 root root 10 Feb 22 19:05 usb-HP_iLO_LUN_00_Media_0_000002660A01-0:0-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 9 Feb 22 19:05 pci-0000:00:1d.0-usb-0:1.3.1:1.0-scsi-0:0:0:0 -> ../../sdd
lrwxrwxrwx 1 root root 10 Feb 22 19:05 pci-0000:00:1d.0-usb-0:1.3.1:1.0-scsi-0:0:0:0-part1 -> ../../sdd1
Looks like it's some 256MB USB drive.
ChatGPT says it may have something to do with HP iLO (Integrated Lights-Out) LUN (Logical Unit Number). I have no idea what that is.
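A way to confirm that guess (my addition; just inspecting the device's USB identity, which should point at the iLO virtual media function):
udevadm info --query=property --name=/dev/sdd | grep -iE 'vendor|model'
lsusb | grep -i 'hp'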
Thanks for getting this done.
It's about time we increase the storage available for both Docker containers (Geth) and Systemd services (Beacon Nodes).
The current layout involves a single logical volume per single physical volume (SSD), configured in the controller.
The migration to RAID0 logical volumes using two SSDs using a HPE Smart Array utility is documented here: https://docs.infra.status.im/general/hp_smart_array_raid.html
The steps for the migration of each host will look like this:

1. Copy /data files to a temporary migration SSD.
2. Delete the /data logical volume and re-create it with two physical volumes (SSDs) as one RAID0 logical volume.
3. Copy the files back to the new /data volume.
4. Repeat the process for the /docker volume.

I would recommend creating a single support ticket to order 2 extra SSDs of the same type for all nimbus.mainnet hosts, and then manage the migration of each host in the comments of that ticket.
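For reference, a condensed sketch of the per-host sequence, built from the commands in the comments above (DRIVE1,DRIVE2, the logical drive number, and the temporary mount point are placeholders):
sudo rsync -hPa /data/ /mnt/MIGRATION_SSD/                    # 1. copy /data to the temp SSD
sudo umount /data
sudo ssacli ctrl slot=0 ld 2 delete                           # 2. drop the old single-disk volume
sudo ssacli ctrl slot=0 create type=ld raid=0 drives=DRIVE1,DRIVE2
ansible-playbook -i ansible/inventory/test ansible/bootstrap.yml --limit=HOSTNAME -Dv -t role::bootstrap:volumes
sudo rsync -hPa /mnt/MIGRATION_SSD/ /data/                    # 3. copy the files back
# 4. repeat the same steps for /docker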