sonic-net / sonic-platform-common

Python packages which provide a common interface to platform-specific hardware peripherals in SONiC

Enhanced NVMe disk support, added limited eUSB disk support #493

Open assrinivasan opened 1 month ago

assrinivasan commented 1 month ago

Description

  1. Modified storage_devices.py to support class object instantiation for devices with NVMe and eUSB storage disks.
  2. Added full support for NVMe devices and limited support for eUSB devices, per the promised Future Work on the Storage Monitoring Daemon.
  3. Fixed a bug in the parse_generic_ssd_info() function of the Ssdutil class that overwrote the disk_io_reads, disk_io_writes and reserved_blocks values for NVMe disks with N/A after they had been parsed successfully. The cause was an indentation error, which this commit corrects.
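The class of indentation bug described in item 3 can be sketched as follows. This is an illustration only, not the code from this PR: the function names `parse_buggy` and `parse_fixed`, the `is_nvme` flag, and the `parsed` dictionary are all hypothetical, and the real parse_generic_ssd_info() may be structured differently.

```python
NOT_AVAILABLE = "N/A"
FIELDS = ("disk_io_reads", "disk_io_writes", "reserved_blocks")


def parse_buggy(is_nvme, parsed):
    """Hypothetical pre-fix shape: the fallback runs for every disk type."""
    info = {}
    if is_nvme:
        for field in FIELDS:
            info[field] = parsed.get(field, NOT_AVAILABLE)
    # Mis-indented fallback: it sits at function level instead of inside an
    # else branch, so it also overwrites the values just parsed for NVMe.
    for field in FIELDS:
        info[field] = NOT_AVAILABLE
    return info


def parse_fixed(is_nvme, parsed):
    """Hypothetical post-fix shape: the fallback applies only when needed."""
    info = {}
    if is_nvme:
        for field in FIELDS:
            info[field] = parsed.get(field, NOT_AVAILABLE)
    else:
        for field in FIELDS:
            info[field] = NOT_AVAILABLE
    return info


if __name__ == "__main__":
    sample = {
        "disk_io_reads": "3,657,918 [1.87 TB]",
        "disk_io_writes": "1,155,355 [591 GB]",
        "reserved_blocks": "100.0",
    }
    print(parse_buggy(True, sample))
    print(parse_fixed(True, sample))
```

With the sample values taken from the NVMe syslog in the test section, the buggy variant reports N/A for all three fields while the fixed variant keeps the parsed values.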

Motivation and Context

This PR is in line with promised future work on the Storage Monitoring Daemon. It adds support for devices with NVMe storage disks.

This PR must be merged only after https://github.com/sonic-net/sonic-buildimage/pull/20053 has been merged to the master and 202405 branches.

How Has This Been Tested?

NVMe

Ran image containing these changes on a device with an NVMe disk and verified that storage disk attributes were getting successfully parsed and updated to STATE_DB.

root@str3-7060x6-64pe-2:~# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0         7:0    0 386.4M  0 loop
loop1         7:1    0     4G  0 loop /var/log
nvme0n1     259:0    0 223.6G  0 disk
├─nvme0n1p1 259:1    0 223.5G  0 part /boot
│                                     /var/lib/docker
│                                     /host
├─nvme0n1p2 259:2    0    64M  0 part
└─nvme0n1p3 259:3    0     1M  0 part

Syslogs:

2024 Aug 24 07:55:50.392206 str3-7060x6-64pe-2 INFO pmon#stormond[906]: Starting Storage Monitoring Daemon
2024 Aug 24 07:55:50.437050 str3-7060x6-64pe-2 INFO pmon#stormond[906]: Storage Device: nvme0n1, Device Model: ATP AF240GSTJA-AW1, Serial: 23090240-000257
2024 Aug 24 07:55:50.437789 str3-7060x6-64pe-2 INFO pmon#stormond[906]: Polling Interval set to 60 seconds
2024 Aug 24 07:55:50.437789 str3-7060x6-64pe-2 INFO pmon#stormond[906]: FSIO JSON file Interval set to 360 seconds
2024 Aug 24 07:55:50.481670 str3-7060x6-64pe-2 INFO pmon#stormond[906]: Storage Device: nvme0n1, Firmware: 42A4SB6G, health: 100.0%, Temp: 27.0C, FS IO Reads: 95432, FS IO Writes: 18860
2024 Aug 24 07:55:50.481837 str3-7060x6-64pe-2 INFO pmon#stormond[906]: Latest FSIO Reads: 95432, Latest FSIO Writes: 18860
2024 Aug 24 07:55:50.481837 str3-7060x6-64pe-2 INFO pmon#stormond[906]: Disk IO Reads: 3,657,918 [1.87 TB], Disk IO Writes: 1,155,355 [591 GB], Reserved Blocks: 100.0
2024 Aug 24 07:56:00.045910 str3-7060x6-64pe-2 INFO pmon#supervisord 2024-08-24 07:56:00,044 INFO success: stormond entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)

STATE_DB:

root@sonic-device:/usr/local/bin# redis-cli -n 6 HGETALL "STORAGE_INFO|nvme0n1"
 1) "device_model"
 2) "ATP AF240GSTJA-AW1"
 3) "serial"
 4) "23090240-000257"
 5) "firmware"
 6) "42A4SB6G"
 7) "health"
 8) "100.0"
 9) "temperature"
10) "27.0"
11) "latest_fsio_reads"
12) "95432"
13) "latest_fsio_writes"
14) "18971"
15) "disk_io_reads"
16) "3,657,918 [1.87 TB]"
17) "disk_io_writes"
18) "1,155,357 [591 GB]"
19) "reserved_blocks"
20) "100.0"
21) "total_fsio_reads"
22) "95432"
23) "total_fsio_writes"
24) "18971"
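As a sanity check on the disk_io_reads and disk_io_writes values above: the NVMe SMART log reports data units in thousands of 512-byte blocks, so one unit is 512,000 bytes. The sketch below assumes the raw counts in the log are such data units and that the bracketed size is truncated (not rounded) to three significant digits; both assumptions are consistent with the numbers shown, but the helper names are hypothetical and this is not the daemon's own formatting code.

```python
NVME_DATA_UNIT_BYTES = 512 * 1000  # one NVMe data unit = 1000 blocks of 512 bytes


def data_units_to_bytes(units):
    """Convert an NVMe 'data units' counter to bytes."""
    return units * NVME_DATA_UNIT_BYTES


def format_decimal_size(num_bytes):
    """Format bytes with decimal prefixes, truncated to three significant digits."""
    names = ["B", "KB", "MB", "GB", "TB", "PB"]
    scale, idx = 1, 0
    while num_bytes >= scale * 1000 and idx < len(names) - 1:
        scale *= 1000
        idx += 1
    whole = num_bytes // scale
    if idx == 0 or whole >= 100:
        return f"{whole} {names[idx]}"
    if whole >= 10:
        tenths = (num_bytes * 10 // scale) % 10
        return f"{whole}.{tenths} {names[idx]}"
    hundredths = (num_bytes * 100 // scale) % 100
    return f"{whole}.{hundredths:02d} {names[idx]}"


if __name__ == "__main__":
    for label, units in (("reads", 3_657_918), ("writes", 1_155_357)):
        size = format_decimal_size(data_units_to_bytes(units))
        print(f"disk_io_{label}: {units:,} [{size}]")
```

For the counters above, 3,657,918 units is 1,872,854,016,000 bytes (1.87 TB) and 1,155,357 units is 591,542,784,000 bytes (591 GB), matching the bracketed values.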

eUSB

root@str2-7050qx-32s-acs-03:~# realpath /sys/block/sda/device
/sys/devices/pci0000:00/0000:00:12.2/usb1/1-2/1-2:1.0/host2/target2:0:0/2:0:0:0
root@str2-7050qx-32s-acs-03:~#
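The realpath output above shows a `usb1` component in the resolved sysfs path, which is what marks sda as a USB-attached disk. A minimal sketch of classifying a disk this way follows. It is an assumption about how such a check could look, not the detection logic in storage_devices.py; the names `classify_disk` and `classify_live_disk` and the returned labels are hypothetical.

```python
import os
import re

# Matches a sysfs path component such as "usb1" or "usb2".
USB_COMPONENT = re.compile(r"^usb\d+$")


def classify_disk(disk_name, resolved_device_path):
    """Return 'nvme', 'usb', or 'other' for a block device.

    disk_name is the kernel name (e.g. 'nvme0n1', 'sda');
    resolved_device_path is the realpath of /sys/block/<disk>/device.
    """
    if disk_name.startswith("nvme"):
        return "nvme"
    components = resolved_device_path.split("/")
    if any(USB_COMPONENT.match(part) for part in components):
        return "usb"
    return "other"


def classify_live_disk(disk_name):
    """Resolve the sysfs link on a running Linux system, then classify."""
    link = os.path.join("/sys/block", disk_name, "device")
    return classify_disk(disk_name, os.path.realpath(link))


if __name__ == "__main__":
    path = ("/sys/devices/pci0000:00/0000:00:12.2/usb1/1-2/1-2:1.0/"
            "host2/target2:0:0/2:0:0:0")
    print(classify_disk("sda", path))
```

Keeping the path-matching logic separate from the sysfs lookup lets the classification be exercised with recorded paths such as the one above, without needing the hardware.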

Syslogs:

2024 Aug 28 06:37:32.875451 str2-7050qx-32s-acs-03 INFO pmon#stormond[466]: Starting Storage Monitoring Daemon
2024 Aug 28 06:37:32.875451 str2-7050qx-32s-acs-03 INFO pmon#stormond[466]: Storage Device: sda, Device Model: SMART EUSB, Serial:
2024 Aug 28 06:37:32.877125 str2-7050qx-32s-acs-03 INFO pmon#stormond[466]: Polling Interval set to 60 seconds
2024 Aug 28 06:37:32.877125 str2-7050qx-32s-acs-03 INFO pmon#stormond[466]: FSIO JSON file Interval set to 360 seconds
2024 Aug 28 06:37:32.906668 str2-7050qx-32s-acs-03 INFO pmon#stormond[466]: Storage Device: sda, Firmware: N/A, health: N/A%, Temp: N/AC, FS IO Reads: 32835, FS IO Writes: 7658
2024 Aug 28 06:37:32.906803 str2-7050qx-32s-acs-03 INFO pmon#stormond[466]: Latest FSIO Reads: 32835, Latest FSIO Writes: 7658
2024 Aug 28 06:37:32.906865 str2-7050qx-32s-acs-03 INFO pmon#stormond[466]: Disk IO Reads: N/A, Disk IO Writes: N/A, Reserved Blocks: N/A

STATE_DB:

root@str2-7050qx-32s-acs-03:~# redis-cli -n 6 HGETALL "STORAGE_INFO|sda"
 1) "device_model"
 2) "SMART EUSB"
 3) "serial"
 4) ""
 5) "firmware"
 6) "N/A"
 7) "health"
 8) "N/A"
 9) "temperature"
10) "N/A"
11) "latest_fsio_reads"
12) "32854"
13) "latest_fsio_writes"
14) "7723"
15) "disk_io_reads"
16) "N/A"
17) "disk_io_writes"
18) "N/A"
19) "reserved_blocks"
20) "N/A"
21) "total_fsio_reads"
22) "32854"
23) "total_fsio_writes"
24) "7723"

Additional Information (Optional)

prgeor commented 3 weeks ago

@Junchao-Mellanox @keboliu could you please review?

assrinivasan commented 1 week ago

@Staphylo please help review this PR, TIA.