aristanetworks / sonic

Open source drivers and initialization library for Arista platforms running SONiC
GNU General Public License v2.0

[chassis sup] psu7 and psu8 power is incorrectly set to 0.0 #93

Closed wenyiz2021 closed 1 year ago

wenyiz2021 commented 1 year ago

Hi @Staphylo @patrickmacarthur, platform_tests/test_power_budget_info.py::test_power_redis_db has recently started failing on this chassis; it used to pass. The supplied power for psu7 and psu8 is reported as 0.0:

admin@str2-7804-sup-1:~$ redis-dump -d 6 -y -k "*power*"
{
  "CHASSIS_INFO|chassis_power_budget 1": {
    "expireat": 1689008695.6872308,
    "ttl": -0.001,
    "type": "hash",
    "value": {
      "": "",
      "Consumed Power FABRIC-CARD0": "518.0",
      "Consumed Power FABRIC-CARD1": "518.0",
      "Consumed Power FABRIC-CARD2": "518.0",
      "Consumed Power FABRIC-CARD3": "518.0",
      "Consumed Power FABRIC-CARD4": "518.0",
      "Consumed Power FABRIC-CARD5": "518.0",
      "Consumed Power LINE-CARD0": "622.0",
      "Consumed Power LINE-CARD2": "794.0",
      "Consumed Power LINE-CARD4": "622.0",
      "Consumed Power SUPERVISOR0": "72.0",
      "Supplied Power psu3": "3000.0",
      "Supplied Power psu4": "3000.0",
      "Supplied Power psu5": "3000.0",
      "Supplied Power psu6": "3000.0",
      "Supplied Power psu7": "0.0",
      "Supplied Power psu8": "0.0",
      "Total Consumed Power": "5218.0",
      "Total Supplied Power": "12000.0"
    }
  }
}
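
The dump above can be sanity-checked offline. The following is a minimal Python sketch, assuming the `value` hash has already been loaded into a dict; the helper name `summarize_power_budget` is hypothetical and is not part of sonic-mgmt or the Arista platform library. It sums the per-PSU entries, compares the sum with the reported total, and lists the PSUs that report 0.0 W.

```python
import re

def summarize_power_budget(value):
    """Summarize the 'Supplied Power' fields of a chassis_power_budget hash.

    Hypothetical helper for illustration only.
    """
    supplied = {}
    for key, raw in value.items():
        # Only per-PSU keys match; "Total Supplied Power" and "" are skipped.
        match = re.fullmatch(r"Supplied Power (psu\d+)", key)
        if match:
            supplied[match.group(1)] = float(raw)
    zero_psus = sorted(
        (name for name, watts in supplied.items() if watts == 0.0),
        key=lambda name: int(name[3:]),
    )
    return {
        "sum_supplied": sum(supplied.values()),
        "reported_total": float(value["Total Supplied Power"]),
        "zero_psus": zero_psus,
    }

# Supplied-power fields copied from the dump above.
sample = {
    "": "",
    "Supplied Power psu3": "3000.0",
    "Supplied Power psu4": "3000.0",
    "Supplied Power psu5": "3000.0",
    "Supplied Power psu6": "3000.0",
    "Supplied Power psu7": "0.0",
    "Supplied Power psu8": "0.0",
    "Total Supplied Power": "12000.0",
}
summary = summarize_power_budget(sample)
print(summary)
```

For this dump the per-PSU sum equals the reported total (12000.0), and psu7 and psu8 are the zero entries. Note that psu1 and psu2 do not appear in this dump at all.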
Staphylo commented 1 year ago

Hi @wenyiz2021, this is indeed not expected if all PSUs are properly powered on.

A few basic questions first: 1) Are all the power supplies connected? 2) Are all the outlets powered on the PDU? 3) Given that these PSUs are dual input, do you know on which input the 2 failing power supplies are connected?

Could you provide me with the output of the following commands?

grep ERROR /var/log/arista.log (just the group of lines corresponding to the boot where this issue happened, which you can find by checking the log dates)
arista show platform environment
arista show chassis summary

wenyiz2021 commented 1 year ago

admin@str2-7804-sup-1:~$ show platform psu
PSU     Model                Serial         HW Rev    Voltage (V)    Current (A)    Power (W)    Status       LED
------  -------------------  -------------  --------  -------------  -------------  -----------  -----------  -----
PSU 1   PWR-D1-3041-AC-BLUE  THAG118330032  P1.2      12.351         30.718         379.5        OK           off
PSU 2   PWR-D1-3041-AC-BLUE  THAG118320008  P1.2      12.375         30.406         377.0        OK           off
PSU 3   PWR-D1-3041-AC-BLUE  THAG118320025  P1.2      12.343         34.875         430.5        OK           off
PSU 4   PWR-D1-3041-AC-BLUE  THAG118330042  P1.2      12.351         30.531         377.0        OK           off
PSU 5   PWR-D1-3041-AC-BLUE  THAG118320007  P1.2      12.335         30.406         375.0        OK           off
PSU 6   PWR-D1-3041-AC-BLUE  THAG118310040  P1.2      12.367         30.718         381.0        OK           off
PSU 7   N/A                  N/A            N/A       N/A            N/A            N/A          OK           off
PSU 8   N/A                  N/A            N/A       N/A            N/A            N/A          OK           off
PSU 9   N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
PSU 10  N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
PSU 11  N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
PSU 12  N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
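
The table above shows the inconsistency in question: PSU 7 and PSU 8 report Status OK while every identity and sensor field is N/A. As a rough sketch, assuming the column layout shown above (columns separated by two or more spaces), such rows can be flagged in Python as follows; `ok_but_unreadable` is a hypothetical helper name, not an existing SONiC utility.

```python
import re

def ok_but_unreadable(rows):
    """Return PSU names whose Status is OK but whose Model is N/A.

    Assumes rows in the `show platform psu` layout shown above, with
    columns separated by at least two spaces. Hypothetical helper.
    """
    flagged = []
    for row in rows:
        cols = re.split(r"\s{2,}", row.strip())
        if len(cols) != 9:
            continue  # skip header separators or malformed lines
        name, model, status = cols[0], cols[1], cols[7]
        if status == "OK" and model == "N/A":
            flagged.append(name)
    return flagged

# A few rows copied from the table above.
rows = [
    "PSU 6   PWR-D1-3041-AC-BLUE  THAG118310040  P1.2      12.367         30.718         381.0        OK           off",
    "PSU 7   N/A                  N/A            N/A       N/A            N/A            N/A          OK           off",
    "PSU 8   N/A                  N/A            N/A       N/A            N/A            N/A          OK           off",
    "PSU 9   N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off",
]
flagged = ok_but_unreadable(rows)
print(flagged)
```

Splitting on runs of two or more spaces keeps single-space values such as "PSU 7" and "NOT PRESENT" intact as one column each.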
wenyiz2021 commented 1 year ago

Hi @Staphylo, thanks for the quick response. We did find that some PSUs were not powered on for some reason. We turned those PSUs on, but we still see the following output, especially for psu7 and psu8:

Here is the info:

admin@str2-7804-sup-1:~$ sudo grep ERROR /var/log/arista.log
admin@str2-7804-sup-1:~$ arista show platform environment
ERROR: You must be root to use this feature
admin@str2-7804-sup-1:~$ sudo arista show platform environment
Name       Model               Serial          Power   Max   Status
---------- ------------------- --------------- ------- ----- ------
psu1       PWR-D1-3041-AC-BLUE THAG118330032   410.5   3000  True  
psu2       PWR-D1-3041-AC-BLUE THAG118320008   413.0   3000  True  
psu3       PWR-D1-3041-AC-BLUE THAG118320025   467.0   3000  True  
psu4       PWR-D1-3041-AC-BLUE THAG118330042   410.0   3000  True  
psu5       PWR-D1-3041-AC-BLUE THAG118320007   405.5   3000  True  
psu6       PWR-D1-3041-AC-BLUE THAG118310040   412.5   3000  True  
psu7       N/A                 N/A             N/A     N/A   True  
psu8       N/A                 N/A             N/A     N/A   True  
psu9       N/A                 N/A             N/A     N/A   False 
psu10      N/A                 N/A             N/A     N/A   False 
psu11      N/A                 N/A             N/A     N/A   False 
psu12      N/A                 N/A             N/A     N/A   False 
admin@str2-7804-sup-1:~$ sudo arista show chassis summary
Sku: DCS-7808-CH
Serial: TMO20180245
Supervisors:
  1: DCS-7800-SUP1A (SSN20290100)
Linecards:
  3: 7800R3-48CQ2-LC (SSN20220006)
  4: not present
  5: 7800R3A-36DM2-LC (SGD21190878)
  6: not present
  7: 7800R3-48CQ2-LC (SSN20220011)
  8: not present
  9: not present
  10: not present
Fabrics:
  51: 7808R3A-FM (SSN20290018)
  52: 7808R3A-FM (SSN20290015)
  53: 7808R3A-FM (SSN20290033)
  54: 7808R3A-FM (SSN20290026)
  55: 7808R3A-FM (SSN20290017)
  56: 7808R3A-FM (SSN20290021)
Psus:
  1: PWR-D1-3041-AC-BLUE (THAG118330032)
  2: PWR-D1-3041-AC-BLUE (THAG118320008)
  3: PWR-D1-3041-AC-BLUE (THAG118320025)
  4: PWR-D1-3041-AC-BLUE (THAG118330042)
  5: PWR-D1-3041-AC-BLUE (THAG118320007)
  6: PWR-D1-3041-AC-BLUE (THAG118310040)
  7: Error
  8: Error
  9: not present
  10: not present
  11: not present
  12: not present
wenyiz2021 commented 1 year ago

Oh, I found that my previous search of arista.log was incomplete. Searching the rotated logs as well shows the errors:

admin@str2-7804-sup-1:/var/log$ sudo grep ERROR /var/log/arista.log*
/var/log/arista.log.1:2023-07-10 14:02:06.196466 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:02:06.201399 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:02:06.201480 ERROR: PSU 7 unknown, discovery failed
/var/log/arista.log.1:2023-07-10 14:02:06.206700 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:02:06.211314 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:02:06.211397 ERROR: PSU 8 unknown, discovery failed
/var/log/arista.log.1:2023-07-10 14:12:37.188341 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:12:37.193400 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:12:37.193600 ERROR: PSU 7 unknown, discovery failed
/var/log/arista.log.1:2023-07-10 14:12:37.199136 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:12:37.204156 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 14:12:37.204263 ERROR: PSU 8 unknown, discovery failed
/var/log/arista.log.1:2023-07-10 16:24:10.230765 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 16:24:10.235790 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 16:24:10.235862 ERROR: PSU 7 unknown, discovery failed
/var/log/arista.log.1:2023-07-10 16:24:10.240982 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 16:24:10.245635 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 16:24:10.245823 ERROR: PSU 8 unknown, discovery failed
/var/log/arista.log.1:2023-07-10 17:31:26.282669 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 17:31:26.287788 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 17:31:26.287900 ERROR: PSU 7 unknown, discovery failed
/var/log/arista.log.1:2023-07-10 17:31:26.293107 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 17:31:26.298012 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error
/var/log/arista.log.1:2023-07-10 17:31:26.298115 ERROR: PSU 8 unknown, discovery failed
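
The errors above repeat in groups, one group per detection attempt. A minimal Python sketch for grouping the discovery failures by PSU is given below, assuming lines in the `grep` output format shown above; the function name `failed_discoveries` is hypothetical and not part of the Arista driver library.

```python
import re
from collections import defaultdict

# Matches lines such as:
# /var/log/arista.log.1:2023-07-10 14:02:06.201480 ERROR: PSU 7 unknown, discovery failed
PATTERN = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ "
    r"ERROR: PSU (?P<psu>\d+) unknown, discovery failed"
)

def failed_discoveries(lines):
    """Map PSU number -> list of timestamps (to the second) of failed discoveries."""
    failures = defaultdict(list)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            failures[int(match.group("psu"))].append(match.group("ts"))
    return dict(failures)

# A few lines copied from the log output above.
sample = [
    "/var/log/arista.log.1:2023-07-10 14:02:06.196466 ERROR: something happened while trying to detect the psu: [Errno 5] Input/output error",
    "/var/log/arista.log.1:2023-07-10 14:02:06.201480 ERROR: PSU 7 unknown, discovery failed",
    "/var/log/arista.log.1:2023-07-10 14:02:06.211397 ERROR: PSU 8 unknown, discovery failed",
    "/var/log/arista.log.1:2023-07-10 14:12:37.193600 ERROR: PSU 7 unknown, discovery failed",
]
result = failed_discoveries(sample)
print(result)
```

Run against the full output above, this would show PSU 7 and PSU 8 each failing discovery at the same four times on 2023-07-10 (14:02:06, 14:12:37, 16:24:10, and 17:31:26), each preceded by I/O errors.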
wenyiz2021 commented 1 year ago
Are all the power supplies connected?
 - wenyi: yes
Are all the outlets powered on the PDU?
 - wenyi: they were off for some reason. @arlakshm and I turned them on.
Given that these PSUs are dual input, do you know on which input the 2 failing power supplies are connected?
 - wenyi: 
pdu-125,20,str2-7804-sup-1,PSU7
pdu-124,20,str2-7804-sup-1,PSU8

For the rest of the questions, see my comments above.

wenyiz2021 commented 1 year ago

OK, after a reboot I see that psu7 and psu8 have recovered:

admin@str2-7804-sup-1:~$ show platform psu
PSU     Model                Serial         HW Rev    Voltage (V)    Current (A)    Power (W)    Status       LED
------  -------------------  -------------  --------  -------------  -------------  -----------  -----------  -----
PSU 1   PWR-D1-3041-AC-BLUE  THAG118330032  P1.2      12.351         30.625         378.5        OK           off
PSU 2   PWR-D1-3041-AC-BLUE  THAG118320008  P1.2      12.375         30.312         376.0        OK           off
PSU 3   PWR-D1-3041-AC-BLUE  THAG118320025  P1.2      12.343         34.625         427.5        OK           off
PSU 4   PWR-D1-3041-AC-BLUE  THAG118330042  P1.2      12.351         30.687         379.0        OK           off
PSU 5   PWR-D1-3041-AC-BLUE  THAG118320007  P1.2      12.335         30.25          373.0        OK           off
PSU 6   PWR-D1-3041-AC-BLUE  THAG118310040  P1.2      12.375         30.343         375.5        OK           off
PSU 7   PWR-D1-3041-AC-BLUE  THAG1200400CL  P00       12.367         30.218         374.0        OK           off
PSU 8   PWR-D1-3041-AC-BLUE  THAG1200400GC  P00       12.367         30.437         376.5        OK           off
PSU 9   N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
PSU 10  N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
PSU 11  N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
PSU 12  N/A                  N/A            N/A       N/A            N/A            N/A          NOT PRESENT  off
admin@str2-7804-sup-1:~$ redis-dump -d 6 -y -k "*power*"
{
  "CHASSIS_INFO|chassis_power_budget 1": {
    "expireat": 1689117785.6219234,
    "ttl": -0.001,
    "type": "hash",
    "value": {
      "": "",
      "Consumed Power FABRIC-CARD0": "518.0",
      "Consumed Power FABRIC-CARD1": "518.0",
      "Consumed Power FABRIC-CARD2": "518.0",
      "Consumed Power FABRIC-CARD3": "518.0",
      "Consumed Power FABRIC-CARD4": "518.0",
      "Consumed Power FABRIC-CARD5": "518.0",
      "Consumed Power LINE-CARD0": "622.0",
      "Consumed Power LINE-CARD2": "794.0",
      "Consumed Power LINE-CARD4": "622.0",
      "Consumed Power SUPERVISOR0": "72.0",
      "Supplied Power psu1": "3000.0",
      "Supplied Power psu2": "3000.0",
      "Supplied Power psu3": "3000.0",
      "Supplied Power psu4": "3000.0",
      "Supplied Power psu5": "3000.0",
      "Supplied Power psu6": "3000.0",
      "Supplied Power psu7": "3000.0",
      "Supplied Power psu8": "3000.0",
      "Total Consumed Power": "5218.0",
      "Total Supplied Power": "24000.0"
    }
  }
}

Closing this issue. Thanks @Staphylo and @kenneth-arista for the help.