aristanetworks / sonic

Open source drivers and initialization library for Arista platforms running SONiC
GNU General Public License v2.0

Performance issue with sfputil? #7

Closed yxieca closed 4 years ago

yxieca commented 6 years ago

Hardware platform: Arista 7260CX3-64, with transceivers plugged into 62 of the 64 ports.

Reading the information for a single transceiver appears to take about 3 seconds, so reading all of them takes about 3 minutes (3 s x 62 ports).

I haven't read the sfputil code extensively, but I think it reads the transceiver information field by field, opening and closing the file for each field. (Does each open/close translate into an I2C bus lock and offset programming for every read/write?)

In my past experience, reading a single SFP diag page plus the ID page can be done in a few milliseconds when the read uses I2C continuous read/write mode. In this mode, the reader/writer locks the I2C bus once, programs the initial offset once, and keeps reading/writing until the contiguous range is done. Could we use a similar optimization here?
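To make the two access patterns concrete, here is a minimal sketch in Python. It is not taken from sfputil: the field offsets, the function names, and the temporary file standing in for the transceiver EEPROM are all assumptions made for illustration. It only contrasts one open/seek/read cycle per field with a single contiguous read that is sliced in memory.

```python
import os
import tempfile

# Illustrative (offset, length) pairs. These are assumptions for the
# sketch and are not copied from the sfputil source.
FIELDS = {
    "vendor_name": (148, 16),
    "vendor_pn": (168, 16),
    "vendor_sn": (196, 16),
}


def make_fake_eeprom():
    """Create a 256-byte temp file standing in for a transceiver EEPROM."""
    data = bytearray(256)
    data[148:164] = b"Amphenol".ljust(16)
    data[168:184] = b"FOQQD33P00001".ljust(16)
    data[196:212] = b"APE17300017DAP".ljust(16)
    fd, path = tempfile.mkstemp()
    os.write(fd, bytes(data))
    os.close(fd)
    return path


def read_field_by_field(path):
    """One open/seek/read/close cycle per field.

    Returns the decoded fields and the number of read cycles issued.
    """
    result = {}
    reads = 0
    for name, (offset, length) in FIELDS.items():
        with open(path, "rb") as f:
            f.seek(offset)
            result[name] = f.read(length)
        reads += 1
    return result, reads


def read_bulk(path, size=256):
    """One contiguous read of the whole page, then slice in memory.

    Returns the decoded fields and the number of read cycles issued.
    """
    with open(path, "rb") as f:
        page = f.read(size)
    result = {
        name: page[offset:offset + length]
        for name, (offset, length) in FIELDS.items()
    }
    return result, 1
```

Both functions return the same field values. The difference is the number of read cycles, which is where a per-access bus lock and offset write would add up on real hardware. Whether this maps onto what the driver actually does is exactly the open question above.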

admin@sonic:~$ time sudo sfputil >sfp.txt

real 3m12.421s
user 0m0.844s
sys 0m0.124s

admin@sonic:~$ time sudo sfputil -p Ethernet236
Ethernet236: SFP detected
    Connector : No separable connector
    Encoding : 64B66B
    Extended Identifier : Unknown
    Extended RateSelect Compliance : QSFP+ Rate Select Version 1
    Identifier : Unknown
    Length Cable Assembly(m) : 1
    Nominal Bit Rate(100Mbs) : 255
    Specification compliance :
        SAS/SATA compliance codes : SAS 3.0G
    Vendor Date Code(YYYY-MM-DD Lot) : 2017-07-24
    Vendor Name : Amphenol
    Vendor OUI : 78-a7-14
    Vendor PN : FOQQD33P00001
    Vendor Rev : A
    Vendor SN : APE17300017DAP

real 0m3.345s
user 0m0.264s
sys 0m0.072s

Staphylo commented 6 years ago

We have noticed the poor performance of xcvr EEPROM reads on SONiC across our platforms. It is related to the default values we use in some places, which were chosen to ensure that all the I2C devices work. The fix will need to define per-device or per-bus values to improve performance. This improvement is on our to-do list.

yxieca commented 6 years ago

Thanks for the update!

Staphylo commented 4 years ago

The performance issue is still being investigated, but it is hard to solve given the synchronous behavior of the I2C subsystem in the kernel. However, this issue no longer directly impacts SONiC, since xcvr state is now published to STATE_DB and sfpshow returns the information instantaneously. Please reopen if this is still a blocker.