opnsense / src

OPNsense operating system on top of FreeBSD
https://opnsense.org/

axgbe: add support for the i2c interface in iflib #178

Closed swhite2 closed 3 weeks ago

swhite2 commented 1 year ago

Iflib provides support for the I2C ioctl(), primarily used by ifconfig -v, to fetch diagnostics data from both the static and user-writable parts of the EEPROMs in SFP(+) modules, in cases where a driver that supports it is loaded. axgbe does not implement this at the moment, but the boilerplate is already present (https://github.com/opnsense/src/blob/stable/23.1/sys/dev/axgbe/xgbe-phy-v2.c#LL1746C5-L1746C5). The change would allow administrators to monitor module temperature and voltage (if supported).
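For illustration, here is a minimal sketch of how the temperature and voltage readings could be decoded once the raw diagnostics page has been fetched. It assumes an SFF-8472 module with internally calibrated values and the real-time fields at bytes 96-105 of the A2h page; the function name and the sample page are hypothetical, and fetching the bytes from the driver is not shown.

```python
import struct


def decode_sff8472_diagnostics(a2_page: bytes) -> dict:
    """Decode real-time diagnostics from bytes 96-105 of the A2h page.

    Assumption: the module reports internally calibrated values.
    """
    if len(a2_page) < 106:
        raise ValueError("need at least 106 bytes of the A2h page")
    # Big-endian: one signed 16-bit temperature, then four unsigned fields.
    temp, vcc, bias, tx_pwr, rx_pwr = struct.unpack_from(">hHHHH", a2_page, 96)
    return {
        "temperature_c": temp / 256.0,  # 1/256 degree C per LSB
        "voltage_v": vcc * 100e-6,      # 100 uV per LSB
        "tx_bias_ma": bias * 2e-3,      # 2 uA per LSB
        "tx_power_mw": tx_pwr * 1e-4,   # 0.1 uW per LSB
        "rx_power_mw": rx_pwr * 1e-4,   # 0.1 uW per LSB
    }


# Hypothetical page contents, made up for this example.
page = bytearray(256)
struct.pack_into(">hHHHH", page, 96, 10383, 32727, 3423, 6945, 6192)
diag = decode_sff8472_diagnostics(bytes(page))
print(f"{diag['temperature_c']:.2f} C, {diag['voltage_v']:.4f} V")
```

Modules that use external calibration would need the slope and offset constants applied first, which this sketch leaves out.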

The current limitation is that the I2C bus operates at a fairly slow clock speed, so every ifconfig -v call takes a performance hit. In the OPNsense GUI, that command is primarily called to fetch the status of LAGG interfaces. We can improve this by forcing the driver into FAST mode (400 kHz) as opposed to NORMAL (100 kHz). However, the upcoming A30 platform includes an I/O expander chip that only supports 100 kHz.

This ticket serves as a description of the problem. At some point we should look into ways of implementing this routine without impacting performance. For now, leave it as is.
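To put rough numbers on the clock speed difference, here is a back-of-the-envelope sketch. It assumes an idealized random-read transaction (two address bytes plus one offset byte of overhead, nine clock periods per byte) and ignores START/STOP conditions, clock stretching, and driver overhead, so real transfers will be slower. The function name is hypothetical.

```python
def i2c_read_time_s(num_bytes: int, clock_hz: int) -> float:
    """Idealized wire time for one random-read transaction.

    Each byte costs 9 clock periods (8 data bits + ACK/NACK). The
    transaction carries 3 overhead bytes: device address (write),
    register offset, and device address (read).
    """
    overhead_bytes = 3
    clocks = (overhead_bytes + num_bytes) * 9
    return clocks / clock_hz


for name, hz in (("NORMAL", 100_000), ("FAST", 400_000)):
    ms = i2c_read_time_s(256, hz) * 1000
    print(f"{name}: {ms:.2f} ms per 256-byte page")
```

Under these assumptions a full 256-byte page costs about 23 ms of wire time at 100 kHz and about a quarter of that at 400 kHz, per page and per port.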

svengrun commented 1 year ago

Can implementing this be reconsidered? I am using axgbe with BIDI modules, and having the optical data is an important debugging aid that I would really like to have accessible. The same operation on the same hardware is fast under Linux. Maybe this and the i2c speed can be implemented as a tunable?

AdSchellevis commented 1 year ago

Fast is relative (the bus speed is unrelated to the OS). It's still on the wishlist, but does not have a very high priority due to the amount of work needed to fix this properly: we might need to cache values so ifconfig -v responds fast enough, or offer the option to enable the data with a tunable, which will likely bite us later as people report a slow[er] GUI after enabling the switch. No easy options, unfortunately.
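The caching idea could look roughly like the sketch below: a time-to-live cache in front of the slow read, so repeated queries within the window never touch the bus. This is only an illustration of the concept in Python, not driver code; the class name, the `read_fn` callback, and the 30-second default are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Serve cached per-port readings until they are older than ttl_s."""

    def __init__(self, read_fn: Callable[[str], Any], ttl_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._read_fn = read_fn  # the slow I2C read (hypothetical)
        self._ttl_s = ttl_s
        self._clock = clock      # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, port: str) -> Any:
        now = self._clock()
        hit = self._entries.get(port)
        if hit is not None and now - hit[0] < self._ttl_s:
            return hit[1]        # fresh enough: no bus access
        value = self._read_fn(port)
        self._entries[port] = (now, value)
        return value


# Usage with a stand-in reader that records each slow read.
bus_reads = []


def fake_read(port: str) -> dict:
    bus_reads.append(port)
    return {"port": port}


cache = TtlCache(fake_read, ttl_s=30.0)
cache.get("ax0")
cache.get("ax0")                 # served from the cache
print(f"bus reads: {len(bus_reads)}")
```

The trade-off is staleness: alarm conditions could show up as late as one TTL period after they occur, and the first query after expiry still pays the full read cost.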

svengrun commented 1 year ago

Just completed performance testing on a DEC850 running Debian 12 to define "fast". The speeds vary by SFP+ module, but even the slowest one I found completed the command in 0.160s. Some modules were much faster, at around 0.057s, but also had fewer details to output.

See output details of the slowest one below. What times are you seeing?

root@linux:~# time ethtool --module-info enp6s0f5
Identifier : 0x03 (SFP)
Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID)
Connector : 0x07 (LC)
Transceiver codes : 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Transceiver type : 10G Ethernet: 10G Base-SR
Encoding : 0x06 (64B/66B)
BR, Nominal : 10300MBd
Rate identifier : 0x00 (unspecified)
Length (SMF,km) : 0km
Length (SMF) : 0m
Length (50um) : 80m
Length (62.5um) : 20m
Length (Copper) : 0m
Length (OM3) : 300m
Laser wavelength : 850nm
Vendor name : OEM
Vendor OUI : 00:90:65
Vendor PN : SFP-10G-SR
Vendor rev : 02
Option values : 0x00 0x1a
Option : RX_LOS implemented
Option : TX_FAULT implemented
Option : TX_DISABLE implemented
BR margin, max : 0%
BR margin, min : 0%
Vendor SN : CSG101LB7316
Date code : 211130
Optical diagnostics support : Yes
Laser bias current : 6.846 mA
Laser output power : 0.6945 mW / -1.58 dBm
Receiver signal average optical power : 0.6192 mW / -2.08 dBm
Module temperature : 40.56 degrees C / 105.01 degrees F
Module voltage : 3.2727 V
Alarm/warning flags implemented : Yes
Laser bias current high alarm : Off
Laser bias current low alarm : Off
Laser bias current high warning : Off
Laser bias current low warning : Off
Laser output power high alarm : Off
Laser output power low alarm : Off
Laser output power high warning : Off
Laser output power low warning : Off
Module temperature high alarm : Off
Module temperature low alarm : Off
Module temperature high warning : Off
Module temperature low warning : Off
Module voltage high alarm : Off
Module voltage low alarm : Off
Module voltage high warning : Off
Module voltage low warning : Off
Laser rx power high alarm : Off
Laser rx power low alarm : Off
Laser rx power high warning : Off
Laser rx power low warning : Off
Laser bias current high alarm threshold : 15.000 mA
Laser bias current low alarm threshold : 1.000 mA
Laser bias current high warning threshold : 13.000 mA
Laser bias current low warning threshold : 2.000 mA
Laser output power high alarm threshold : 2.5118 mW / 4.00 dBm
Laser output power low alarm threshold : 0.1258 mW / -9.00 dBm
Laser output power high warning threshold : 1.9952 mW / 3.00 dBm
Laser output power low warning threshold : 0.1584 mW / -8.00 dBm
Module temperature high alarm threshold : 90.00 degrees C / 194.00 degrees F
Module temperature low alarm threshold : -10.00 degrees C / 14.00 degrees F
Module temperature high warning threshold : 85.00 degrees C / 185.00 degrees F
Module temperature low warning threshold : -5.00 degrees C / 23.00 degrees F
Module voltage high alarm threshold : 3.6000 V
Module voltage low alarm threshold : 2.9000 V
Module voltage high warning threshold : 3.5000 V
Module voltage low warning threshold : 3.0000 V
Laser rx power high alarm threshold : 3.1622 mW / 5.00 dBm
Laser rx power low alarm threshold : 0.0199 mW / -17.01 dBm
Laser rx power high warning threshold : 1.9952 mW / 3.00 dBm
Laser rx power low warning threshold : 0.0316 mW / -15.00 dBm

real 0m0.160s
user 0m0.000s
sys 0m0.007s
root@linux:~#
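For anyone reading the optical power lines, the mW and dBm figures are the same measurement in two units, related by dBm = 10 · log10(mW). A small sketch of that conversion, using values from the output above (the function name is hypothetical):

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert optical power from milliwatts to dBm."""
    if power_mw <= 0:
        return float("-inf")  # no light: dBm is unbounded below
    return 10.0 * math.log10(power_mw)


for label, mw in (("Laser output power", 0.6945),
                  ("Receiver signal average optical power", 0.6192)):
    print(f"{label}: {mw:.4f} mW / {mw_to_dbm(mw):.2f} dBm")
```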

swhite2 commented 1 year ago

160ms is a huge amount of latency. The reality is that the i2c bus itself has to be shared among all ports, so multiply this amount by the number of ports, up to 4x. On top of that, you have to consider bus contention between such reads and the link state poll mechanism, as well as error handling.
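As a rough worked example of that multiplication, assuming reads on the shared bus are fully serialized and each port takes the 160ms measured above (the function name is hypothetical; contention with link state polling and error retries would add to these figures):

```python
def serialized_latency_s(per_port_read_s: float, ports: int) -> float:
    """Total time when each port's read must wait for the previous one."""
    return per_port_read_s * ports


for ports in range(1, 5):
    total_ms = serialized_latency_s(0.160, ports) * 1000
    print(f"{ports} port(s): {total_ms:.0f} ms")
```

With four ports this comes to 640 ms for a single pass over all modules.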

Again, in terms of priority this requires a lot of effort to ensure stability on production systems.

FrankWeis commented 2 months ago

I am interested in having this. What I miss is some kind of identification of the plugged module, as well as RX/TX values.