dentproject / dentOS

dentOS SwitchDev based NOS

Non-working GBICs #265

Open paulmenzel opened 1 year ago

paulmenzel commented 1 year ago

This is a meta issue to collect and document non-working GBICs. Please create a separate issue for each GBIC and reference it here in a comment. The admins will update the issue description. Please keep the list sorted.

  1. Intel Corp FTLX8571D3BCV-IT rev A (Edgecore AS5114-48X) (issue https://github.com/dentproject/dentOS/issues/264)
  2. OPNEXT TRS5020EN-S301 (Edgecore AS5114-48X) (issue https://github.com/dentproject/dentOS/issues/262)

Tested and working

  1. module Intel Corp AFBR-703SDZ-IN2 rev G2.3 sn AA1329A5UTA dc 130718 (Edgecore AS5114-48X)

KanjiMonster commented 1 year ago

Just a wild guess: since both are optical modules, can you check whether changing the TX_DISABLE setting changes anything (it should be exposed as a sysfs value by the CPLD driver)?

In my experience, copper modules tend not to implement it, while optical modules tend to support it (and won't link if the signal is asserted).

Also make sure that this is true for the other side, wherever it is plugged in.

paulmenzel commented 1 year ago

A quick look didn’t reveal such a control:

# ls -lR /sys/module/arm64_accton_as4224_cpld
/sys/module/arm64_accton_as4224_cpld:
total 0
-r--r--r-- 1 root root 4096 Aug 31 14:12 coresize
drwxr-xr-x 2 root root    0 Aug 31 14:11 drivers
drwxr-xr-x 2 root root    0 Aug 31 14:11 holders
-r--r--r-- 1 root root 4096 Aug 31 14:13 initsize
-r--r--r-- 1 root root 4096 Jul 16 12:08 initstate
drwxr-xr-x 2 root root    0 Aug 31 14:11 notes
-r--r--r-- 1 root root 4096 Aug 31 14:12 refcnt
drwxr-xr-x 2 root root    0 Aug 31 14:11 sections
-r--r--r-- 1 root root 4096 Aug 31 14:13 taint
--w------- 1 root root 4096 Jul 16 12:08 uevent

/sys/module/arm64_accton_as4224_cpld/drivers:
total 0
lrwxrwxrwx 1 root root 0 Aug 31 14:13 i2c:as4224_cpld -> ../../../bus/i2c/drivers/as4224_cpld

/sys/module/arm64_accton_as4224_cpld/holders:
total 0
lrwxrwxrwx 1 root root 0 Aug 31 14:13 arm64_accton_as4224_fan -> ../../arm64_accton_as4224_fan
lrwxrwxrwx 1 root root 0 Aug 31 14:13 arm64_accton_as4224_gpio_i2c -> ../../arm64_accton_as4224_gpio_i2c
lrwxrwxrwx 1 root root 0 Aug 31 14:13 arm64_accton_as4224_psu -> ../../arm64_accton_as4224_psu

/sys/module/arm64_accton_as4224_cpld/notes:
total 0

/sys/module/arm64_accton_as4224_cpld/sections:
total 0
-r-------- 1 root root 19 Aug 31 14:13 __dyndbg
-r-------- 1 root root 19 Aug 31 14:13 __jump_table
-r-------- 1 root root 19 Aug 31 14:13 __kcrctab
-r-------- 1 root root 19 Aug 31 14:13 __ksymtab
-r-------- 1 root root 19 Aug 31 14:13 __ksymtab_strings
-r-------- 1 root root 19 Aug 31 14:13 __mcount_loc

KanjiMonster commented 1 year ago

You need to follow the drivers/i2c:as4224_cpld symlink. There you should find a subdirectory named X-YYYY, for the bus and address where the driver is bound, and in that directory a lot of file attributes you can read and write, AFAICT from arm64-accton-as4224-cpld.
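
As a rough sketch of that lookup: I2C client devices show up in sysfs as bus-address entries such as 0-0040, so listing the entries of that shape under the driver directory shows where the driver is bound. The function name `list_bound_devices` is made up for illustration, and the path in the usage comment is simply the symlink target from the listing above.

```sh
# list_bound_devices DRIVER_DIR: print bus-address entries (e.g. 0-0040)
# found directly under an I2C driver directory in sysfs.
list_bound_devices() {
    for dev in "$1"/[0-9]*-[0-9a-f][0-9a-f][0-9a-f][0-9a-f]; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$dev" ] || continue
        basename "$dev"
    done
}

# Usage on the switch (path taken from the symlink shown earlier):
#   list_bound_devices /sys/bus/i2c/drivers/as4224_cpld
```

Entries such as bind, unbind, and uevent do not match the pattern, so only bound devices are printed.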

paulmenzel commented 1 year ago

Thank you.

 # grep . /sys/module/arm64_accton_as4224_cpld/drivers/i2c\:as4224_cpld/0-0040/*18*
 /sys/module/arm64_accton_as4224_cpld/drivers/i2c:as4224_cpld/0-0040/module_present_18:1
 /sys/module/arm64_accton_as4224_cpld/drivers/i2c:as4224_cpld/0-0040/module_rx_los_18:1
 /sys/module/arm64_accton_as4224_cpld/drivers/i2c:as4224_cpld/0-0040/module_tx_disable_18:0
 /sys/module/arm64_accton_as4224_cpld/drivers/i2c:as4224_cpld/0-0040/module_tx_fault_18:0

I am going to try your suggestion next week.
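
For checking other ports the same way, a small helper along these lines could print the four per-port attributes in one go. This is only a sketch: `sfp_port_status` and the `CPLD_DIR` variable are illustrative names, and the default path is the one from the output above, which will differ on other platforms.

```sh
# Directory holding the CPLD attributes; the default is the path seen in this
# thread and is an assumption for any other platform.
CPLD_DIR="${CPLD_DIR:-/sys/module/arm64_accton_as4224_cpld/drivers/i2c:as4224_cpld/0-0040}"

# sfp_port_status PORT: print each attribute as name=value, or name=n/a
# when the attribute file is missing or unreadable.
sfp_port_status() {
    port="$1"
    for attr in module_present module_rx_los module_tx_disable module_tx_fault; do
        file="$CPLD_DIR/${attr}_${port}"
        if [ -r "$file" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$file")"
        else
            printf '%s=n/a\n' "$attr"
        fi
    done
}

# Usage on the switch: sfp_port_status 18
```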

taraschornyiplv commented 1 year ago

@paulmenzel, are you using the latest CPLD image that you reported in #228?

paulmenzel commented 1 year ago

Yes, 1.09.

root@ec-as5114-48x-03:~# i2cget -f 0 0x40 0x01 b
WARNING! This program can confuse your I2C bus, cause data loss and worse!
I will read from device file /dev/i2c-0, chip address 0x40, data address
0x01, using read byte data.
Continue? [Y/n] Y
0x01
root@ec-as5114-48x-03:~# i2cget -f 0 0x40 0xff b
WARNING! This program can confuse your I2C bus, cause data loss and worse!
I will read from device file /dev/i2c-0, chip address 0x40, data address
0xff, using read byte data.
Continue? [Y/n] 
0x09
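
The two reads above can be combined into one version string. This sketch assumes that register 0x01 holds the major and register 0xff the minor version, which is only inferred from the "1.09" answer and the two values read; `cpld_version` is an illustrative name, and printing the minor byte as two hex digits is a guess at the vendor's convention.

```sh
# cpld_version MAJOR MINOR: join two register bytes (e.g. 0x01 0x09)
# into a dotted version string; the minor part is printed as two hex digits.
cpld_version() {
    printf '%x.%02x\n' "$1" "$2"
}

# On the switch (needs root and i2c-tools); -y skips the confirmation prompt:
#   cpld_version "$(i2cget -f -y 0 0x40 0x01 b)" "$(i2cget -f -y 0 0x40 0xff b)"
```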

taraschornyiplv commented 1 year ago

It looks like rx_los is up. When rx_los is up, the port will be in the down state.

paulmenzel commented 1 year ago

> It looks like rx_los is up. When rx_los is up, the port will be in the down state.

Excuse my ignorance. Does that mean it is a CPLD firmware issue?

taraschornyiplv commented 1 year ago

> > It looks like rx_los is up. When rx_los is up, the port will be in the down state.
>
> Excuse my ignorance. Does that mean it is a CPLD firmware issue?

I do not think so.

paulmenzel commented 1 year ago

> I do not think so.

;-)

Where do you think the problem is?

KanjiMonster commented 1 year ago

Btw, where did you connect the other side? Is this an AS5114-48X as well? If not, can you check what happens when you cross-connect two ports on your AS5114-48X?

taraschornyiplv commented 1 year ago

> > I do not think so.
>
> ;-)
>
> Where do you think the problem is?

It might be that the SFP module itself has this pin pulled up unconditionally. I've seen it several times. Also, the pin can be inverted in some modules.

I'd recommend connecting two of these modules in a loopback on a switch and checking rx_los. Then unplug one and check rx_los again.
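
The outcome of that loopback test can be read off from the two rx_los values. A minimal sketch, assuming rx_los=1 means "loss of signal asserted" as described earlier in the thread; `rx_los_verdict` is an illustrative name, and the values would come from the module_rx_los_N attribute shown above.

```sh
# rx_los_verdict WITH_LINK WITHOUT_LINK: interpret the rx_los value read
# while the peer module is connected and again after it is unplugged.
rx_los_verdict() {
    case "$1/$2" in
        0/1) echo "rx_los follows the signal (standard polarity)" ;;
        1/0) echo "rx_los follows the signal (inverted polarity)" ;;
        0/0|1/1) echo "rx_los stuck at $1" ;;
        *) echo "unexpected values: $1 $2" ;;
    esac
}

# Example: read module_rx_los_18 with the loop connected, unplug one side,
# read it again, then call: rx_los_verdict "$with_link" "$without_link"
```

A result of "stuck at 1" would match the suspicion that the module pulls the pin up unconditionally.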

KanjiMonster commented 1 year ago

> > > I do not think so.
> >
> > ;-) Where do you think the problem is?
>
> It might be that the SFP module itself has this pin pulled up unconditionally. I've seen it several times. Also, the pin can be inverted in some modules.

At least according to their EEPROMs, they aren't:

0040: 00 3a

and

0040: 00 1a

?a = b'....1010

SFF-8472 says:

| A0h | Bit | Description |
| --- | --- | --- |
| 65 | 5 | RATE_SELECT functionality is implemented. NOTE: Lack of implementation does not indicate lack of simultaneous compliance with multiple standard rates. Compliance with particular standards should be determined from Transceiver Code Section (Table 5-3). Refer to Table 5-6 for Rate_Select functionality type identifiers. |
| | 4 | TX_DISABLE is implemented and disables the high speed serial output. |
| | 3 | TX_FAULT signal implemented. (See SFF-8419) |
| | 2 | Loss of Signal implemented, signal inverted from standard definition in SFP MSA (often called "Signal Detect"). NOTE: This is not standard SFP/GBIC behavior and should be avoided, since non-interoperable behavior results. |
| | 1 | Loss of Signal implemented, behavior as defined in SFF-8419 (often called "Rx_LOS"). |
| | 0 | Reserved |

So they both (claim to) implement TX_DISABLE, TX_FAULT, and RX_LOS with standard behavior, and the Intel module additionally supports RATE_SELECT.
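
The same decoding can be done mechanically. A small sketch that maps bits 5 to 1 of A0h byte 65 to the names from the table quoted above; `decode_sfp_options` is an illustrative name, and the two sample values are the second bytes of the dumps shown earlier.

```sh
# decode_sfp_options BYTE: list the option bits set in A0h byte 65,
# one name per line, following the SFF-8472 bit assignments quoted above.
decode_sfp_options() {
    value=$(( $1 ))
    if [ $(( value & 0x20 )) -ne 0 ]; then echo "RATE_SELECT"; fi
    if [ $(( value & 0x10 )) -ne 0 ]; then echo "TX_DISABLE"; fi
    if [ $(( value & 0x08 )) -ne 0 ]; then echo "TX_FAULT"; fi
    if [ $(( value & 0x04 )) -ne 0 ]; then echo "RX_LOS (inverted, Signal Detect)"; fi
    if [ $(( value & 0x02 )) -ne 0 ]; then echo "RX_LOS (standard)"; fi
}

decode_sfp_options 0x3a   # RATE_SELECT, TX_DISABLE, TX_FAULT, RX_LOS (standard)
decode_sfp_options 0x1a   # TX_DISABLE, TX_FAULT, RX_LOS (standard)
```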