sflow / host-sflow

host-sflow agent
http://sflow.net
Other
149 stars 55 forks source link

Update_osindex_to_Support_Port_breakout #66

Open mohanapriya-meganathan opened 6 months ago

mohanapriya-meganathan commented 6 months ago

Issue: When the user updates the interface for the dynamic port breakout with global sFlow enable, hsflowd is not updated with the Linux index for the port which is available across the breakout.

Explanation:

sonic(config)# do show interface breakout 1/11 1x100G Completed Ethernet40

After breakout, sonic(config)# do show interface breakout 1/11 4x10G Completed Ethernet40 Ethernet41 Ethernet42 Ethernet43

Index of Ethernet41,42,43 are updated properly as it is updated as new ports. But for the Ethernet40 it is not updated, and it is still with the old Linux index.

Sample UT Log:

dbg1:sonic db_portNamesCB: reply=array(146) dbg1:sonic db_portNamesCB: new port Ethernet41 -> oid:0x1000000000f33 dbg1:sonic db_portNamesCB: new port Ethernet42 -> oid:0x1000000000f5c dbg1:sonic db_portNamesCB: new port Ethernet43 -> oid:0x1000000000f85 dbg1:sonic db_select(6) dbg1:sonic db_select returned 0 dbg1:sonic db_getIfIndexMap() dbg1:sonic db_getIfIndexMap() returned 0 dbg2:mS=128172 agentCB_sendPkt() sending datagram: 112 dbg1:sonic db_selectCB: reply=status(0)="OK" dbg1:sonic db_ifIndexMapCB: reply=array(4) dbg1:sonic db_ifIndexMapCB: index=string(2)="44" dbg1:sonic db_ifIndexMapCB: ifindex=string(3)="985" dbg1:setAdaptorAlias(MOD_SONIC): NULL alias == Ethernet43 (changed=YES) dbg1:setAdaptorSelectionPriority(MOD_SONIC): Ethernet43 0 -> 44 (changed=YES) dbg1:sonic db_getIfIndexMap() dbg1:sonic db_getIfIndexMap() returned 0 dbg1:sonic db_ifIndexMapCB: reply=array(4) dbg1:sonic db_ifIndexMapCB: index=string(2)="43" dbg1:sonic db_ifIndexMapCB: ifindex=string(3)="984" dbg1:setAdaptorAlias(MOD_SONIC): NULL alias == Ethernet42 (changed=YES) dbg1:setAdaptorSelectionPriority(MOD_SONIC): Ethernet42 0 -> 43 (changed=YES) dbg1:sonic db_getIfIndexMap() dbg1:sonic db_getIfIndexMap() returned 0 dbg1:sonic db_ifIndexMapCB: reply=array(4) dbg1:sonic db_ifIndexMapCB: index=string(2)="42" dbg1:sonic db_ifIndexMapCB: ifindex=string(3)="983" dbg1:setAdaptorAlias(MOD_SONIC): NULL alias == Ethernet41 (changed=YES) dbg1:setAdaptorSelectionPriority(MOD_SONIC): Ethernet41 0 -> 42 (changed=YES) dbg1:sonic db_getIfIndexMap() dbg1:sonic db_getIfIndexMap() returned 0 dbg1:sonic db_ifIndexMapCB: reply=array(4) dbg1:sonic db_ifIndexMapCB: index=string(2)="41" dbg1:sonic db_ifIndexMapCB: ifindex=string(3)="982" dbg1:setAdaptorAlias(MOD_SONIC): NULL alias == Ethernet40 (changed=YES) dbg1:setAdaptorSelectionPriority(MOD_SONIC): Ethernet40 0 -> 41 (changed=YES)

sflow commented 6 months ago

Thanks!

However I think it's better to just request that the port osIndex be remapped. I made this change in a new branch called "v2.0" and tagged a release 2.0.56-1 from there so you can try it. What I did is almost the same as your commit. But by leaving the old ifIndex in place the db_getIfIndexCB() function sees the change from old-value to new-value. That way it gets to run this cleanup code: https://github.com/sflow/host-sflow/blob/v2.0/src/Linux/mod_sonic.c#L995-L1005

Make sense?

Please let me know if version 2.0.56-1 works for this case, so I can merge this change into the master branch too.

mohanapriya-meganathan commented 6 months ago

Hi Team, I have seen error in the log while testing with your code,

dbg1:sonic db_portStateCB: Ethernet40 adaptor lookup failed (osIndex=259)
dbg1:sonic db_portStateCB: Ethernet40((null)) marked as switchPort

I have added the debug statements here with your changes,

            setStr(&prt->oid, p_oid->str);
+            requestPortIfIndexDiscovery(mod, prt);
            signalCounterDiscontinuity(mod, prt);
+           myDebug(1, "sonic db_portNamesCB: OID Changed port %s -> %s", prt->portName, prt->oid);
          }
          if(prt->osIndex == HSP_SONIC_IFINDEX_UNDEFINED) {
-            if(prt->unmappedPort == NO) {
-             // queue it for ifIndex discovery
-             UTArrayPush(mdata->unmappedPorts, prt);
-             prt->unmappedPort = YES;
-           }
+           // queue it for ifIndex discovery
+           requestPortIfIndexDiscovery(mod, prt);
+           myDebug(1, "sonic db_portNamesCB: IfIndex discovery port %s -> %s", prt->portName, prt->oid);
          }

Attached the snippet of hsflowd log for more information.

hsflowd_issue.txt

@Gokulnath-Raja, Thanks for your guidance.

sflow commented 6 months ago

The underlying mechanism for discovering Linux interfaces (known as adaptors) only runs periodically, usually every minute. That happens in readInterfaces.c. So at the point where the interface breakout command has triggered the appearance of these new interfaces in redis, hsflowd may not have discovered the corresponding adaptors yet. The HSPSonicPort objects are still created in mod_sonic, but the rendez-vous is not completed until the adaptor is discovered.

So if I have understood correctly, this log message that you saw:

dbg1:sonic db_portStateCB: Ethernet40((null)) marked as switchPort

happened when the new adaptor (with Linux ifIndex 259) was eventually discovered.

But can you confirm that sFlow sampling/polling for ifIndex 259 was seen in the output? That would be proof that it worked OK. I might make another change just to make sure that the underlying SFLAdaptor is definitely marked as a switchPort in this sequence. I am concerned that there may be a corner-case where that doesn't happen. The HSPSonicPort should remember if it has not yet told its underlying adaptor that it is a switchPort...

Thank you so much for your help with this.

mohanapriya-meganathan commented 6 months ago

Yes, I am able to see the sampling with the new Linux ifindex (259) after port breakout.

dbg2:selected sampler Ethernet40 ifIndex=259
dbg1:psample netlink (type=34) CMD = 0
dbg3:psample: grp=1
dbg2:psample: grp=1 in=259 out=260 n=10000 seq=4996 drops=0 pktlen=128
takeSample: hook=0 tap=Ethernet40 in=Ethernet40 out=Ethernet41 pkt_len=114 cap_len=114 mac_len=14 (00008593D387 -> 185A581FEFA3 et=0x8100)
dbg2:selected sampler Ethernet40 ifIndex=259
dbg2:mS=61052 agentCB_sendPkt() sending datagram: 67
sflow commented 6 months ago

Thanks. I still think there is a corner case that needs attention: I think counter-samples from ifIndex=259 would not have appeared until the next packet-sample to/from that port. For now I'll address that in the master branch and keep changes in the v2.0 branch to a minimum.

sflow commented 6 months ago

Can I ask you to test this against the master branch too? If you need there to be a tag or release then let me know. It would be helpful to know that it handles the interface breakout gracefully with the hardware setup that you have.