sonic-net / sonic-mgmt

Configuration management examples for SONiC

[sflow] testDelAgent failed #10291

Open nhe-NV opened 9 months ago

nhe-NV commented 9 months ago

Description: The test still fails even after https://github.com/sonic-net/sonic-mgmt/pull/9766/ was merged.

Steps to reproduce the issue:

  1. Run the sflow test testDelAgent

Describe the results you received: {"changed": true, "cmd": "//env-python3/bin/ptf --test-dir ptftests/py3 sflow_test --platform-dir ptftests --platform remote -t 'testbed_type='\"'\"'t0'\"'\"';router_mac='\"'\"'9c:05:91:9b:56:00'\"'\"';dst_port=3;agent_id='\"'\"'10.245.20.41'\"'\"';sflow_ports_file='\"'\"'/tmp/sflow_ports.json'\"'\"';polling_int=20;active_collectors=\"['\"'\"'collector0'\"'\"','\"'\"'collector1'\"'\"']\"' --relax --debug info --log-file /tmp/TestAgentId.testDelAgent.log --socket-recv-size 16384", "delta": "0:00:30.819610", "end": "2023-10-10 23:56:54.898380", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2023-10-10 23:56:24.078770", "stderr": "//env-python3/bin/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses\n import imp\nsflow_test.SflowTest ... FAIL\n\n======================================================================\nFAIL: sflow_test.SflowTest\n----------------------------------------------------------------------\nTraceback (most recent call last):\n File \"ptftests/py3/sflow_test.py\", line 310, in runTest\n 'collector0', self.poll_tests)\n File \"ptftests/py3/sflow_test.py\", line 189, in packet_analyzer\n data, collector, self.polling_int, port_sample)\n File \"ptftests/py3/sflow_test.py\", line 211, in analyze_counter_sample\n % (self.agent_id, rcvd_agent_id))\nAssertionError: False is not true : Agent id in Sampled packet is not expected . Expected : 10.245.20.41 , received : 20.1.1.1\n\n----------------------------------------------------------------------\nRan 1 test in 29.349s\n\nFAILED (failures=1)", "stderr_lines": ["/*/env-python3/bin/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses", " import imp", "sflow_test.SflowTest ... 
FAIL", "", "======================================================================", "FAIL: sflow_test.SflowTest", "----------------------------------------------------------------------", "Traceback (most recent call last):", " File \"ptftests/py3/sflow_test.py\", line 310, in runTest", " 'collector0', self.poll_tests)", " File \"ptftests/py3/sflow_test.py\", line 189, in packet_analyzer", " data, collector, self.polling_int, port_sample)", " File \"ptftests/py3/sflow_test.py\", line 211, in analyze_counter_sample", " % (self.agent_id, rcvd_agent_id))", "AssertionError: False is not true : Agent id in Sampled packet is not expected . Expected : 10.245.20.41 , received : 20.1.1.1", "", "----------------------------------------------------------------------", "Ran 1 test in 29.349s", "", "FAILED (failures=1)"], "stdout": "Using packet manipulation module: ptf.packet_scapy\n\n**\nATTENTION: SOME TESTS DID NOT PASS!!!\n\nThe following tests failed:\nSflowTest\n\n**", "stdout_lines": ["Using packet manipulation module: ptf.packet_scapy", "", "**", "ATTENTION: SOME TESTS DID NOT PASS!!!", "", "The following tests failed:", "SflowTest", "", "**"]}

Describe the results you expected:

Additional information you deem important:

**Output of `show version`:**

```
202305_RC.7-c8447efe1_Internal
```

**Attach debug file `sudo generate_dump`:**

(paste your output here)
Gokulnath-Raja commented 9 months ago

@nhe-NV could you share the hsflowd version used for this test?

Gokulnath-Raja commented 9 months ago

After upgrading hsflowd to 2.0.51, the agent id that hsflowd selects once the configured agent id is deleted is not deterministic. In our setup we see the mgmt_ip (100.104.62.4) preferred over the loopback (10.1.0.32). It looks like a different agent id is being selected in your setup as well. Could you confirm whether 20.1.1.1 is configured on the loopback or on some other interface in your setup? @dgsudharsan kindly help here. It looks like the agent id is selected randomly, based on the IP addresses configured in the test topology.

dgsudharsan commented 9 months ago

@Gokulnath-Raja Please talk to the hsflowd team to understand the algorithm, and then update the test logic based on it.

sflow commented 9 months ago

Hello all, the logic for auto-selecting the sflow-agent-address was tweaked in this commit back in November 2022: https://github.com/sflow/host-sflow/commit/458295bcd2c598e5a121e8094eecbce56da126e4

I'm not sure if this is the exact commit that explains what you are seeing, but the address priority defined here: https://github.com/sflow/host-sflow/blob/v2.0.51-26/src/Linux/hsflowd.h#L259-L273 is certainly intending to choose 100.104.62.4 (IPSP_IP4) in preference to 10.1.0.32 (IPSP_IP4_RFC1918).

The reasoning behind this is that the global IP 100.104.62.4 is more likely to be unique and reachable from anywhere. It would be easy to imagine a multi-site network where two switches (perhaps on different LANs) both had 10.1.0.32.
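To make the distinction concrete, here is a minimal Python sketch that checks whether an IPv4 address falls inside one of the RFC 1918 private blocks. It is not hsflowd code: the helper name `is_rfc1918` is hypothetical, and the two-way labeling at the end is a simplification that ignores the other classes hsflowd distinguishes.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr is an IPv4 address inside an RFC 1918 block."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


# Simplification: any IPv4 address outside RFC 1918 is labeled IPSP_IP4 here,
# although hsflowd has further classes (loopback, self-assigned, and so on).
for addr in ("100.104.62.4", "10.1.0.32"):
    label = "IPSP_IP4_RFC1918" if is_rfc1918(addr) else "IPSP_IP4"
    print(addr, label)
```

With the two addresses from this thread, 10.1.0.32 lands in the RFC 1918 class and 100.104.62.4 does not, which matches the preference described above.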

Does this answer the question?

dgsudharsan commented 9 months ago

@Gokulnath-Raja the test fix made in https://github.com/sonic-net/sonic-mgmt/pull/9766 needs to be redone with a more deterministic approach. @sflow Thanks for your explanation. We need the test to choose the next possible IP when we delete the agent, and for that the chosen IP needs to be deterministic. Can you please share the entire approach so that the sonic-mgmt test suite can integrate and verify it?

mohanapriya-meganathan commented 8 months ago

Hi all, I have analyzed how hsflowd selects the agent-id after the user-configured agent-id is deleted. Here is the default agent-id selection logic in hsflowd:

  1. If agentip or agentname is configured in the settings or hardcoded in the configuration file, the agent-ip is chosen from those settings or that configuration file.

  2. Otherwise, hsflowd collects all configured IP interfaces (IPv4 and IPv6) and assigns each one the ipPriority (EnumIPSelectionPriority) that matches the interface.

https://github.com/sflow/host-sflow/blob/6296a172c2c3879126298dc66994d38e68956185/src/Linux/hsflowconfig.c#L1019

typedef enum {
  IPSP_NONE=0,
  IPSP_CLASS_E,
  IPSP_MULTICAST,
  IPSP_LOOPBACK6,
  IPSP_LOOPBACK4,
  IPSP_SELFASSIGNED4,
  IPSP_IP6_SCOPE_LINK,
  IPSP_VLAN6,
  IPSP_VLAN4,
  IPSP_IP6_SCOPE_UNIQUE,
  IPSP_IP6_SCOPE_GLOBAL,
  IPSP_IP4_RFC1918,
  IPSP_IP4,
  IPSP_NUM_PRIORITIES,
} EnumIPSelectionPriority;

  3. Based on the selection priority, the IP interface with the highest priority is selected as the agent-id: https://github.com/sflow/host-sflow/blob/6296a172c2c3879126298dc66994d38e68956185/src/Linux/hsflowconfig.c#L1112
  4. If two interfaces have the same selection priority, the interface index and the adaptor selection priority of each interface are compared.
  5. If the interface index is the same, the first discovered IP is chosen as the agent-id.
  6. If the interface index is different, the adaptor with the lower selection priority is chosen as the agent-id.
  7. If the adaptor priority is the same, the interface with the lower interface index is chosen as the agent-id.
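The steps above can be sketched in Python as a single ordered comparison. This is only an illustration of the summary in this comment, not a port of hsflowd: the `Candidate` record, its field names, and the `select_agent` helper are hypothetical, and the interface indexes and adaptor priorities in the example are made up.

```python
from dataclasses import dataclass

# Higher rank = more preferred, following the order of the enum quoted above.
IP_PRIORITY = {name: rank for rank, name in enumerate([
    "IPSP_NONE", "IPSP_CLASS_E", "IPSP_MULTICAST", "IPSP_LOOPBACK6",
    "IPSP_LOOPBACK4", "IPSP_SELFASSIGNED4", "IPSP_IP6_SCOPE_LINK",
    "IPSP_VLAN6", "IPSP_VLAN4", "IPSP_IP6_SCOPE_UNIQUE",
    "IPSP_IP6_SCOPE_GLOBAL", "IPSP_IP4_RFC1918", "IPSP_IP4",
])}


@dataclass
class Candidate:            # hypothetical record, not an hsflowd structure
    address: str
    ip_class: str           # one of the IP_PRIORITY keys
    adaptor_priority: int   # lower is preferred, per the summary above
    if_index: int           # lower is preferred when everything else ties


def select_agent(candidates):
    """Pick the agent address using the tie-break order from the summary."""
    # min() returns the first minimal element, so the first discovered
    # candidate wins when every key compares equal.
    return min(
        candidates,
        key=lambda c: (-IP_PRIORITY[c.ip_class], c.adaptor_priority, c.if_index),
    )


# Addresses from this thread; if_index and adaptor_priority are invented.
candidates = [
    Candidate("10.1.0.32", "IPSP_IP4_RFC1918", adaptor_priority=1, if_index=1),
    Candidate("100.104.62.4", "IPSP_IP4", adaptor_priority=1, if_index=2),
]
print(select_agent(candidates).address)
```

Reproducing this in the test would also require classifying every address and adaptor the same way hsflowd does, which is where most of the complexity lies.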

This is complex to implement in the test script. Instead, we could have a command or some other way to display the agent-id that hsflowd selected when it falls back to the default agent-id.

sflow commented 8 months ago

Good summary. The chosen agent-address is written as one of the lines in the file /etc/hsflowd.auto (intended for other programs to read if they are going to export application-sFlow samples to the same collector). So I think that might be the easiest place for the test script to find it. Here is an example:


# WARNING: Do not edit this file. It is generated automatically by hsflowd.
rev_start=1
hostname=inmon
sampling=400
header=128
datagram=1400
polling=30
sampling.http=1
collector=127.0.0.1
agentIP=2001:468:1f07:ff1a::106
agent=ens3
ds_index=1
rev_end=1
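
A test script could read the selected address with a small parser along these lines. This is a minimal sketch that assumes the file consists only of `key=value` lines and `#` comments, as in the example above; the function name `parse_hsflowd_auto` is hypothetical, and how the test retrieves the file content from the device is left out.

```python
def parse_hsflowd_auto(text: str) -> dict:
    """Parse key=value lines from hsflowd.auto content, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Assumption: a key may appear more than once (for example several
        # collectors), so every value is kept in order of appearance.
        settings.setdefault(key, []).append(value)
    return settings


# Content copied from the example file above.
SAMPLE = """\
# WARNING: Do not edit this file. It is generated automatically by hsflowd.
rev_start=1
hostname=inmon
sampling=400
header=128
datagram=1400
polling=30
sampling.http=1
collector=127.0.0.1
agentIP=2001:468:1f07:ff1a::106
agent=ens3
ds_index=1
rev_end=1
"""

agent_ip = parse_hsflowd_auto(SAMPLE)["agentIP"][0]
print(agent_ip)
```

The test could then pass this value as the expected agent id instead of predicting which address hsflowd will pick.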