Open nhe-NV opened 9 months ago
@nhe-NV could you please share the hsflowd version used for this test?
After upgrading hsflowd to 2.0.51, the agent ID that hsflowd selects once the configured agent ID is deleted does not appear to be deterministic. In our setup, mgmt_ip (100.104.62.4) is preferred over loopback (10.1.0.32). It looks like a different agent ID is being selected in your setup as well. Could you confirm whether 20.1.1.1 is configured on the loopback in your setup, or on some other interface? @dgsudharsan kindly help here. It looks like the agent ID is selected arbitrarily from the IP addresses configured in the test topology.
@Gokulnath-Raja Please talk with the hsflowd team to understand the algorithm, then update the test logic based on it.
Hello all, the logic for auto-selecting the sflow-agent-address was tweaked in this commit back in November 2022: https://github.com/sflow/host-sflow/commit/458295bcd2c598e5a121e8094eecbce56da126e4
I'm not sure if this is the exact commit that explains what you are seeing, but the address priority defined here: https://github.com/sflow/host-sflow/blob/v2.0.51-26/src/Linux/hsflowd.h#L259-L273 is certainly intending to choose 100.104.62.4 (IPSP_IP4) in preference to 10.1.0.32 (IPSP_IP4_RFC1918).
The reasoning behind this is that the global IP 100.104.62.4 is more likely to be unique and reachable from anywhere. It would be easy to imagine a multi-site network where two switches (perhaps on different LANs) both had 10.1.0.32.
Does this answer the question?
@Gokulnath-Raja the test fix that was done here https://github.com/sonic-net/sonic-mgmt/pull/9766 needs to be reworked using a more deterministic approach. @sflow Thanks for your explanation. We need the test to be aligned so that it chooses the next possible IP when we delete the agent. For that, the chosen IP needs to be deterministic. Can you please share the entire approach so that the sonic-mgmt test suite can integrate and verify it?
Hi all, I have analyzed the logic hsflowd uses to select the agent-id after the user-configured agent-id is deleted. Here is the default agent-id selection logic in hsflowd:
1. If agentip or agentname is configured in the settings or hardcoded in the configuration file, the agent-ip is chosen from those settings or that configuration file.
2. If the first condition does not apply, hsflowd collects all configured IP interfaces (IPv4 and IPv6) and assigns each one the ipPriority (EnumIPSelectionPriority) that matches the interface:
typedef enum { IPSP_NONE=0,
IPSP_CLASS_E,
IPSP_MULTICAST,
IPSP_LOOPBACK6,
IPSP_LOOPBACK4,
IPSP_SELFASSIGNED4,
IPSP_IP6_SCOPE_LINK,
IPSP_VLAN6,
IPSP_VLAN4,
IPSP_IP6_SCOPE_UNIQUE,
IPSP_IP6_SCOPE_GLOBAL,
IPSP_IP4_RFC1918,
IPSP_IP4,
IPSP_NUM_PRIORITIES,
} EnumIPSelectionPriority;
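To make the ordering concrete, here is a minimal Python sketch that ranks addresses using the same enum order. It is only an approximation, not the hsflowd implementation: it assumes a higher enum value wins (consistent with IPSP_IP4 being preferred over IPSP_IP4_RFC1918 in this thread), it classifies by address alone (so the interface-based VLAN4/VLAN6 classes are never assigned), and the tie-break between equal priorities is a guess. The names `Priority`, `classify` and `pick_agent` are hypothetical.

```python
import ipaddress
from enum import IntEnum


class Priority(IntEnum):
    # Mirrors the order of EnumIPSelectionPriority quoted above.
    NONE = 0
    CLASS_E = 1
    MULTICAST = 2
    LOOPBACK6 = 3
    LOOPBACK4 = 4
    SELFASSIGNED4 = 5
    IP6_SCOPE_LINK = 6
    VLAN6 = 7   # interface-based, never assigned by this sketch
    VLAN4 = 8   # interface-based, never assigned by this sketch
    IP6_SCOPE_UNIQUE = 9
    IP6_SCOPE_GLOBAL = 10
    IP4_RFC1918 = 11
    IP4 = 12


RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
CLASS_E = ipaddress.ip_network("240.0.0.0/4")
UNIQUE_LOCAL6 = ipaddress.ip_network("fc00::/7")


def classify(addr_str):
    """Map an address string to an approximate selection priority."""
    addr = ipaddress.ip_address(addr_str)
    if addr.is_unspecified:
        return Priority.NONE
    if addr.version == 4:
        if addr in CLASS_E:
            return Priority.CLASS_E
        if addr.is_multicast:
            return Priority.MULTICAST
        if addr.is_loopback:
            return Priority.LOOPBACK4
        if addr.is_link_local:          # 169.254.0.0/16
            return Priority.SELFASSIGNED4
        if any(addr in net for net in RFC1918):
            return Priority.IP4_RFC1918
        return Priority.IP4
    if addr.is_multicast:
        return Priority.MULTICAST
    if addr.is_loopback:
        return Priority.LOOPBACK6
    if addr.is_link_local:              # fe80::/10
        return Priority.IP6_SCOPE_LINK
    if addr in UNIQUE_LOCAL6:
        return Priority.IP6_SCOPE_UNIQUE
    return Priority.IP6_SCOPE_GLOBAL


def pick_agent(addresses):
    """Return the address with the highest priority (first one wins on ties)."""
    return max(addresses, key=classify)


if __name__ == "__main__":
    print(pick_agent(["10.1.0.32", "100.104.62.4"]))
    print(pick_agent(["10.245.20.41", "20.1.1.1"]))
```

Under these assumptions the sketch prefers 100.104.62.4 over 10.1.0.32 and 20.1.1.1 over 10.245.20.41, which is consistent with the agent IDs reported in this issue.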
It would be complex to reimplement this in the test script. Instead, we could have a command, or some other way, to display the agent-id that hsflowd selects when it falls back to the default.
Good summary. The chosen agent-address is written as one of the lines in the file /etc/hsflowd.auto (intended for other programs to read if they are going to export application-sFlow samples to the same collector). So I think that might be the easiest place for the test script to find it. Here is an example:
# WARNING: Do not edit this file. It is generated automatically by hsflowd.
rev_start=1
hostname=inmon
sampling=400
header=128
datagram=1400
polling=30
sampling.http=1
collector=127.0.0.1
agentIP=2001:468:1f07:ff1a::106
agent=ens3
ds_index=1
rev_end=1
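As a rough illustration of how a test script might read the selected address from this file, here is a minimal Python sketch. It assumes only the `key=value` layout and `#` comment lines shown in the example above; the function name `read_agent_ip` is hypothetical, and fetching the file contents from the DUT is left to the test framework.

```python
def read_agent_ip(auto_text):
    """Return the agentIP value from hsflowd.auto contents, or None if absent."""
    for line in auto_text.splitlines():
        line = line.strip()
        # Skip blank lines and the generated-file warning comment.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "agentIP":
            return value.strip()
    return None


if __name__ == "__main__":
    sample = """# WARNING: Do not edit this file. It is generated automatically by hsflowd.
rev_start=1
hostname=inmon
collector=127.0.0.1
agentIP=2001:468:1f07:ff1a::106
agent=ens3
rev_end=1
"""
    print(read_agent_ip(sample))
```

The test could then compare this value against the agent address carried in the received sFlow datagrams instead of predicting it from the topology.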
Description The test still fails even after https://github.com/sonic-net/sonic-mgmt/pull/9766/ was merged.
Steps to reproduce the issue:
Describe the results you received: {"changed": true, "cmd": "//env-python3/bin/ptf --test-dir ptftests/py3 sflow_test --platform-dir ptftests --platform remote -t 'testbed_type='\"'\"'t0'\"'\"';router_mac='\"'\"'9c:05:91:9b:56:00'\"'\"';dst_port=3;agent_id='\"'\"'10.245.20.41'\"'\"';sflow_ports_file='\"'\"'/tmp/sflow_ports.json'\"'\"';polling_int=20;active_collectors=\"['\"'\"'collector0'\"'\"','\"'\"'collector1'\"'\"']\"' --relax --debug info --log-file /tmp/TestAgentId.testDelAgent.log --socket-recv-size 16384", "delta": "0:00:30.819610", "end": "2023-10-10 23:56:54.898380", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2023-10-10 23:56:24.078770", "stderr": "//env-python3/bin/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses\n import imp\nsflow_test.SflowTest ... FAIL\n\n======================================================================\nFAIL: sflow_test.SflowTest\n----------------------------------------------------------------------\nTraceback (most recent call last):\n File \"ptftests/py3/sflow_test.py\", line 310, in runTest\n 'collector0', self.poll_tests)\n File \"ptftests/py3/sflow_test.py\", line 189, in packet_analyzer\n data, collector, self.polling_int, port_sample)\n File \"ptftests/py3/sflow_test.py\", line 211, in analyze_counter_sample\n % (self.agent_id, rcvd_agent_id))\nAssertionError: False is not true : Agent id in Sampled packet is not expected . Expected : 10.245.20.41 , received : 20.1.1.1\n\n----------------------------------------------------------------------\nRan 1 test in 29.349s\n\nFAILED (failures=1)", "stderr_lines": ["/*/env-python3/bin/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses", " import imp", "sflow_test.SflowTest ... 
FAIL", "", "======================================================================", "FAIL: sflow_test.SflowTest", "----------------------------------------------------------------------", "Traceback (most recent call last):", " File \"ptftests/py3/sflow_test.py\", line 310, in runTest", " 'collector0', self.poll_tests)", " File \"ptftests/py3/sflow_test.py\", line 189, in packet_analyzer", " data, collector, self.polling_int, port_sample)", " File \"ptftests/py3/sflow_test.py\", line 211, in analyze_counter_sample", " % (self.agent_id, rcvd_agent_id))", "AssertionError: False is not true : Agent id in Sampled packet is not expected . Expected : 10.245.20.41 , received : 20.1.1.1", "", "----------------------------------------------------------------------", "Ran 1 test in 29.349s", "", "FAILED (failures=1)"], "stdout": "Using packet manipulation module: ptf.packet_scapy\n\n**\nATTENTION: SOME TESTS DID NOT PASS!!!\n\nThe following tests failed:\nSflowTest\n\n**", "stdout_lines": ["Using packet manipulation module: ptf.packet_scapy", "", "**", "ATTENTION: SOME TESTS DID NOT PASS!!!", "", "The following tests failed:", "SflowTest", "", "**"]}
Describe the results you expected:
Additional information you deem important:
202305_RC.7-c8447efe1_Internal