Closed italovalcy closed 3 years ago
Hi folks, just some additional information:
Hi @italovalcy, I'll check why that happens
Hi @cmagnobarbosa and folks,
I think I narrowed down why this is happening. The consistency routine in the way it is implemented today depends on the FLOW_STATS reply to be able to check the stored flows against the switch's flow table. The FLOW_STATS is run by of_core and has nothing to do with flow_manager consistency check, which means that it can be run in a totally different frequency. Suppose the flow_stats run less frequently than consistency_check. The problem then happens because consistency_check compares with an empty array for the switch's flow table and then send a FLOW_MOD to install the apparently missing flows (they are not actually missing, but they were just not seen by of_core at that point).
In fact, the FLOW_STATS is run according to the setting of_core.settings.STATS_INTERVAL and consistency check is run based on flow_manager.settings.CONSISTENCY_INTERVAL.
One way to work around this issue is by changing the of_core.settings.STATS_INTERVAL to something lower than CONSISTENCY_INTERVAL (keeping in mind that the consistency interval could be run right after or right before flow_stats)...
In terms of a proposal for this issue, my suggestions would be:
Hello @italovalcy, I agree with you, I'll implement a new event in the of_core
(https://github.com/kytos/of_core/issues/122) NApp to notify interested NApps when new flow statistics are available, making it possible to improve and fix the consistency check.
Hi folks,
Another thing is the way flow_manager persists the flows: looks like it has a list of flows and the flow_mod command (add, delete, delete_strict). This is dangerous because then flow_manager will have to guarantee the order in which the commands should be run and looks like it is not the case (see below).
I've created some debug messages to understand how consistency check works:
diff --git a/main.py b/main.py
index c737f85..ab86ff4 100644
--- a/main.py
+++ b/main.py
@@ -88,6 +98,8 @@ class Main(KytosNApp):
"""Check consistency of installed flows for a specific switch."""
dpid = switch.dpid
+ log.info("ITALO-DEBUG check_switch_consistency dpid=%s stored_flows=%s switch_flowtable=%s" % (dpid, self.stored_flows.get(d
pid, None), switch.flows))
+
# Flows stored in storehouse
stored_flows = self.stored_flows[dpid]['flow_list']
@@ -117,6 +129,7 @@ class Main(KytosNApp):
"""Check consistency of installed flows for a specific switch."""
dpid = switch.dpid
+ log.info("ITALO-DEBUG check_storehouse_consistency dpid=%s stored_flows=%s switch_flowtable=%s" % (dpid, self.stored_flows.g
et(dpid, None), switch.flows))
for installed_flow in switch.flows:
if dpid not in self.stored_flows:
log.info('A consistency problem was detected in '
Then one of the messages shows the following information:
2021-04-23 17:13:12,159 - INFO [kytos.napps.kytos/flow_manager] (Thread-178) ITALO-DEBUG check_switch_consistency dpid=00:00:00:00:00:00:00:01 stored_flows={'flow_list': [{'command': 'add', 'flow': {'priority': 1000, 'table_id': 0, 'match': {'dl_type': 35020, 'dl_vlan': 3799}, 'actions': [{'action_type': 'output', 'port': 4294967293}]}}, {'command': 'delete', 'flow': {'cookie': 12280154957751673672, 'cookie_mask': 18446744073709551615}}, {'command': 'add', 'flow': {'match': {'in_port': 1, 'dl_vlan': 101}, 'cookie': 12280154957751673672, 'actions': [{'action_type': 'set_vlan', 'vlan_id': 101}, {'action_type': 'output', 'port': 2}]}}, {'command': 'add', 'flow': {'match': {'in_port': 2, 'dl_vlan': 101}, 'cookie': 12280154957751673672, 'actions': [{'action_type': 'set_vlan', 'vlan_id': 101}, {'action_type': 'output', 'port': 1}]}}]} switch_flowtable=[<napps.kytos.of_core.v0x04.flow.Flow object at 0x7fb51427d438>]
As you can see in the "stored_flows" there is a list of commands [add, delete, add, add]. Those commands/flows were created by the following sequence of events: start kytos (so the LLDP will be created), create an intra-switch EVC (so the two add flows), and then disable the EVC. The order in the stored_flows doesn't match with the sequence of events. Indeed, after disabling the EVC, the flows are recreated:
mininet> sh while true; do ovs-ofctl dump-flows s1; echo; sleep 1; done
...
cookie=0xaa6bdfdefb8a6348, duration=47.632s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth1",dl_vlan=101 actions=mod_vlan_vid:101,output:"s
1-eth2"
cookie=0xaa6bdfdefb8a6348, duration=47.619s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth2",dl_vlan=101 actions=mod_vlan_vid:101,output:"s
1-eth1"
cookie=0x0, duration=81.748s, table=0, n_packets=0, n_bytes=0, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535
cookie=0x0, duration=82.752s, table=0, n_packets=0, n_bytes=0, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535
cookie=0xaa6bdfdefb8a6348, duration=0.702s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth1",dl_vlan=101 actions=mod_vlan_vid:101,output:"s1
-eth2"
cookie=0xaa6bdfdefb8a6348, duration=0.688s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth2",dl_vlan=101 actions=mod_vlan_vid:101,output:"s1
-eth1"
cookie=0x0, duration=83.756s, table=0, n_packets=0, n_bytes=0, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535
Hi folks,
Another thing is the way flow_manager persists the flows: looks like it has a list of flows and the flow_mod command (add, delete, delete_strict). This is dangerous because then flow_manager will have to guarantee the order in which the commands should be run and looks like it is not the case (see below).
I've created some debug messages to understand how consistency check works:
diff --git a/main.py b/main.py index c737f85..ab86ff4 100644 --- a/main.py +++ b/main.py @@ -88,6 +98,8 @@ class Main(KytosNApp): """Check consistency of installed flows for a specific switch.""" dpid = switch.dpid + log.info("ITALO-DEBUG check_switch_consistency dpid=%s stored_flows=%s switch_flowtable=%s" % (dpid, self.stored_flows.get(d pid, None), switch.flows)) + # Flows stored in storehouse stored_flows = self.stored_flows[dpid]['flow_list'] @@ -117,6 +129,7 @@ class Main(KytosNApp): """Check consistency of installed flows for a specific switch.""" dpid = switch.dpid + log.info("ITALO-DEBUG check_storehouse_consistency dpid=%s stored_flows=%s switch_flowtable=%s" % (dpid, self.stored_flows.g et(dpid, None), switch.flows)) for installed_flow in switch.flows: if dpid not in self.stored_flows: log.info('A consistency problem was detected in '
Then one of the messages shows the following information:
2021-04-23 17:13:12,159 - INFO [kytos.napps.kytos/flow_manager] (Thread-178) ITALO-DEBUG check_switch_consistency dpid=00:00:00:00:00:00:00:01 stored_flows={'flow_list': [{'command': 'add', 'flow': {'priority': 1000, 'table_id': 0, 'match': {'dl_type': 35020, 'dl_vlan': 3799}, 'actions': [{'action_type': 'output', 'port': 4294967293}]}}, {'command': 'delete', 'flow': {'cookie': 12280154957751673672, 'cookie_mask': 18446744073709551615}}, {'command': 'add', 'flow': {'match': {'in_port': 1, 'dl_vlan': 101}, 'cookie': 12280154957751673672, 'actions': [{'action_type': 'set_vlan', 'vlan_id': 101}, {'action_type': 'output', 'port': 2}]}}, {'command': 'add', 'flow': {'match': {'in_port': 2, 'dl_vlan': 101}, 'cookie': 12280154957751673672, 'actions': [{'action_type': 'set_vlan', 'vlan_id': 101}, {'action_type': 'output', 'port': 1}]}}]} switch_flowtable=[<napps.kytos.of_core.v0x04.flow.Flow object at 0x7fb51427d438>]
As you can see in the "stored_flows" there is a list of commands [add, delete, add, add]. Those commands/flows were created by the following sequence of events: start kytos (so the LLDP will be created), create an intra-switch EVC (so the two add flows), and then disable the EVC. The order in the stored_flows doesn't match with the sequence of events. Indeed, after disabling the EVC, the flows are recreated:
mininet> sh while true; do ovs-ofctl dump-flows s1; echo; sleep 1; done ... cookie=0xaa6bdfdefb8a6348, duration=47.632s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth1",dl_vlan=101 actions=mod_vlan_vid:101,output:"s 1-eth2" cookie=0xaa6bdfdefb8a6348, duration=47.619s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth2",dl_vlan=101 actions=mod_vlan_vid:101,output:"s 1-eth1" cookie=0x0, duration=81.748s, table=0, n_packets=0, n_bytes=0, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535 cookie=0x0, duration=82.752s, table=0, n_packets=0, n_bytes=0, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535 cookie=0xaa6bdfdefb8a6348, duration=0.702s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth1",dl_vlan=101 actions=mod_vlan_vid:101,output:"s1 -eth2" cookie=0xaa6bdfdefb8a6348, duration=0.688s, table=0, n_packets=0, n_bytes=0, in_port="s1-eth2",dl_vlan=101 actions=mod_vlan_vid:101,output:"s1 -eth1" cookie=0x0, duration=83.756s, table=0, n_packets=0, n_bytes=0, priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535
@cmagnobarbosa I was working deeper with this and I realized it is related to something else: archiving the EVC through PATCH and not through the DELETE method. So please, forget about the above comment. I will register a new issue in mef_eline for this.
Hi,
The consistency check keeps inserting and removing flows on a switch, even in the simplest running scenario: one switch, only OF_LLDP flows. This seems to happen in the first N runs (from my tests, in the first 5 runs), then it gets "stable."
How to reproduce:
Expected output: the "duration" of the LLDP flows should be only increasing.
Actual output: the flows are recreated during the first N runs and then gets "stable", i.e., the duration start increasing:
From Kytos logs:
This behavior is not correct because it means that the network traffic will be unstable (flapping) every time the controller gets restarted.