kytos-ng / flow_manager

Kytos NApp that manages OpenFlow 1.3 entries
https://kytos-ng.github.io/api/flow_manager.html
MIT License

bug: threaded delayed barrier reply flow confirmation can end up setting a deleted flow as installed #208

Closed viniarck closed 1 week ago

viniarck commented 2 weeks ago

This is the root cause of issue https://github.com/kytos-ng/mef_eline/issues/549

In summary, the threaded handler on_ofpt_barrier_reply might get delayed or preempted, which can happen if the barrier reply is a bit slower than usual or simply because of thread scheduling. If the same flow gets deleted before the handler finishes, the handler incorrectly sets the flow as installed again. The consistency check then tries to install this flow again, and the result would be considered a garbage/orphan flow.

Here's how to reproduce:

kytos $> 2024-10-31 08:55:09,714 - INFO [kytos.napps.kytos/flow_manager] (AnyIO worker thread) Send FlowMod from request dpid: 00:00:00:00:00:00:00:01, command: add, force: False, total_length: 1,  flows[0, 1]: [{'match': {'in_port': 1, 'dl_vlan': 200}, 'instructions': [{'instruction_type': 'apply_actions', 'actions': [{'action_type': 'output', 'port': 2}]}]}]
2024-10-31 08:55:09,721 - INFO [uvicorn.access] (MainThread) 127.0.0.1:39196 - "POST /api/kytos/flow_manager/v2/flows/00%3A00%3A00%3A00%3A00%3A00%3A00%3A01 HTTP/1.1" 202
2024-10-31 08:55:09,728 - INFO [kytos.napps.kytos/flow_manager] (AnyIO worker thread) Send FlowMod from request dpid: 00:00:00:00:00:00:00:01, command: delete, force: False, total_length: 1,  flows[0, 1]: [{'match': {'in_port': 1, 'dl_vlan': 200}, 'instructions': [{'instruction_type': 'apply_actions', 'actions': [{'action_type': 'output', 'port': 2}]}]}]
2024-10-31 08:55:09,738 - INFO [uvicorn.access] (MainThread) 127.0.0.1:39212 - "DELETE /api/kytos/flow_manager/v2/flows/00%3A00%3A00%3A00%3A00%3A00%3A00%3A01 HTTP/1.1" 202
kytos $> 2024-10-31 08:57:31,467 - INFO [kytos.napps.kytos/flow_manager] (thread_pool_sb_5) Consistency check: missing 1 flows on switch 00:00:00:00:00:00:00:01.
2024-10-31 08:57:31,469 - INFO [kytos.napps.kytos/flow_manager] (thread_pool_sb_5) Flows forwarded to switch 00:00:00:00:00:00:00:01 to be installed. total_length: 1,  flows[0, 1]: [{'table_id': 0, 'table_group': 'base', 'priority': 32768, 'cookie': 0, 'idle_timeout': 0, 'hard_timeout': 0, 'match': {'in_port': 1, 'dl_vlan': 200}, 'instructions': [{'instruction_type': 'apply_actions', 'actions': [{'action_type': 'output', 'port': 2}]}]}]
rs0 [direct: primary] napps> db.flows.find({"switch": "00:00:00:00:00:00:00:01", "flow.match.in_port": 1, "flow.match.dl_vlan": 200})
[
  {
    _id: 'f7d62685768a4072275ac13c54f12c14',
    flow: {
      table_id: 0,
      table_group: 'base',
      priority: 32768,
      cookie: Decimal128("0"),
      idle_timeout: 0,
      hard_timeout: 0,
      match: { in_port: 1, dl_vlan: 200 },
      instructions: [
        {
          instruction_type: 'apply_actions',
          actions: [ { action_type: 'output', port: 2 } ]
        }
      ]
    },
    flow_id: '085b6b19a3d68ed4066bb67aef69f0cf',
    id: 'f7d62685768a4072275ac13c54f12c14',
    inserted_at: ISODate("2024-10-31T11:55:09.716Z"),
    state: 'installed',
    switch: '00:00:00:00:00:00:00:01',
    updated_at: ISODate("2024-10-31T11:55:14.725Z")
  }
]
rs0 [direct: primary] napps> 
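
For anyone wanting to script the two back-to-back requests seen in the log above, here is a rough sketch using only the Python standard library. The endpoint path, dpid and flow are taken from the log. The base URL `http://127.0.0.1:8181` and the request body shape (`{"flows": [...], "force": false}`) are assumptions, and `build_request` / `reproduce` are hypothetical helper names. Since the bug depends on timing, a single run may not trigger it.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "http://127.0.0.1:8181"  # assumption: local Kytos API address
DPID = "00:00:00:00:00:00:00:01"    # dpid from the log above
FLOW = {                            # flow from the log above
    "match": {"in_port": 1, "dl_vlan": 200},
    "instructions": [
        {
            "instruction_type": "apply_actions",
            "actions": [{"action_type": "output", "port": 2}],
        }
    ],
}


def build_request(method, dpid=DPID, flow=FLOW, base_url=BASE_URL):
    """Build (but do not send) a flow_manager request for the given method."""
    # The dpid is percent-encoded, as in the uvicorn access log lines.
    url = f"{base_url}/api/kytos/flow_manager/v2/flows/{quote(dpid, safe='')}"
    # Assumed body shape: a list of flows plus the force flag.
    body = json.dumps({"flows": [flow], "force": False}).encode()
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, method=method, headers=headers)


def reproduce():
    """Send the add immediately followed by the delete (needs a running Kytos)."""
    for method in ("POST", "DELETE"):
        with urlopen(build_request(method)) as resp:
            print(method, resp.status)
```

Calling `reproduce()` against a running instance sends the add and the delete a few milliseconds apart, similar to the timestamps in the log. Afterwards the `db.flows.find(...)` query above can be used to check whether the flow ended up as `installed`.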

How to fix

The handler for kytos/of_core.flow_stats.received only sets flows as installed if they were pending. The barrier reply handler should follow the same approach, so that a flow that has already been deleted won't be incorrectly updated.
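
To illustrate the idea, here is a minimal in-memory sketch, not the actual flow_manager code. It assumes a flow document has a `state` that is one of `pending`, `installed` or `deleted` (the `deleted` value is an assumption; only `installed` appears in the output above). The names `FlowDoc`, `confirm_unconditionally`, `confirm_if_pending` and `simulate` are hypothetical. The interleaving is replayed deterministically rather than with real threads.

```python
from dataclasses import dataclass


@dataclass
class FlowDoc:
    flow_id: str
    state: str  # assumed values: "pending", "installed", "deleted"


def confirm_unconditionally(store, flow_id):
    """Problematic behavior: overwrite whatever state the flow has."""
    doc = store.get(flow_id)
    if doc is None:
        return False
    doc.state = "installed"
    return True


def confirm_if_pending(store, flow_id):
    """Proposed behavior: only a pending flow can become installed."""
    doc = store.get(flow_id)
    if doc is None or doc.state != "pending":
        return False
    doc.state = "installed"
    return True


def simulate(confirm):
    """Replay the interleaving: add, delete, then the delayed barrier reply."""
    store = {"f1": FlowDoc("f1", "pending")}  # add request stores the flow
    store["f1"].state = "deleted"             # delete request lands first
    confirm(store, "f1")                      # delayed barrier reply handler runs
    return store["f1"].state


print(simulate(confirm_unconditionally))  # installed (deleted flow resurrected)
print(simulate(confirm_if_pending))       # deleted (confirmation ignored)
```

With a database backend, the same guard would typically be expressed as a filter on the update itself (match on the flow and on the pending state), so that the check and the write happen as one operation instead of a read followed by a write.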

viniarck commented 1 week ago

Landed on https://github.com/kytos-ng/flow_manager/pull/211