kytos-ng / sdntrace_cp

Invalid trace when having EPL EVC in a NNI #102

Closed italovalcy closed 1 year ago

italovalcy commented 1 year ago

Hi,

When we have an Ethernet Private Line (EPL) EVC on an NNI, and we also have some EVPLs (on UNIs), sdntrace_cp gets confused and gives a wrong result. This probably has to do with the fact that EPL and EVPL flows have different priorities in mef_eline. I'm not sure sdntrace_cp deals well with priorities (it looks like it does not).

How to reproduce:

  1. Run Kytos with the latest version and Mininet with a linear topology (4 switches)

  2. Create an EPL EVC:

    curl -H 'Content-type: application/json' -X POST http://127.0.0.1:8181/api/kytos/mef_eline/v2/evc/ -d '{"name": "pw_s2", "uni_a": {"interface_id": "00:00:00:00:00:00:00:02:2"}, "uni_z": { "interface_id": "00:00:00:00:00:00:00:02:3"}}'
  3. Create an EVPL EVC:

    curl -H 'Content-type: application/json' -X POST http://127.0.0.1:8181/api/kytos/mef_eline/v2/evc/ -d '{"name": "evc-vlan-999", "dynamic_backup_path": true,  "uni_z": {"tag": {"value": 999, "tag_type": 1}, "interface_id": "00:00:00:00:00:00:00:04:1"}, "uni_a": {"tag": {"value": 999, "tag_type": 1}, "interface_id": "00:00:00:00:00:00:00:02:1"}}'
  4. Run a trace from UNI A and UNI Z for the EVPL EVC:

    curl -H 'Content-type: application/json' -X PUT http://127.0.0.1:8181/api/amlight/sdntrace_cp/v1/trace -d '{"trace": {"switch": {"dpid": "00:00:00:00:00:00:00:02", "in_port": 1}, "eth": {"dl_type": 33024, "dl_vlan": 999}}}'
    curl -H 'Content-type: application/json' -X PUT http://127.0.0.1:8181/api/amlight/sdntrace_cp/v1/trace -d '{"trace": {"switch": {"dpid": "00:00:00:00:00:00:00:04", "in_port": 1}, "eth": {"dl_type": 33024, "dl_vlan": 999}}}'

Expected result: when running from UNI_A we should end on UNI_Z, and vice versa.

Actual result:

root@be420d5eb226:/# curl -H 'Content-type: application/json' -X PUT http://127.0.0.1:8181/api/amlight/sdntrace_cp/v1/trace -d '{"trace": {"switch": {"dpid": "00:00:00:00:00:00:00:02", "in_port": 1}, "eth": {"dl_type": 33024, "dl_vlan": 999}}}'
{"result":[{"dpid":"00:00:00:00:00:00:00:02","port":1,"time":"2023-06-30 04:59:57.240037","type":"starting","vlan":999},{"dpid":"00:00:00:00:00:00:00:03","port":2,"time":"2023-06-30 04:59:57.240094","type":"intermediary","vlan":1},{"dpid":"00:00:00:00:00:00:00:04","port":2,"time":"2023-06-30 04:59:57.240122","type":"last","vlan":1,"out":{"port":1,"vlan":999}}]}
root@be420d5eb226:/#
root@be420d5eb226:/# curl -H 'Content-type: application/json' -X PUT http://127.0.0.1:8181/api/amlight/sdntrace_cp/v1/trace -d '{"trace": {"switch": {"dpid": "00:00:00:00:00:00:00:04", "in_port": 1}, "eth": {"dl_type": 33024, "dl_vlan": 999}}}'
{"result":[{"dpid":"00:00:00:00:00:00:00:04","port":1,"time":"2023-06-30 05:00:21.091026","type":"starting","vlan":999},{"dpid":"00:00:00:00:00:00:00:03","port":3,"time":"2023-06-30 05:00:21.091135","type":"intermediary","vlan":1},{"dpid":"00:00:00:00:00:00:00:01","port":2,"time":"2023-06-30 05:00:21.091198","type":"incomplete","vlan":1,"out":null}]}
root@be420d5eb226:/#

As you can see above, sdntrace_cp ended up reaching SW01 (I suppose due to the EPL EVC).

italovalcy commented 1 year ago

The impact of this is that every time Kytos is reloaded, the EVCs in this situation get redeployed unnecessarily.

viniarck commented 1 year ago

@italovalcy, good catch.

sdntrace_cp relies on flow_manager's GET /v2/stored_flows endpoint, which returns the flows sorted by descending priority and ascending updated_at. Following the expected lookup order, it looks like sdntrace_cp didn't match correctly when looking up the match from s3 with dl_vlan=1,in_port=3; it ended up matching the less specific (and lower priority) entry of the EPL:

        {
            "flow": {
                "actions": [
                    {
                        "action_type": "pop_vlan"
                    },
                    {
                        "action_type": "output",
                        "port": 1
                    }
                ],
                "cookie": 12301880484014349642,
                "hard_timeout": 0,
                "idle_timeout": 0,
                "match": {
                    "dl_vlan": 1,
                    "in_port": 3
                },
                "owner": "mef_eline",
                "priority": 20000,
                "table_group": "evpl",
                "table_id": 0
            },
            "flow_id": "f894d937ae8fc116f280bd31a8e92ade",
            "id": "248d71faee75dedbc72d460b583de367",
            "inserted_at": "2023-06-30T13:43:11.477000",
            "state": "installed",
            "switch": "00:00:00:00:00:00:00:02",
            "updated_at": "2023-06-30T13:43:11.486000"
        },
        ....
        {
            "flow": {
                "actions": [
                    {
                        "action_type": "output",
                        "port": 2
                    }
                ],
                "cookie": 12257353072959234369,
                "hard_timeout": 0,
                "idle_timeout": 0,
                "match": {
                    "in_port": 3
                },
                "owner": "mef_eline",
                "priority": 10000,
                "table_group": "epl",
                "table_id": 0
            },
            "flow_id": "34aadc32cdc5b4ca13c69545a3c46ef3",
            "id": "a23cefee5070460f2acc61f19aaf98db",
            "inserted_at": "2023-06-30T13:42:49.706000",
            "state": "installed",
            "switch": "00:00:00:00:00:00:00:02",
            "updated_at": "2023-06-30T13:42:49.718000"
        },
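To make the expected lookup order concrete, here's a minimal sketch (not sdntrace_cp's actual implementation) of how a lookup over these stored flows is supposed to behave, with flows ordered by descending priority and ascending updated_at and the first subset match winning:

    # Sketch only, assuming the stored_flows JSON structure shown above.
    def first_matching_flow(stored_flows, packet):
        """Return the first stored flow whose match fields are a subset of the packet."""
        ordered = sorted(
            stored_flows,
            key=lambda f: (-f["flow"]["priority"], f["updated_at"]),
        )
        for entry in ordered:
            match = entry["flow"].get("match", {})
            if all(packet.get(field) == value for field, value in match.items()):
                return entry
        return None

    # For a packet with {"in_port": 3, "dl_vlan": 1}, the priority-20000 EVPL entry
    # above should win; matching the priority-10000 EPL entry ({"in_port": 3}) first
    # is the incorrect behavior being described.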

The installed flows on s2 have coherent priorities as expected too. Here are the two flow entries matching on s2-eth3 (relevant for tracing from s4):

 cookie=0xaab90f1f3efc5d4a, duration=357.425s, table=0, n_packets=0, n_bytes=0, send_flow_rem priority=20000,in_port="s2-eth3",dl_vlan=1 actions=pop_vlan,output:"s2-eth1"
  cookie=0xaa1addad78472141, duration=379.195s, table=0, n_packets=134, n_bytes=6292, send_flow_rem priority=10000,in_port="s2-eth3" actions=output:"s2-eth2"
❯ sudo ovs-ofctl -O OpenFlow13 dump-flows s2
 cookie=0xac00000000000002, duration=403.822s, table=0, n_packets=0, n_bytes=0, send_flow_rem priority=50000,dl_src=ee:ee:ee:ee:ee:01 actions=CONTROLLER:65535
 cookie=0xac00000000000002, duration=403.820s, table=0, n_packets=0, n_bytes=0, send_flow_rem priority=50000,dl_src=ee:ee:ee:ee:ee:03 actions=CONTROLLER:65535
 cookie=0xaab90f1f3efc5d4a, duration=357.427s, table=0, n_packets=0, n_bytes=0, send_flow_rem priority=20000,in_port="s2-eth1",dl_vlan=999 actions=set_field:5095->vlan_vid,push_vlan:0x88a8,set_field:4097->vlan_vid,output:"s2-eth3"
 cookie=0xaab90f1f3efc5d4a, duration=357.425s, table=0, n_packets=0, n_bytes=0, send_flow_rem priority=20000,in_port="s2-eth3",dl_vlan=1 actions=pop_vlan,output:"s2-eth1"
 cookie=0xaa1addad78472141, duration=379.197s, table=0, n_packets=134, n_bytes=6292, send_flow_rem priority=10000,in_port="s2-eth2" actions=output:"s2-eth3"
 cookie=0xaa1addad78472141, duration=379.195s, table=0, n_packets=134, n_bytes=6292, send_flow_rem priority=10000,in_port="s2-eth3" actions=output:"s2-eth2"
 cookie=0xab00000000000002, duration=404.080s, table=0, n_packets=18, n_bytes=756, send_flow_rem priority=1000,dl_vlan=3799,dl_type=0x88cc actions=CONTROLLER:65535

Let me also attach a logical diagram of the topology to facilitate for future readers too, since I ended up drawing it to confirm my understanding:

(attached image: 20230630_111641 — logical topology diagram)

@gretelliz can you help with this one? Italo has tagged it with 2022.3, so it'll need to be backported (and cherry-picked) too.

Ktmi commented 1 year ago

Something I found: in the trace from s4 port 1, the trace completely skips s2. I think this problem is starting somewhere other than sdntrace_cp.

Looking at the topology, I am getting some weird behavior where the interfaces on Switch 2 have links going to Switch 1 and Switch 3 as expected; however, on Switch 3, the interfaces don't have a link to Switch 2, and instead have a link to Switch 1. The same goes for Switch 1, which has a link to Switch 3 instead. It's as if Switch 2 is being treated as a tunnel. The flow tables matched entirely correctly; it's just that the topology used to find the next switch is wrong here.

Ktmi commented 1 year ago

I discussed this issue a bit with @viniarck and he says it's occurring because the of_lldp packets are getting routed through the EPL, resulting in Kytos assuming a link exists between S1 and S3. According to him it's related to this issue: kytos-ng/of_lldp#85.

viniarck commented 1 year ago

I discussed this issue a bit with @viniarck and he says it's occurring because the of_lldp packets are getting routed through the EPL, resulting in Kytos assuming a link exists between S1 and S3. According to him it's related to this issue: kytos-ng/of_lldp#85.

@Ktmi, indeed. If the LLDP flow entry priority is always higher than the EPL's, then this issue doesn't happen. Still, I was a bit puzzled about how the incorrect trace was happening, so, based on your last comment, I ended up debugging it too. I found out that on topology, historically, whenever a link is discovered there's a sanity check/setter that ensures an interface only belongs to a single link, which makes a lot of sense. With the EPL, what ends up happening is that s3-eth2 ends up with this extra link on the same interface; consequently, whenever a link discovery takes place, it keeps cycling, so the actual trace output isn't deterministic. For instance, here's the interface having its link updated, which sdntrace uses to find the next switch:

❯ curl -s http://0.0.0.0:8181/api/kytos/topology/v3/interfaces | jq '.interfaces["00:00:00:00:00:00:00:03:2"].link'
"7437dc2f30a534e10b85501fd36dcc2022e05858c86b48ee188a121eb1090b37"
❯ curl -s http://0.0.0.0:8181/api/kytos/topology/v3/interfaces | jq '.interfaces["00:00:00:00:00:00:00:03:2"].link'
"4d42dc0852278accac7d9df15418f6d921db160b13d674029a87cef1b5f67f30"

With that said, we'll need to prioritize https://github.com/kytos-ng/of_lldp/issues/85 instead for 2022.3 if @italovalcy wants this fixed. I'd also expect liveness to still work for this case, since LLDP won't be encapsulated. Indeed, having a rediscovery via an EPL is like a recursive definition that was never initially supported, which would also imply that we would never have liveness going through an EPL based on the current matches as they are.

We need to confirm with @italovalcy and @jab1982 if they have any other use case that we need to be aware of. Other than that, if we can just set the of_lldp priority higher and don't have any need for liveness connectivity via an EPL (which, theoretically, is what sdntrace_cp and sdntrace are for too - ensuring end-to-end traceability), then we'd be good to proceed, I'd say. @Ktmi, could you confirm with them and help out with the linked issue? Initially Aldo was assigned to it, but he's taking care of other bugs.

Thanks for helping out with this issue and analyzing.

italovalcy commented 1 year ago

Hi @viniarck and @Ktmi, yes, it looks like the of_lldp flow priority is the root cause of this issue with 2022.3. After changing the of_lldp settings to use priority=50000 (similar to coloring), this issue was not observed anymore.
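For reference, the change is just a settings bump in of_lldp (illustrative sketch only; double-check the exact constant name in the NApp's settings.py):

    # of_lldp settings.py (illustrative; the constant name may differ)
    FLOW_PRIORITY = 50000  # above mef_eline's EPL (10000) and EVPL (20000) flow priorities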

However, after trying the same setup on the master branch, it looks like we have a regression for EPL traces. At some point the type=incomplete was introduced into sdntrace, and mef_eline's trace-invalid check also validates the type. Furthermore, we changed the behavior of sdntrace a bit in some scenarios, replacing the break statement with a do_trace=False (which ultimately impacts the last result being added to the trace result and leads to wrong trace processing):

https://github.com/kytos-ng/sdntrace_cp/blob/2022.3.1/main.py#L120-L122

https://github.com/kytos-ng/sdntrace_cp/blob/master/main.py#L128-L130

and:

https://github.com/kytos-ng/mef_eline/blob/2022.3.0/models/evc.py#L1157-L1168

https://github.com/kytos-ng/mef_eline/blob/master/models/evc.py#L1222-L1225
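For context, here's a simplified, hypothetical sketch of the difference between the two loop styles (not the actual main.py code), showing why the do_trace=False variant ends up appending one extra step:

    # Hypothetical, simplified trace loop; match_step is an assumed placeholder
    # for the flow lookup, returning None when no flow matches on the current switch.
    def match_step(step):
        """Placeholder lookup; returns a trace entry dict or None."""
        ...

    def tracepath_with_break(steps):
        result = []
        for step in steps:
            entry = match_step(step)
            if entry is None:
                break  # 2022.3 style: stop, nothing else is appended
            result.append(entry)
        return result

    def tracepath_with_do_trace(steps):
        result, do_trace = [], True
        for step in steps:
            if not do_trace:
                break
            entry = match_step(step)
            if entry is None:
                # master style: the non-matching step is kept, tagged "incomplete"
                result.append({**step, "type": "incomplete", "out": None})
                do_trace = False
            else:
                result.append(entry)
        return result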

I don't believe we should return the trace as incomplete from sdntrace_cp's perspective, because it wouldn't know the exact semantics of a particular flow at that point. Only the consumer of a trace will be able to answer whether the trace is complete or not. I would say that, unless an error occurs during the trace processing routine, traces should finish in the type=last condition.

Please let me know what you guys think.

viniarck commented 1 year ago

I would say that unless an error occurs during the trace processing routine

@italovalcy, the "incomplete" type captures the semantics of when there isn't a flow match, so in that case sdntrace_cp is trying to help callers by letting them know that it didn't match anything at a certain point. Here's the summary of the new states in the changelog and what they mean:

Furthermore, we changed the behavior of sdntrace a bit in some scenarios, replacing the break statement with a do_trace=False (which ultimately impacts the last result being added to the trace result and leads to wrong trace processing)

Italo, thanks for looking into it. Please let us know if you've found a case where it results in a wrong trace or if you encountered any other inconsistencies, just so it can be fixed. Cc'ing @Ktmi (since he's already been helping with this issue) and @gretelliz to the discussion. @Ktmi confirmed in yesterday's meeting that he'll help out with this patch, so thanks for confirming the 50k flow priority.

Ktmi commented 1 year ago

Here is the PR for raising the default flow priority.

viniarck commented 1 year ago

Here is the PR for raising the default flow priority.

Great, @Ktmi, if you could also send a backport targeting base/2022.3.2 too and bumping 2022.3.2, that'd be great.

italovalcy commented 1 year ago

Hi Vinicius,

@italovalcy, the "incomplete" type captures the semantics of when there isn't a flow match, so in that case sdntrace_cp is trying to help callers by letting them know that it didn't match anything at a certain point. Here's the summary of the new states in the changelog and what they mean:

  • Update tracepath to support two new trace types: loop and incomplete. Both represent a failure, and the incomplete type also replaces the empty list that was returned when a flow or port isn't matched. The other three types are starting for the first trace_step, intermediary for subsequent trace_steps (previously trace), and last for the final trace_step, representing a successful (terminating) trace.

Furthermore, we changed the behavior of sdntrace a bit in some scenarios, replacing the break statement with a do_trace=False (which ultimately impacts the last result being added to the trace result and leads to wrong trace processing)

Italo, thanks for looking into it. Please let us know if you've found a case where it results in a wrong trace or if you encountered any other inconsistencies, just so it can be fixed. Cc'ing @Ktmi (since he's already been helping with this issue) and @gretelliz to the discussion. @Ktmi confirmed in yesterday's meeting that he'll help out with this patch, so thanks for confirming the 50k flow priority.

Yes, I believe both cases mentioned above are related to the trace error that we would see using the latest Kytos version (after applying the Raise priority PR):

  1. Run kytos from master branch (apply the PR above) and Mininet with linear 4
  2. Create a few EVCs:
    curl -H 'Content-type: application/json' -X POST http://127.0.0.1:8181/api/kytos/mef_eline/v2/evc/ -d '{"name": "evc-vlan-999", "dynamic_backup_path": true,  "uni_a": {"tag": {"value": 999, "tag_type": 1}, "interface_id": "00:00:00:00:00:00:00:02:1"}, "uni_z": {"tag": {"value": 999, "tag_type": 1}, "interface_id": "00:00:00:00:00:00:00:04:1"}}'
    curl -H 'Content-type: application/json' -X POST http://127.0.0.1:8181/api/kytos/mef_eline/v2/evc/ -d '{"name": "pw_s2", "dynamic_backup_path": true, "uni_a": {"interface_id": "00:00:00:00:00:00:00:02:1"}, "uni_z": { "interface_id": "00:00:00:00:00:00:00:01:1"}}'
    curl -H 'Content-type: application/json' -X POST http://127.0.0.1:8181/api/kytos/mef_eline/v2/evc/ -d '{"name": "pw_s3", "dynamic_backup_path": true, "uni_a": {"interface_id": "00:00:00:00:00:00:00:03:2"}, "uni_z": { "interface_id": "00:00:00:00:00:00:00:03:3"}}'
  3. Run the sdntrace for the 3rd EVC:
    
    kytos $> controller.napps[('kytos', 'mef_eline')].circuits
    Out[4]:
    {'18ee43eb56af46': EVC(18ee43eb56af46, evc-vlan-999),
    '551f1386f9054c': EVC(551f1386f9054c, pw_s2),
    '7e9d09004fe745': EVC(7e9d09004fe745, pw_s3)}
    kytos $> cid = '7e9d09004fe745'

kytos $> from napps.kytos.mef_eline.models import EVCDeploy

kytos $> EVCDeploy.check_list_traces([controller.napps[('kytos', 'mef_eline')].circuits[cid]])
2023-07-19 18:10:01,819 - INFO [uvicorn.access] (MainThread) 127.0.0.1:34720 - "GET /api/kytos/flow_manager/v2/stored_flows/?state=installed HTTP/1.1" 200
2023-07-19 18:10:01,825 - INFO [uvicorn.access] (MainThread) 127.0.0.1:34706 - "PUT /api/amlight/sdntrace_cp/v1/traces HTTP/1.1" 200
2023-07-19 18:10:01,828 - WARNING [kytos.napps.kytos/mef_eline] (ThreadPoolExecutor-1_0) Invalid trace from trace_a: [{'dpid': '00:00:00:00:00:00:00:03', 'port': 2, 'time': '2023-07-19 18:10:01.824114', 'type': 'starting'}, {'dpid': '00:00:00:00:00:00:00:04', 'port': 2, 'time': '2023-07-19 18:10:01.824221', 'type': 'incomplete', 'out': None}]
Out[3]: {'7e9d09004fe745': False}



In the case above, both changes end up impacting the trace: 1) removing the break statement had an impact by adding an additional step to the trace result, and ultimately the correct out ports were not presented; 2) the "incomplete" type, along with mef_eline's validation, made mef_eline believe the trace was invalid. I mentioned the semantics because I believe only mef_eline can interpret the result and say whether it makes sense or not (whether it is incomplete or not). My point is, having no flow matching a specific traffic should be okay; it depends entirely on who requested the flows and what that person intends to do. Since sdntrace does not have that context, it might be wrong for it to say that the trace was incomplete (the changelog even says "Both represent a failure"). However, if we decide to keep using that state, mef_eline should ignore that type.
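To illustrate the point about who owns the semantics, a hypothetical consumer-side check (not mef_eline's actual check_list_traces) could decide completeness from the EVC's own context rather than from a trace "type":

    # Hypothetical illustration only: the consumer (e.g. mef_eline) knows where the
    # trace is supposed to end, so it can judge completeness itself.
    def trace_ends_at(trace_result, expected_dpid, expected_out_port):
        if not trace_result:
            return False
        last_step = trace_result[-1]
        out = last_step.get("out") or {}
        return last_step.get("dpid") == expected_dpid and out.get("port") == expected_out_port

    # For the pw_s3 EVC above (UNI_A 00:00:00:00:00:00:00:03:2, UNI_Z 00:00:00:00:00:00:00:03:3),
    # a trace starting at s3 port 2 would be complete only if its last step leaves s3 on port 3,
    # regardless of whether any step was labeled "incomplete".
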
italovalcy commented 1 year ago

@viniarck and @Ktmi even after applying the priority change to production, we still have a lot of traces that failed incorrectly. As far as I could trace, all failures were related to this PR on flow_manager that was not backported to 2022.3: https://github.com/kytos-ng/flow_manager/pull/140

Without the flows being sorted before sdntrace processes them, sdntrace can process the EPL flow instead of the EVPL one, and then it will report a failed result.

viniarck commented 1 year ago

@italovalcy, I appreciate you looking into this and your feedback. I just had a chance to run this linear,4 scenario. Here's a quick summary:

2023.1 (master):

  1. Indeed, "type": "incomplete" changed the behavior in v1: if a control-plane lookup is performed, it's also included, which has the effect of an additional step compared to 2022.3. We need to make a decision: if you think it's clearer to never include a non-matched lookup, then we can adapt accordingly and follow the 2022.3 behavior. To me, looking only from sdntrace_cp's perspective (without any client context), when performing a control-plane lookup the explicit "incomplete" still makes sense (maybe the word is too strong; it could be "no_match" or something). But, like you said, mef_eline was also relying on the fact that a valid trace path had to be +1 in length compared to the EVC's current path, which is a great property to keep, so getting rid of "incomplete" entirely might be worth it if we don't see much value in this explicit "no_match". Let me know what you think and what you prefer; indeed, the safest route would be to get rid of it, since "incomplete" (or "no_match") is just a convenience to be explicit about a control-plane lookup, but for sure we can make it official that a no-match will never show up like it used to.
  2. Let's update the changelog regarding "type" and make sure whatever states are left are clearly explained, covering what they mean from the control-plane perspective regardless of a client's point of view; if we remove "incomplete", let's also remove it there. Same thing for the openapi.yml spec.
  3. Let's include an e2e test to cover this scenario; on our e2e suite it doesn't look like we have EVCs terminating on NNIs, so let's make sure it won't regress again, and also make sure it's representative of how it's used in prod.
  4. Review on mef_eline whether we'll still be able to partly rely on "last" or not (see the question below regarding "last").

2022.3

  1. Needs https://github.com/kytos-ng/flow_manager/pull/140 backported on flow_manager (this is the most urgent part that's impacting prod)

  2. Question: to keep "last" meaning the last switch that had an outgoing port, should the "type" here be "last" too instead of "starting"? It wouldn't impact mef_eline, since it's relying on the length of the result list and then comparing the endpoints, but it'd be great to keep it consistent to avoid future misunderstandings (and also to explain in the changelog and docs what "last" will actually represent):

❯ curl -H 'Content-type: application/json' -X PUT http://127.0.0.1:8181/api/amlight/sdntrace_cp/trace -d '{"trace": {"switch": {"dpid": "00:00:00:00:00:00:00:03", "in_port": 2}}}' | jq
{
  "result": [
    {
      "dpid": "00:00:00:00:00:00:00:03",
      "out": {
        "port": 3
      },
      "port": 2,
      "time": "2023-07-20 11:48:00.579265",
      "type": "starting"
    }
  ]
}
gretelliz commented 1 year ago

Hello everyone,

@viniarck, in fact, I think type=starting impacts mef_eline w.r.t. 2022.3 point 2 (Question). Note that we have an early return that checks for type != last.

Ktmi commented 1 year ago

Found another weird bug with priority that could mess up traces. Having multiple nearly identical flows results in the flows not being properly sorted by priority when getting them through flow_manager with /v2/flows. Here is an example of the results:

{
  "00:00:00:00:00:00:00:01": {
    "flows": [
      {
        "switch": "00:00:00:00:00:00:00:01",
        "table_id": 0,
        "match": {
          "dl_vlan": 3799,
          "dl_type": 35020
        },
        "priority": 50000,
        "idle_timeout": 0,
        "hard_timeout": 0,
        "cookie": 12321848580485677000,
        "id": "2ee616117c8ce044e9f8f1f4f8e266ad",
        "stats": {
          "byte_count": 1806,
          "duration_sec": 211,
          "duration_nsec": 705000000,
          "packet_count": 43
        },
        "cookie_mask": 0,
        "instructions": [
          {
            "instruction_type": "apply_actions",
            "actions": [
              {
                "port": 4294967293,
                "action_type": "output"
              }
            ]
          }
        ]
      },
      {
        "switch": "00:00:00:00:00:00:00:01",
        "table_id": 0,
        "match": {
          "dl_vlan": 3799,
          "dl_type": 35020
        },
        "priority": 50001,
        "idle_timeout": 0,
        "hard_timeout": 0,
        "cookie": 12321848580485677000,
        "id": "64e267f46be47603597f1dce49e70005",
        "stats": {
          "byte_count": 882,
          "duration_sec": 64,
          "duration_nsec": 20000000,
          "packet_count": 21
        },
        "cookie_mask": 0,
        "instructions": [
          {
            "instruction_type": "apply_actions",
            "actions": [
              {
                "port": 4294967293,
                "action_type": "output"
              }
            ]
          }
        ]
      },
      {
        "switch": "00:00:00:00:00:00:00:01",
        "table_id": 0,
        "match": {
          "dl_src": "ee:ee:ee:ee:ee:02"
        },
        "priority": 50000,
        "idle_timeout": 0,
        "hard_timeout": 0,
        "cookie": 12393906174523605000,
        "id": "1def4287eab3ab24897296657de64520",
        "stats": {
          "byte_count": 0,
          "duration_sec": 64,
          "duration_nsec": 59000000,
          "packet_count": 0
        },
        "cookie_mask": 0,
        "instructions": [
          {
            "instruction_type": "apply_actions",
            "actions": [
              {
                "port": 4294967293,
                "action_type": "output"
              }
            ]
          }
        ]
      }
    ]
  }
}
viniarck commented 1 year ago

Found another weird bug with priority that could mess up traces. Having multiple nearly identical flows results in the flows not being properly sorted by priority when getting them through flow_manager with /v2/flows. Here is an example of the results:

@Ktmi, GET /v2/flows, which is built from flow stats, doesn't ensure sorting, even though the contents might come in order in certain cases (OFPMP_FLOW flow stats contents don't guarantee ordering by priority). Also, sdntrace_cp relies on GET /v2/stored_flows, which provides sorting. FYI, since GET /v2/flows is backed by flow stats contents, it's also susceptible to having temporarily different data depending on when flow stats are received, which is why clients and sdntrace_cp rely on stored_flows instead, to simplify (and to avoid potentially false-positive data that used to cause many difficulties for clients).

If you need to immediately know which flows are actually installed on the switch, you can use ovs-ofctl -O OpenFlow13 dump-flows <switch_name> (if you're running an OvS instance), but eventually GET /v2/flows will also converge, and GET /v2/stored_flows is the source of truth as far as flow management is concerned. GET /v2/stored_flows also provides a state: "installed" when a flow has actually been confirmed.

viniarck commented 1 year ago

In the future, we could provide sorting for GET /v2/flows in the same way that we do for GET /v2/stored_flows; we could map a priority_low enhancement for this, and then also rely on duration_sec to sort ascending by when the flow was last changed.
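
A rough sketch of that idea, assuming the /v2/flows payload shown earlier (duration_sec grows with flow age, so sorting it descending mirrors stored_flows' ascending updated_at):

    # Sketch of a possible sort for GET /v2/flows results (enhancement idea, not implemented):
    # highest priority first, then oldest flow first (largest duration_sec).
    def sort_flows_by_priority(flows):
        return sorted(
            flows,
            key=lambda f: (-f["priority"], -f["stats"]["duration_sec"]),
        )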