Closed viniarck closed 2 years ago
@viniarck thank you very much for submitting this PR. It looks good to me.
I've executed the tests mentioned in issue kytos-ng/kytos#224 using a lab with 5 Noviflow switches, and I was able to reproduce the same issue observed before. At that point, I used the hello interval of 15s in the Noviflow switches and executed the same stress test as Vinicius, except that I used 150 req/sec for 60 seconds. Without this PR, the switches started disconnecting a lot during the tests. Even after the stress test finished, the switches still kept disconnecting. Once I applied the changes from this PR, only 4 disconnections were observed (probably due to the problem Vinicius mentioned in the Flask REST endpoints processing overhead).
I've also executed the end-to-end tests using this branch, and all tests passed without surprises:
166 passed, 19 xfailed, 5 xpassed, 560 warnings in 9988.43s (2:46:28)
Finally, I tested the creation of 800 flows on each one of the five switches and left the consistency check routine run for 2 hours without surprises (no false positives - invalid inconsistencies triggered - and no false negatives - alien and missing flows were correctly detected). I will keep the test running for a longer period, and if something happens, it will be reported as an issue.
@italovalcy much appreciated, great to hear that it did well under these circumstances. Yes, as the core move towards more asynchronous primitives it should improve, let's see how that will evolve with Flask 2.0 and re-run these tests in the future too. Sure thing, looking forward to hearing more about this long running results, it's great to also have them.
I'll go ahead and merge this PR soon then.
Fixes #37, https://github.com/kytos-ng/kytos/issues/224, and https://github.com/kytos-ng/kytos/issues/172
Release notes
async
methodsasync
method in linekytos/of_core.flow_stats.received
to also include the replied flowsmsg_in
andmsg_out
now have priority based on their control plane importance to avoid starvationThis PR implement the changes that on https://github.com/kytos-ng/kytos/issues/172 solution a):
Results
asyncio.Task
is lighter weight and explicit context switching contributed to less time spend waiting for locks then it contributed to these gains.However, notice that for high rates of API requests https://github.com/kytos-ng/kytos/issues/225 it can lead to instability, which can also impact on the queues so even with the priority it can suffer a bit if users go over the requests limit to that point, so this part hasn't been fixed but as soon as we have Flask 2.0 and make more use of async request handlers it should overall improve with that part too (gevent is also an option, but asyncio is more aligned with the platform architecture/goals and better supported upstream, so when the time comes again this will be explored one more time), also adding some rate limiting with Flask should also help a bit to avoid getting to that point.