kytos-ng / kytos

Kytos SDN Platform. Kytos is designed to be easy to install, use, develop and share Network Apps (NApps).
https://kytos-ng.github.io/
MIT License
3 stars 7 forks source link

[2022.3.3] feat: set default ``maxTimeMS`` when creating an index; unhandled exc on `setup()` will exit `kytosd` #416

Closed viniarck closed 1 year ago

viniarck commented 1 year ago

This is a backport from PR #414 to 2022.3.3.

Local tests

I simulated an upgrade from 2022.3.1 to master (future 2023.2), without running the DB scripts, and with two nodes down out of five (still has majority), same scenario that @italovalcy has reported, and it retried and eventually exited after 1m30 more or less (30 secs x 3 retries), notice that on console you can see the reason of the error {'code': 50, 'codeName': 'MaxTimeMSExpired', 'errmsg ': 'operation exceeded time limit', 'errInfo': {'writeConcern': {'w': 'majority', 'wtimeout': 0, 'provenance': 'clientSupplied'}}} (it's a bit lower level-ish from our dependency pymongo but this was also part of intention, just so we don't hide these details if any future issue needs to be analyzed or fixed):

2023-09-26 15:42:11,176 - INFO [kytos.core.controller] (MainThread) Loading NApp kytos/flow_manager
2023-09-26 15:42:41,350 - WARNING [kytos.core.retry] (MainThread) Retry #1 for bootstrap_index, args: (<class 'kytos.core.db.Mongo'>, 'flows', [('flow.priority', -1)]), kwargs: {}, secon
ds since start: 30.04
2023-09-26 15:43:12,073 - WARNING [kytos.core.retry] (MainThread) Retry #2 for bootstrap_index, args: (<class 'kytos.core.db.Mongo'>, 'flows', [('flow.priority', -1)]), kwargs: {}, secon
ds since start: 60.76
Kytos couldn't start because of KytosNAppSetupException: NApp kytos/flow_manager exception operation exceeded time limit, full error: {'code': 50, 'codeName': 'MaxTimeMSExpired', 'errmsg
': 'operation exceeded time limit', 'errInfo': {'writeConcern': {'w': 'majority', 'wtimeout': 0, 'provenance': 'clientSupplied'}}}  Traceback (most recent call last):
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 863, in load_napp
    napp = napp_module.Main(controller=self)
  File "/home/viniarck/repos/kytos/kytos/core/napps/base.py", line 194, in __init__
    self.setup()
  File "/home/viniarck/repos/napps/napps/kytos/flow_manager/main.py", line 88, in setup
    self.flow_controller.bootstrap_indexes()
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 324, in wrapped_f
    return self(f, *args, **kw)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 349, in iter
    return fut.result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/home/viniarck/repos/kytos/kytos/core/retry.py", line 25, in decorated
    return func(*args, **kwargs)
  File "/home/viniarck/repos/napps/napps/kytos/flow_manager/controllers/__init__.py", line 73, in bootstrap_indexes
    if self.mongo.bootstrap_index(collection, keys, **kwargs):
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 324, in wrapped_f
    return self(f, *args, **kw)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 360, in iter
    raise retry_exc.reraise()
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 193, in reraise
    raise self.last_attempt.result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/home/viniarck/repos/kytos/kytos/core/db.py", line 126, in bootstrap_index
    return db[collection].create_index(keys, **kwargs)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/pymongo/collection.py", line 2023, in create_index
    return self.__create_indexes([index], session, **cmd_options)[0]
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/pymongo/collection.py", line 1900, in __create_indexes
    self._command(
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/pymongo/collection.py", line 272, in _command
    return sock_info.command(
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/pymongo/pool.py", line 743, in command
    return command(
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/pymongo/network.py", line 160, in command
    helpers._check_command_response(
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/pymongo/helpers.py", line 137, in _check_command_response
    _raise_write_concern_error(_error)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/pymongo/helpers.py", line 196, in _raise_write_concern_error
    raise WriteConcernError(error.get("errmsg"), error.get("code"), error)
pymongo.errors.WriteConcernError: operation exceeded time limit, full error: {'code': 50, 'codeName': 'MaxTimeMSExpired', 'errmsg': 'operation exceeded time limit', 'errInfo': {'writeCon
cern': {'w': 'majority', 'wtimeout': 0, 'provenance': 'clientSupplied'}}}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 265, in start
    self.start_controller()
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 359, in start_controller
    self.load_napps()
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 898, in load_napps
    self.load_napp(napp.username, napp.name)
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 866, in load_napp
    raise KytosNAppSetupException(msg) from exc
kytos.core.exceptions.KytosNAppSetupException: KytosNAppSetupException: NApp kytos/flow_manager exception operation exceeded time limit, full error: {'code': 50, 'codeName': 'MaxTimeMSEx
pired', 'errmsg': 'operation exceeded time limit', 'errInfo': {'writeConcern': {'w': 'majority', 'wtimeout': 0, 'provenance': 'clientSupplied'}}} 

2023-09-26 15:43:42,317 - ERROR [kytos.core.controller] (MainThread) 1
2023-09-26 15:43:42,317 - INFO [kytos.core.controller] (MainThread) Shutting down Kytos...
Task was destroyed but it is pending!
task: <Task pending name='Task-1' coro=<start_shell_async() running at /home/viniarck/repos/kytos/kytos/core/kytosd.py:120>>
sys:1: RuntimeWarning: coroutine 'start_shell_async' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

~/repos/kytos feat/load_napps_or_shutdown*  1m 32s
❯ 
viniarck commented 1 year ago

btw, thanks @italovalcy, the reproduction steps of the original issue were super helpful.