10gen / mongo-orchestration

Apache License 2.0

Detect when mongod or mongos quits early #209

Closed ajdavis closed 1 year ago

ajdavis commented 8 years ago

If you pass an unrecognized config option to a mongod or mongos, or if the disk is already full when they start, or in various other misconfigurations, the mongod or mongos processes that Orchestration starts will exit quickly with a nonzero exit code.

Orchestration should detect this, clean up, and return a useful error to the caller, rather than waiting 120 seconds to decide that starting the cluster has generically "timed out".
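
A minimal sketch of the detection asked for here, assuming the launcher holds a `subprocess.Popen` handle for the child. The helper name `wait_for_port` and its parameters are hypothetical and are not taken from mongo-orchestration's code. It checks whether the child is still alive on every polling iteration, so an early exit surfaces at once instead of after the full timeout.

```python
import socket
import subprocess
import sys
import time


def wait_for_port(proc, port, timeout, host="127.0.0.1", interval=0.1):
    """Return True once `port` accepts connections, False on timeout.

    Raises OSError as soon as the child process is found to have exited.
    (Hypothetical helper for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # poll() returns None while the child runs, otherwise its exit code.
        returncode = proc.poll()
        if returncode is not None:
            raise OSError("Process exited early with code %d" % returncode)
        try:
            # A successful TCP connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet: wait briefly, then check the child again.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # Stand-in for a misconfigured mongod: a child that exits with code 2.
    child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(2)"])
    try:
        wait_for_port(child, 27017, timeout=5)
    except OSError as exc:
        print("startup failed:", exc)
```

Including the exit code in the message gives the caller something more specific than a generic timeout. A fuller version might also attach the tail of the child's log output.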

ShaneHarvey commented 1 year ago

Note we should also address the stdout/stderr issues described in https://github.com/10gen/mongo-orchestration/issues/272 when fixing this issue.

ShaneHarvey commented 1 year ago

This issue is fixed. The current behavior is to raise an error immediately. For example, when passing the "invalidOption" option:

$ python3 ~/launch.py repl single
{'members': [{'procParams': {'bind_ip': '127.0.0.1,::1',
                             'invalidOption': True,
                             'ipv6': True,
                             'logappend': True,
                             'port': 27017,
                             'setParameter': {'enableTestCommands': 1}}}]}
Traceback (most recent call last):
  File "/Users/shane/launch.py", line 333, in <module>
    cluster.start()
  File "/Users/shane/launch.py", line 209, in start
    return self._init_from_response(self._make_post_request())
  File "/Users/shane/launch.py", line 99, in _make_post_request
    raise RuntimeError(
RuntimeError: Error sending POST to cluster: Traceback (most recent call last):
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/servers.py", line 338, in start
    self.proc, self.hostname = process.mprocess(
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/process.py", line 233, in mprocess
    if wait_for(proc, port, timeout):
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/process.py", line 159, in wait_for
    raise OSError("Process started, but died immediately")
OSError: Process started, but died immediately

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/apps/__init__.py", line 66, in wrap
    return f(*arg, **kwd)
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/apps/replica_sets.py", line 80, in rs_create
    result = _rs_create(data)
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/apps/replica_sets.py", line 37, in _rs_create
    rs_id = ReplicaSets().create(params)
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/replica_sets.py", line 648, in create
    repl = ReplicaSet(rs_params)
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/replica_sets.py", line 75, in __init__
    config_members = [f.result() for f in futures]
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/replica_sets.py", line 75, in <listcomp>
    config_members = [f.result() for f in futures]
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/replica_sets.py", line 325, in member_create
    server_id = self._servers.create(
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/servers.py", line 522, in create
    server.start(timeout)
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/servers.py", line 364, in start
    reraise(TimeoutError,
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/compat.py", line 21, in reraise
    raise exctype(str(value)).with_traceback(trace)
mongo_orchestration.errors.TimeoutError: Could not start Server. Please check the mongo-orchestration log in /Users/shane/mongo-orchestration/server.log for more details.

And the server log contains the error:

2023-03-14 14:33:11,910 [DEBUG] mongo_orchestration.apps:63 - rs_create((), {})
2023-03-14 14:33:11,910 [DEBUG] mongo_orchestration.apps.replica_sets:77 - rs_create()
2023-03-14 14:33:11,910 [DEBUG] mongo_orchestration.servers:167 - Server.__init__(mongod, {'replSet': 'eb8cc0f8-061f-48a1-b23d-1ce0a31c2a8f', 'logappend': True, 'ipv6': True, 'bind_ip': '127.0.0.1,::1', 'invalidOption': True, 'setParameter': {'enableTestCommands': 1}, 'port': 27017}, {}, None, None, None)
2023-03-14 14:33:11,910 [DEBUG] mongo_orchestration.servers:66 - Creating log file for mongod: /var/folders/88/rvpt81v95lxcw1t9_7089vkr0000gp/T/mongod-kjr_sj_c/mongod.log
2023-03-14 14:33:11,910 [DEBUG] mongo_orchestration.servers:223 - ('mongod', '--version')
2023-03-14 14:33:12,085 [DEBUG] mongo_orchestration.process:203 - mprocess(name='mongod', config_path='/var/folders/88/rvpt81v95lxcw1t9_7089vkr0000gp/T/mongo-59ur1_s0', port=27017, timeout=300)
2023-03-14 14:33:12,086 [DEBUG] mongo_orchestration.process:217 - execute process: mongod --config /var/folders/88/rvpt81v95lxcw1t9_7089vkr0000gp/T/mongo-59ur1_s0 --port 27017
2023-03-14 14:33:12,088 [DEBUG] mongo_orchestration.process:153 - wait for 27017
Error parsing INI config file: unrecognised option 'invalidOption'
try 'mongod --help' for more information
2023-03-14 14:33:12,304 [DEBUG] mongo_orchestration.process:158 - process is not alive
2023-03-14 14:33:12,304 [ERROR] mongo_orchestration.servers:362 - Could not start Server
Traceback (most recent call last):
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/servers.py", line 338, in start
    self.proc, self.hostname = process.mprocess(
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/process.py", line 233, in mprocess
    if wait_for(proc, port, timeout):
  File "/Users/shane/git/mongo-orchestration/mongo_orchestration/process.py", line 159, in wait_for
    raise OSError("Process started, but died immediately")
OSError: Process started, but died immediately
....

ShaneHarvey commented 1 year ago

This was fixed by #280.