10gen / mongo-orchestration

Apache License 2.0

Improve error reporting #198

Closed bjori closed 8 years ago

bjori commented 9 years ago

The mongoc and C++ drivers have relatively frequent issues with mongo-orchestration that we are unable to debug.

See for example: https://evergreen.mongodb.com/task_log_raw/mongo_c_driver_ubuntu_1204_64_integration_test_2.6_sharded_7c5dbed32ef4ca2fe9960193e77cda31a9faa239_15_09_17_20_32_42/0?type=T#L57

Improving the error reporting would dramatically improve the user experience, not to mention make the issue easier to debug and understand. Currently MO just dumps a stack trace as a one-liner, which doesn't help anyone. After scouring the long line for anything useful, we see that an exception was thrown: raise TimeoutError(errno.ETIMEDOUT, message)\n", "TimeoutError: 110\n"

What timed out? And why?
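For anyone reading such a one-line dump in the meantime, here is a minimal sketch for making it legible. It assumes the line is a JSON array of strings, which the `\n", "` separators in the fragment above suggest but do not prove. The function name `format_traceback_line` is hypothetical and not part of MO.

```python
import json


def format_traceback_line(raw):
    """Turn a one-line JSON array of traceback strings into readable text.

    Assumption: `raw` is a JSON array of strings, each already ending in
    a newline, as the quoted fragment suggests.
    """
    frames = json.loads(raw)
    return "".join(frames)


if __name__ == "__main__":
    sample = json.dumps([
        "    raise TimeoutError(errno.ETIMEDOUT, message)\n",
        "TimeoutError: 110\n",
    ])
    print(format_traceback_line(sample), end="")
```

This only improves readability. It does not recover the missing detail about which process timed out.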

behackett commented 9 years ago

It looks like this was raised here:

https://github.com/10gen/mongo-orchestration/blob/736f48aef5dca7006937a2fd502b543c132389a2/mongo_orchestration/process.py#L216-L223

It looks like mongo-orchestration provides all the needed detail in debug logging. Evergreen needs to spawn MO with debug-level logging so we can track down the issues. In this case it looks like mongod/s didn't respond within 180 seconds.

llvtt commented 9 years ago

All log output goes to a log file, however. You'll need access to the MO log file to see more detail about the Exception. FWIW, a TimeoutError means that a mongo[ds] process failed to start in a reasonable amount of time. The reason why this happened can sometimes be learned from the stderr output of the process, and sometimes it's only visible in the MongoDB log file, so there's a limit to how helpful MO can be here. However, I think MO could at least forward mongo[ds]'s stderr output to the log and stderr.
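A rough sketch of what forwarding a child's stderr to both the log and stderr could look like, using only the Python standard library. This is not MO's actual code: the names `forward_stderr` and `run_and_forward` and the logger name are hypothetical, and a real version would watch a long-running mongo[ds] rather than wait for it to exit.

```python
import logging
import subprocess
import sys
import threading

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("stderr_forwarding_sketch")


def forward_stderr(proc, name, sink=sys.stderr):
    """Copy each stderr line of `proc` to the logger and to `sink`.

    Returns the pump thread and the list that collects the lines.
    """
    lines = []

    def pump():
        for raw in proc.stderr:
            line = raw.rstrip("\n")
            lines.append(line)
            logger.error("[%s stderr] %s", name, line)
            print("[%s stderr] %s" % (name, line), file=sink)

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return thread, lines


def run_and_forward(cmd, name):
    """Start `cmd`, forward its stderr, and wait for it to exit."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    thread, lines = forward_stderr(proc, name)
    proc.wait()
    thread.join()
    return proc.returncode, lines
```

Reading stderr on a separate thread keeps the pipe drained, so the child cannot block on a full pipe buffer while the parent is waiting for it to come up.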

bjori commented 9 years ago

MO spawns the mongo[d|s] and knows where it logs to, right? I think it's easier to include that log in the error than it is for an arbitrary process to do the detective work of figuring out which mongod it was and where its logs are, since, as in this case, MO was asked to bring up a full sharded cluster.
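To illustrate the idea, a minimal sketch of attaching the tail of the server log to the timeout error. All names here (`tail`, `timeout_message`, `raise_timeout`) are hypothetical, and it uses Python 3's built-in `TimeoutError` rather than whatever class MO defines.

```python
import errno
from collections import deque


def tail(path, max_lines=20):
    """Return the last `max_lines` lines of the file at `path`."""
    try:
        with open(path, "r", errors="replace") as handle:
            stripped = (line.rstrip("\n") for line in handle)
            return list(deque(stripped, maxlen=max_lines))
    except OSError as exc:
        # A missing log is itself useful information, so report it.
        return ["<could not read %s: %s>" % (path, exc)]


def timeout_message(host, log_path, waited_seconds, max_lines=20):
    """Build an error message that answers: what timed out, and why?"""
    header = "%s did not respond within %d seconds; last lines of %s:" % (
        host, waited_seconds, log_path)
    return "\n".join([header] + tail(log_path, max_lines))


def raise_timeout(host, log_path, waited_seconds):
    """Raise a TimeoutError that carries the log tail in its message."""
    raise TimeoutError(
        errno.ETIMEDOUT,
        timeout_message(host, log_path, waited_seconds))
```

With something like this, the error for a sharded cluster would name the specific host and log file, so the caller would not have to work out which member failed.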

llvtt commented 8 years ago

I think this was resolved as part of #199 (cat the failed server log to the MO log). Feel free to reopen if there's more to be done here.