dorbel-tech / dorbel-shared

dorbel shared dependencies used in dorbel-app

Investigate why process wasn't restarted which caused production downtime #64

Closed virtser closed 7 years ago

virtser commented 7 years ago

You can review the dotenv-related change I made in shared: https://github.com/dorbel-tech/dorbel-shared/pull/63/files

I checked the env vars in the container of the faulty front-gateway instance:

bash-4.3# env | grep NODE_ENV
NODE_ENV=production

The error that caused the downtime:

{
  "application": "front-gateway",
  "name": "serverRunner.js",
  "hostname": "26c541879122",
  "pid": 36,
  "level": 50,
  "err": {
    "message": "location is not defined",
    "name": "ReferenceError",
    "stack": "ReferenceError: location is not defined\n    at AuthStore.logout (/home/nodejs/app/src/stores/AuthStore.js:62:5)\n    at Timeout.<anonymous> (/home/nodejs/app/src/stores/AuthStore.js:40:50)\n    at Timeout.wrappedCallback (/home/nodejs/app/node_modules/newrelic/lib/transaction/tracer/index.js:360:23)\n    at Timeout.wrapped (/home/nodejs/app/node_modules/newrelic/lib/transaction/tracer/index.js:183:28)\n    at Timeout.wrappedCallback (/home/nodejs/app/node_modules/newrelic/lib/transaction/tracer/index.js:451:66)\n    at Timeout.wrapped (/home/nodejs/app/node_modules/newrelic/lib/transaction/tracer/index.js:183:28)\n    at ontimeout (timers.js:365:14)\n    at tryOnTimeout (timers.js:237:5)\n    at Timer.listOnTimeout (timers.js:207:5)"
  },
  "msg": "Uncaught exception in process, exiting",
  "time": "2017-04-02T18:17:07.039Z",
  "v": 0
}
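
The stack trace above suggests that `AuthStore.logout` referenced the browser-only `location` global from a timer while running under Node, where that global is not defined. A minimal sketch of one possible guard follows. The class body, the `token` field, and the `location.reload()` call are assumptions for illustration only, since the real `AuthStore.js` is not shown in this issue.

```javascript
// Hypothetical sketch: not the actual AuthStore.js from dorbel-app.
class AuthStore {
  constructor() {
    // Assumed field, used here only to make the effect of logout visible.
    this.token = 'some-token';
  }

  logout() {
    this.token = null;
    // `location` exists in browsers but not in Node. Referencing it bare
    // during server-side execution throws a ReferenceError, which matches
    // the "location is not defined" error in the log above.
    // `typeof` on an undeclared identifier does not throw, so this check
    // is safe in both environments.
    if (typeof location !== 'undefined') {
      location.reload(); // assumed browser-side behaviour
    }
  }
}

// Under Node this completes without throwing.
const store = new AuthStore();
store.logout();
```

With a guard like this, a logout timer that fires on the server would clear state without raising an uncaught exception. Whether the timer should be scheduled on the server at all is a separate question.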
lxngxr commented 7 years ago

I successfully crashed the server in the test environment. However, it recovered in ~30 seconds.

My changes to trigger the crash are available in https://github.com/dorbel-tech/dorbel-app/tree/temp/test-throng. In addition, the NODE_ENV var in the AWS front-gateway-test environment was changed to staging.

avnersorek commented 7 years ago

I could not reproduce the failure to restart. The original failing version was deployed to http://front-gateway-prod-green.eu-west-1.elasticbeanstalk.com and I made a couple of logins. Let's see what happens when the problem occurs again in a couple of hours.

avnersorek commented 7 years ago

The prod-green instance did fail after a couple of hours due to the location problem, but it restarted on its own.