nodules / luster-guard

Restart workers on events from file system.
MIT License
3 stars 2 forks source link

Hanging on restart with babel-node #11

Closed dfilatov closed 8 years ago

dfilatov commented 8 years ago

I'm trying to use babel-node + luster + luster-guard via babel-node -- luster luster.conf.js. Unfortunately luster is hanging while restarting with message restarting workers....

narqo commented 8 years ago

It works for me with this luster.conf.json:

var path = require('path');
module.exports = {
    app: path.resolve(__dirname, '../../server/boot.js'),
    workers: 2,
    control: {
        forkTimeout: 5000,
        stopTimeout: 7000,
        exitThreshold: 15000,
        allowedSequentialDeaths: 10
    },
    server: {
        port: 20080
    },
    extensions: {
        'luster-guard': {
            patterns: [
                'src/configs/current/*.js',
                'src/server/**/*.js',
            ]
        }
    },
};

I believe this control.stopTimeout option (see https://github.com/nodules/luster#annotated-example-of-configuration) is the key factor. The worker spawned by babel-node or with babel-register hangs during disconnect-ion, so the process with updated code doesn't start.

Another problem is that restarting process under the babel-node is very slow.

dfilatov commented 8 years ago

@narqo Thanks! It works suitable with stopTimeout option.

dfilatov commented 8 years ago

If another restart occurs until previous one isn't finished I've got an error:

events.js:85
     throw er; // Unhandled 'error' event
           ^
Error: channel closed
   at process.target.send (child_process.js:414:26)
   at sendHelper (cluster.js:676:8)
   at send (cluster.js:663:5)
   at Server.<anonymous> (cluster.js:570:7)
   at Server.g (events.js:199:16)
   at Server.emit (events.js:129:20)
   at net.js:1171:12
   at process._tickDomainCallback (node.js:381:11)
kaero commented 8 years ago

@dfilatov Can you share a minimal reproducible example?

Flackus commented 8 years ago

Can be unstably reproduced with something like that (sleep should be adjusted): x=1; while [ $x -le 10 ]; do printf "\n" >> ./desktop.bundles/common/_common.priv.js; sleep 0.3; x=$(($x+1)); done

You either get the same error as in https://github.com/nodules/luster-guard/issues/11#issuecomment-170884457 or these in debug:

2016-01-25 15:03:15.734 [master] DEBUG FS event <change> on file "desktop.bundles/common/_common.priv.js"
2016-01-25 15:03:15.839 [master] DEBUG WorkerWrapper(1).emit(disconnect)
2016-01-25 15:03:15.840 [master] DEBUG WorkerWrapper(1).emit(state, stopping)
2016-01-25 15:03:15.840 [master] DEBUG Master(0).emit(worker state, [object Object], stopping)
2016-01-25 15:03:15.840 [master] DEBUG Master(0).emit(worker disconnect, [object Object])
2016-01-25 15:03:15.864 [master] DEBUG WorkerWrapper(1).emit(exit, 0, )
2016-01-25 15:03:15.865 [master] DEBUG WorkerWrapper(1).emit(state, stopped)
2016-01-25 15:03:15.865 [master] DEBUG Master(0).emit(worker state, [object Object], stopped)
2016-01-25 15:03:15.866 [master] DEBUG Master(0).emit(worker exit, [object Object], 0, )
2016-01-25 15:03:15.874 [master] DEBUG WorkerWrapper(1).emit(fork)
2016-01-25 15:03:15.874 [master] DEBUG Master(0).emit(worker fork, [object Object])
2016-01-25 15:03:15.875 [master] DEBUG WorkerWrapper(1).emit(state, launching)
2016-01-25 15:03:15.875 [master] DEBUG Master(0).emit(worker state, [object Object], launching)
2016-01-25 15:03:15.925 [master] DEBUG WorkerWrapper(1).emit(online)
2016-01-25 15:03:15.925 [master] DEBUG WorkerWrapper(1).emit(state, running)
2016-01-25 15:03:15.926 [master] DEBUG Master(0).emit(worker state, [object Object], running)
2016-01-25 15:03:15.926 [master] DEBUG Master(0).emit(worker online, [object Object])
2016-01-25 15:03:15.973 [work:1] Worker(12).emit(configured)
2016-01-25 15:03:16.038 [master] DEBUG FS event <change> on file "desktop.bundles/common/_common.priv.js"
2016-01-25 15:03:16.097 [master] DEBUG WorkerWrapper(1).emit(message, [object Object], )
2016-01-25 15:03:16.097 [master] DEBUG Master(0).emit(worker message, [object Object], [object Object], )
2016-01-25 15:03:16.097 [master] DEBUG Master(0).emit(worker configured, [object Object])
2016-01-25 15:03:16.100 [work:1] Worker(1).emit(foreign properties received)
2016-01-25 15:03:16.101 [work:1] Worker(1).emit(extension loaded, luster-guard)
2016-01-25 15:03:16.102 [master] DEBUG WorkerWrapper(1).emit(message, [object Object], )
2016-01-25 15:03:16.102 [master] DEBUG Master(0).emit(worker message, [object Object], [object Object], )
2016-01-25 15:03:16.102 [master] DEBUG Master(0).emit(worker extension loaded, [object Object], luster-guard)
2016-01-25 15:03:16.103 [work:1] DEBUG Worker(1).emit(extension loaded, luster-log-file)
2016-01-25 15:03:16.103 [master] DEBUG WorkerWrapper(1).emit(message, [object Object], )
2016-01-25 15:03:16.104 [master] DEBUG Master(0).emit(worker message, [object Object], [object Object], )
2016-01-25 15:03:16.104 [master] DEBUG Master(0).emit(worker extension loaded, [object Object], luster-log-file)
2016-01-25 15:03:16.104 [work:1] DEBUG Worker(1).emit(extension loaded, luster-log-stat)
2016-01-25 15:03:16.105 [master] DEBUG WorkerWrapper(1).emit(message, [object Object], )
2016-01-25 15:03:16.105 [master] DEBUG Master(0).emit(worker message, [object Object], [object Object], )
2016-01-25 15:03:16.105 [master] DEBUG Master(0).emit(worker extension loaded, [object Object], luster-log-stat)
2016-01-25 15:03:16.106 [work:1] DEBUG Worker(1).emit(initialized)
2016-01-25 15:03:16.106 [master] DEBUG WorkerWrapper(1).emit(message, [object Object], )
2016-01-25 15:03:16.106 [master] DEBUG Master(0).emit(worker message, [object Object], [object Object], )
2016-01-25 15:03:16.107 [master] DEBUG Master(0).emit(worker initialized, [object Object])
2016-01-25 15:03:16.107 [master] DEBUG WorkerWrapper(1).emit(initialized)
2016-01-25 15:03:16.107 [master] DEBUG WorkerWrapper(1).emit(ready)
2016-01-25 15:03:16.107 [master] DEBUG Master(0).emit(worker ready, [object Object])
2016-01-25 15:03:16.107 [master] DEBUG Master(0).emit(restarted)
2016-01-25 15:03:16.339 [master] DEBUG FS event <change> on file "desktop.bundles/common/_common.priv.js"
2016-01-25 15:03:16.400 [work:1] DEBUG Worker(1).emit(worker loaded, /home/flack/www/realty2/app/index.js)
2016-01-25 15:03:16.402 [master] DEBUG WorkerWrapper(1).emit(listening, [object Object])
2016-01-25 15:03:16.403 [master] DEBUG Master(0).emit(worker listening, [object Object], [object Object])
2016-01-25 15:03:16.403 [master] DEBUG WorkerWrapper(1).emit(disconnect)
2016-01-25 15:03:16.403 [master] DEBUG WorkerWrapper(1).emit(state, stopping)
2016-01-25 15:03:16.404 [master] DEBUG Master(0).emit(worker state, [object Object], stopping)
2016-01-25 15:03:16.404 [master] DEBUG Master(0).emit(worker disconnect, [object Object])
2016-01-25 15:03:16.410 [master] DEBUG WorkerWrapper(1).emit(exit, 0, )
2016-01-25 15:03:16.411 [master] DEBUG WorkerWrapper(1).emit(state, stopped)
2016-01-25 15:03:16.411 [master] DEBUG Master(0).emit(worker state, [object Object], stopped)
2016-01-25 15:03:16.411 [master] DEBUG Master(0).emit(worker exit, [object Object], 0, )
2016-01-25 15:03:16.411 [master] DEBUG Master(0).emit(shutdown)

luster config:

module.exports = {
    app: require.resolve('app'),
    workers: 1,
    control: {
        forkTimeout: 5000,
        stopTimeout: 10000,
        exitThreshold: 15000,
        allowedSequentialDeaths: 10
    },
    extensions: {
        'luster-guard': {
            path: '../..',
            debounce: 100,
            interval: 500,
            patterns: [
                'app/**/*.js',
                '*.bundles/**/_*.priv.js'
            ]
        }
    }
};
kaero commented 8 years ago

@Flackus pls, try to set control.allowedSequentialDeaths to Infinity (or very large value), or try lower values for exitThreshold (about 1 second or less).

explanation: it looks like regular shutdown of the master process (by mesaages in the debug output) in the case when all workers are dead. All workers can be marked as dead due to the series of restarts triggered by the luster-guard extension.

Flackus commented 8 years ago

@kaero yep, seems to do the trick.

Flackus commented 8 years ago

@kaero still not sure if it helps with https://github.com/nodules/luster-guard/issues/11#issuecomment-170884457

dfilatov commented 8 years ago

@Flackus @kaero It definitely doesn't help with Error: channel closed

kaero commented 8 years ago

@dfilatov

It definitely doesn't help with Error: channel closed

This error is a side-effect of the sending a message from the master to the early failed worker before the dispatching of the exit event. Please, look for errors before which may cause it. Or it may be thrown due to multiple worker restarts in sequence: master want send a message to a worker, but worker has been killed to restart again.

This concrete error looks like an issue of the luster, but it's not trivial to fix, nor critical, as i think.

kaero commented 8 years ago

@dfilatov btw, is the subject of this issue was fixed or still actual?

kaero commented 8 years ago

ping

dfilatov commented 8 years ago

This subject has fixed

dfilatov commented 8 years ago

But what about given error? Are there any plans to fix it in luster?

kaero commented 8 years ago

@dfilatov yep, see linked issue