SEL-Columbia / modelrunner

Framework for running models as long-running jobs via the web

Sunset workers #55

Open chrisnatali opened 8 years ago

chrisnatali commented 8 years ago

When a worker's models are out of date, it should be brought off-line in favor of new workers with up-to-date models.

Problems arise when jobs are running on workers scheduled to be sunsetted.

Does disabling access to Redis by removing the ufw 'allow' rule prevent a worker from taking new jobs off the queue? If so, this may be a way of taking out-of-date servers off-line, to be 'reaped' later.

chrisnatali commented 8 years ago

The following Redis commands may help:

CLIENT LIST: list the connected clients
CLIENT KILL: kill the specified client

I noticed that after terminating workers, 'zombie' connections from those workers were still open and needed to be killed.
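
A rough sketch of how that cleanup could be scripted, assuming the zombie connections can be identified by the terminated worker's IP address (an assumption on my part, not something modelrunner does today). The function name `kill_worker_connections` and the `worker_ip` parameter are hypothetical. `conn` is expected to expose `client_list()` and `client_kill(address)`, which redis-py's `redis.Redis` client provides as wrappers around CLIENT LIST and CLIENT KILL.

```python
def kill_worker_connections(conn, worker_ip):
    """Close every Redis client connection that originates from worker_ip.

    conn is assumed to behave like redis-py's redis.Redis:
      - client_list() returns a list of dicts, each with an "addr" key ("ip:port")
      - client_kill(address) closes the connection with that "ip:port" address
    Returns the list of addresses that were killed.
    """
    killed = []
    for client in conn.client_list():
        addr = client["addr"]
        # Split on the last colon so the port is removed even if the host
        # part itself contains colons.
        host, _, _port = addr.rpartition(":")
        if host == worker_ip:
            conn.client_kill(addr)
            killed.append(addr)
    return killed
```

With redis-py installed, this would be called as something like `kill_worker_connections(redis.Redis(host=...), "10.0.0.5")`, where the address is only a placeholder. Matching on IP alone would also kill connections from any other process on the same host, so it only makes sense once the worker itself has been terminated.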

chrisnatali commented 8 years ago

With the addition of the 'STOP_PROCESSING_QUEUE' command (part of the 0.5 release), this is partially addressed. Issuing 'STOP_PROCESSING_QUEUE' to a 'RUNNING' worker ensures that the worker will not pick up any new jobs while allowing it to complete its current job, which simplifies sunsetting.
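
To make the intended behavior concrete, here is a toy model of it. This is only an illustration, not the actual modelrunner worker code: the `Worker` class, its `accepting` flag, and the `run_once` method are all hypothetical names. The point it shows is that the command only blocks the next dequeue and never interrupts a job already in progress.

```python
from collections import deque


class Worker:
    """Toy worker (hypothetical, not modelrunner's real class)."""

    def __init__(self, job_queue):
        self.job_queue = job_queue
        self.accepting = True   # cleared by STOP_PROCESSING_QUEUE
        self.completed = []

    def handle_command(self, command):
        # The command only flips a flag; it does not touch the running job.
        if command == "STOP_PROCESSING_QUEUE":
            self.accepting = False

    def run_once(self, run_job):
        """Take and run one job if still accepting. Return True if a job ran."""
        if not self.accepting or not self.job_queue:
            return False
        job = self.job_queue.popleft()
        run_job(job)            # the command may arrive while this is running
        self.completed.append(job)
        return True
```

If the command arrives while a job is running, that job still finishes and is recorded, the next `run_once` call returns `False`, and any remaining jobs stay on the queue for other (up-to-date) workers.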

Leaving this open because this process still needs to be formalized.