Closed by cbun 10 years ago
Cool. Please add a comment when this gets deployed on dev.
What's the unit for `running_job_freq=43`?
If the user job limit is reached, is the idea that the server raises HTTPError(403, "User Job limit reached"), which the client code should be able to catch and print?
The unit is seconds. I just made it some arbitrary number that's larger than the compute's heartbeat (29). The client should catch, but this isn't implemented yet.
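For reference, a minimal sketch of the purge check as I understand it, not the actual control server code. It assumes a job counts as stale once its last heartbeat is older than the purge interval; the `running_jobs` dict layout and the function name `purge_stale_jobs` are hypothetical.

```python
import time

HEARTBEAT_INTERVAL = 29  # seconds between compute server heartbeats (from this thread)
RUNNING_JOB_FREQ = 43    # seconds between purge passes (from this thread)


def purge_stale_jobs(running_jobs, now=None, max_age=RUNNING_JOB_FREQ):
    """Split jobs into (live, stale) by the age of their last heartbeat.

    running_jobs maps job_id -> timestamp of the last heartbeat seen.
    This layout is an assumption made for illustration.
    """
    if now is None:
        now = time.time()
    live, stale = {}, []
    for job_id, last_seen in running_jobs.items():
        if now - last_seen > max_age:
            stale.append(job_id)       # no heartbeat within the window
        else:
            live[job_id] = last_seen   # heartbeat is recent enough
    return live, stale
```

Because the interval (43) is larger than the heartbeat period (29), a healthy job should always have reported at least once between two purge passes.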
Ok. I think the client already catches cherrypy HTTP exceptions with the message.
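A rough sketch of the limit check and the client-side catch discussed above, under stated assumptions. `JobLimitReached` is a hypothetical stand-in for the cherrypy HTTP error so the snippet runs without cherrypy, and `check_job_limit` and `submit` are illustrative names, not functions from the codebase. It also assumes a submission is rejected once queued plus running jobs already equal the limit.

```python
RUNNING_JOB_LIMIT = 10  # default limit mentioned in this thread


class JobLimitReached(Exception):
    """Hypothetical stand-in for HTTPError(403, "User Job limit reached")."""

    status = 403

    def __init__(self, message="User Job limit reached"):
        super().__init__(message)
        self.message = message


def check_job_limit(queued, running, limit=RUNNING_JOB_LIMIT):
    """Server side: reject a new job when queued + running is at the limit."""
    if queued + running >= limit:
        raise JobLimitReached()


def submit(queued, running):
    """Client side: catch the error and return its status and message."""
    try:
        check_job_limit(queued, running)
    except JobLimitReached as err:
        return "{} {}".format(err.status, err.message)
    return "submitted"
```

With these assumptions, `submit(3, 4)` goes through, while `submit(6, 4)` returns the 403 message instead of raising.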
Deployed ar-ctrl-edge/bigmem
The control server now keeps track of running jobs in real time by checking for a heartbeat from the compute server. Stale jobs are purged at a configurable frequency, `arast.conf.monitor.running_job_freq`. Users are limited to a combined queued + running job count of `arast.conf.monitor.running_job_limit` (default 10). Current counts of the system can be seen via REST: