bio-guoda / guoda-services

Services provided by GUODA, currently a container for tickets and wikis.
MIT License

job paused? #56

Closed diatomsRcool closed 5 years ago

diatomsRcool commented 5 years ago

My Jenkins job has been paused for 3 days and shows the message below. What does this mean?

(pending—idb-jupyter1.acis.ufl.edu is reserved for jobs with matching label expression; moose.acis.ufl.edu is reserved for jobs with matching label expression)

mjcollin commented 5 years ago

Do you have a label in your Jenkins job? For example: agent { label 'hdfs_nfs' }

There are several hosts available to run Jenkins jobs, and labels are used to pick which one. It sounds like Jenkins cannot find a host whose label matches the one configured for the job.
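For reference, here is a minimal sketch of where such a label goes in a declarative pipeline. It is an illustration, not the configuration of any job on archive.guoda.bio: the stage name and the echo step are placeholders, and 'hdfs_nfs' is only the example label mentioned above. The "reserved for jobs with matching label expression" wording in the pending message is what Jenkins reports for nodes set to build only jobs whose label expression matches them, so a job with no label at all would not be scheduled on those nodes.

```groovy
// Jenkinsfile (declarative pipeline); a sketch, not an actual job configuration.
pipeline {
    // Run only on agents that carry this label. 'hdfs_nfs' is the example
    // label from this thread; substitute the label of the host you want.
    agent { label 'hdfs_nfs' }

    stages {
        // Hypothetical stage name, used for illustration only.
        stage('Run') {
            steps {
                // Placeholder step: replace with the job's real work.
                echo 'Running on an agent that matches the label'
            }
        }
    }
}
```

For a freestyle job the equivalent setting should be the "Restrict where this project can be run" checkbox in the job configuration, which takes a label expression.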

Which job is this?

jhammock commented 5 years ago

Interesting. I think the queue is almost caught up now. A job I asked for (http://archive.guoda.bio/job/update%20monitors/120/) just got picked up, but it was waiting alongside http://archive.guoda.bio/job/ecoregion%20status%20checker/ and one or two others for a couple of days while Jenkins reported two idle machines. Is there a place in the Jenkins interface where we can check for labels, and if so, should I have done something differently?
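One possible way to check, assuming administrator access to the Jenkins Script Console (Manage Jenkins > Script Console): the sketch below prints each agent's labels and usage mode, followed by the reason Jenkins gives for every queued item. Labels should also be visible on each node's own page in the web interface. Whether the Script Console is open to regular users on archive.guoda.bio is an assumption.

```groovy
import jenkins.model.Jenkins

def jenkins = Jenkins.instance

// List every configured agent with its labels and usage mode.
// Mode EXCLUSIVE means the node only accepts jobs whose label expression matches it.
jenkins.nodes.each { node ->
    println "${node.nodeName}: labels='${node.labelString}' mode=${node.mode}"
}

// List queued items together with Jenkins' own explanation of why each is waiting.
jenkins.queue.items.each { item ->
    println "${item.task.name}: ${item.why}"
}
```

If a job without a label shows up in the second list while every agent in the first list is in EXCLUSIVE mode, that would explain idle machines alongside a waiting queue.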

diatomsRcool commented 5 years ago

Probably no label, but it never had one. This job has been running for a while and is only now having this problem. It is the ecoregion job.

jhpoelen commented 5 years ago

@diatomsRcool Looks like various jobs are waiting in line to be executed. As far as I can tell, the delay in execution is due to a feature in Jenkins designed to avoid overloading the system. Looks like we are experiencing the effects of increased usage of archive.guoda.bio!

jhpoelen commented 5 years ago

PS Please do not abort my job "preston-dataone" - due to the interesting architecture of dataone, this job is expected to run for another couple of weeks.

jhpoelen commented 5 years ago

@mjcollin if we can move the working dir of http://archive.guoda.bio/job/preston-dataone/ to the dedicated preston server, I'd be happy to configure the dataone data "observatory" on that machine. This way, we can free Jenkins from jobs that run for days or weeks.

jhammock commented 5 years ago

FWIW, jobs sent from the Jupyter terminal and from Jenkins now succeed for me, but effechecka queries are not working.

jhpoelen commented 5 years ago

Ok, so it sounds like jenkins is working as expected, just busy. @diatomsRcool if you agree, please close the issue.

For all other issues (e.g., jupyter, effechecka), I suggest opening new, specific issues.

jhpoelen commented 5 years ago

Closing issue. @diatomsRcool please re-open the issue if you feel that your issue has not been resolved.

diatomsRcool commented 5 years ago

When should I expect it to pick back up again? Still paused.

jhpoelen commented 5 years ago

From what I can tell, your job is just waiting in line, or "pending" (see attached screenshot), and will be executed whenever there's room. Perhaps @mjcollin can help with figuring out another way to better use available compute resources.

[Screenshot from 2018-10-30 08-45-02]