bio-guoda / guoda-services

Services provided by GUODA, currently a container for tickets and wikis.

jobs may not be getting into the queue #62

Closed jhammock closed 5 years ago

jhammock commented 5 years ago

In my latest round of updating resources for fresh data (started ~26 hours ago), I'm having trouble completing a makeparquet job. I have asked for it three times, each attempt separated by at least a few hours. I can't reach the old logs in my browser/terminal interface, but the most recent attempt went like this:

jhammock@idb-jupyter1:~$ curl -X POST http://mesos07.acis.ufl.edu:7077/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data @makeparquet.json
{
"action" : "CreateSubmissionResponse",
"message" : "Already reached maximum submission size",
"serverSparkVersion" : "2.2.0",
"success" : false
}

I'm pretty sure that on the first attempt, it was success: true. I've checked http://archive.guoda.bio/view/Fresh%20Data%20jobs/ a few times and never seen any sign of this job in the queue, but I'm not sure it's supposed to be visible there.
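For reference, a CreateSubmissionRequest payload for this kind of Spark REST endpoint generally looks like the sketch below; every path, class name, and property value here is a hypothetical placeholder, not the actual contents of makeparquet.json:

{
  "action" : "CreateSubmissionRequest",
  "appResource" : "hdfs:///path/to/job.jar",
  "mainClass" : "org.example.MakeParquet",
  "appArgs" : [ "hdfs:///path/to/meta.xml" ],
  "clientSparkVersion" : "2.2.0",
  "environmentVariables" : { },
  "sparkProperties" : {
    "spark.app.name" : "makeparquet",
    "spark.master" : "mesos://mesos07.acis.ufl.edu:7077",
    "spark.submit.deployMode" : "cluster"
  }
}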

jhpoelen commented 5 years ago

@jhammock spark has a queue and jenkins has a queue. The former has a UI, but you can't see it; the latter has a UI at http://archive.guoda.bio.

On another note - I am working on a tool, dwca2parquet, that allows you to run the job yourself on the command line with no waiting in line. If we can convince @mjcollin to install singularity on the jupyterdb servers, you should be able to create parquets using:

$ dwca2parquet [path to meta.xml in hdfs]

@mjcollin curious to hear your thoughts on this.

jhpoelen commented 5 years ago

@jhammock addendum to the queues - the spark queue is the one that indicated "Already reached maximum submission size". Is this urgent, or can we wait until a simpler method (e.g., a command line tool) is available?

jhammock commented 5 years ago

Assuming the urgency question applies to my ability to run this job at all: I have a user waiting for a fresh data update, but we can sit tight for a couple of weeks.

I'm less clear on the "no waiting" method you describe, @jhpoelen , and how it would affect my life. I presume if a job is not waiting, either it jumped the queue or it's using some other resources...

mjcollin commented 5 years ago

Install singularity on idb-jupyter1

jhammock commented 5 years ago

Just attempting next update of Fresh Data:

jhammock@idb-jupyter1:~$ curl -X POST http://mesos07.acis.ufl.edu:7077/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data @makeparquet.json
{
"action" : "CreateSubmissionResponse",
"message" : "Already reached maximum submission size",
"serverSparkVersion" : "2.2.0",
"success" : false
}

Not sure what to expect at this point, but it would help me to know what I ought to do in this case.

jhpoelen commented 5 years ago

Perhaps write a little script that keeps retrying every 10s or so until it gets "success" : true. This is related to #28.
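A minimal sketch of such a retry loop (assuming the same endpoint and makeparquet.json as above, and that the server formats the flag as "success" : true):

#!/bin/bash
# Keep resubmitting until the dispatcher reports "success" : true, pausing 10s between attempts.
until curl -s -X POST http://mesos07.acis.ufl.edu:7077/v1/submissions/create \
      --header "Content-Type:application/json;charset=UTF-8" \
      --data @makeparquet.json | grep -q '"success" : true'; do
  echo "submission rejected, retrying in 10s..."
  sleep 10
done
echo "submission accepted"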

jhpoelen commented 5 years ago

Or try to convince @mjcollin to install singularity on idb-jupyter1 and run dwca2parquet on the command line, bypassing the queuing mechanism. Another option is to run spark-shell and invoke dwca2parquet from there. For example, see https://github.com/bio-linker/dwca2parquet/blob/master/dwca2parquet.def#L15 ; the jar can be retrieved from https://github.com/bio-linker/dwca2parquet/blob/master/dwca2parquet.def#L41 .
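A rough sketch of the singularity route, assuming singularity is installed; the image name and the hdfs path are placeholders, not actual locations:

git clone https://github.com/bio-linker/dwca2parquet
cd dwca2parquet
# build the image from the recipe referenced above (typically needs root)
sudo singularity build dwca2parquet.simg dwca2parquet.def
# run the job directly from the command line, bypassing the submission queue
singularity run dwca2parquet.simg hdfs:///path/to/meta.xml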

jhpoelen commented 5 years ago

From what I can tell, you are competing for resources with http://archive.guoda.bio/job/ecoregion%20status%20checker/ @diatomsRcool , who is using a little script I wrote to automatically submit jobs.

jhammock commented 5 years ago

good to know the jobs won't just accumulate if I keep trying, thanks, @jhpoelen !

@diatomsRcool are your jobs still on a 12h on/12h off schedule of some kind?

diatomsRcool commented 5 years ago

Probably not - I'm trying to ram those ecoregions through just to get them done. I know you've been waiting forever.