arnikz opened this issue 3 years ago
Interestingly, I can't reproduce the issue using another workflow. For this one, all jobs completed successfully :smile:
$ snakemake --use-conda -j --cluster "xenon -vvv scheduler $SCH --location local:// submit --inherit-env --working-directory ."
...
[Tue Aug 24 05:51:28 2021]
rule copy_fastq:
input: input/fastq/sample1/R_2.fastq.bz2
output: output/sample1/R_2.fastq.bz2
log: logs/copy/sample1/R_2_fastq.bz2.log
jobid: 6
wildcards: prefix=sample1/R, pe_reads=2, suffix=fastq.bz2
resources: tmpdir=/tmp
Submitted job 6 with external jobid '05:51:29.217 [main] DEBUG n.e.x.a.s.ScriptingScheduler - creating sub scheduler for at adaptor at local://'.
However, I can't retrieve job accounting info:
xenon --json scheduler $SCH --location local:// list --identifier 6
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.gson.internal.bind.ReflectiveTypeAdapterFactory (file:/home/arnikz/miniconda3/envs/snakemake/lib/xenon-cli-3.0.5.jar) to field java.lang.Throwable.detailMessage
WARNING: Please consider reporting this to the maintainers of com.google.gson.internal.bind.ReflectiveTypeAdapterFactory
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
{
  "statuses": [
    {
      "jobIdentifier": "6",
      "state": "UNKNOWN",
      "exception": {
        "adaptorName": "at",
        "detailMessage": "Job 6 could not be found!",
        "stackTrace": [],
        "suppressedExceptions": []
      },
      "running": false,
      "done": false
    }
  ]
}
I found the culprit: it's related to the workflow itself requesting more cores than are available. Limiting the number of cores, e.g. snakemake -j 1 ..., solves the issue above.
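For reference, a rough sketch of the corrected invocation, assuming the same $SCH variable and cluster command as in the example above (the only change is the explicit core limit):

$ snakemake --use-conda -j 1 --cluster "xenon -vvv scheduler $SCH --location local:// submit --inherit-env --working-directory ."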
Good to hear you found the issue. Are you also able to retrieve the job status if the job runs successfully? If not we can look into that issue.
I'll also have a look at the error reporting when you request more cores than are available. It would be good if it were clear why the job failed.
> Are you also able to retrieve the job status if the job runs successfully? If not we can look into that issue.
Given the example above, my job completed successfully but the job status remains UNKNOWN. In addition, it would be nice to see the runtime, (peak) memory use, etc., as reported for the other schedulers.
Shall we close this job submission issue and open a new one for the job accounting?
I would have to look into getting the stats of the job. It would probably require some external application to provide these (like "time").
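As an illustration only: GNU time (the external "time" application mentioned above) can report wall-clock time and peak memory for an arbitrary command. A minimal sketch, assuming a hypothetical job script job_script.sh, of how such stats could be collected:

$ /usr/bin/time -v ./job_script.sh 2> job_stats.txt
$ grep -E "Elapsed|Maximum resident" job_stats.txt   # wall-clock time and peak RSS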
The job status should not be UNKNOWN. I'll have a look.
As it turns out, at itself was missing ;-)
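(In case others hit the same thing: a quick sanity check is to verify that the at binary exists on the submit host before running the workflow. The install line below is an assumption for Debian/Ubuntu-like systems; the package name may differ on other distributions.)

$ command -v at || echo "at is not installed"
$ sudo apt-get install at    # assumption: Debian/Ubuntu package name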
We should check this and give a proper error message if this is the case.
Hi,
I'm running into this issue with the sv-{callers,gen}/xenon workflows. Could you check the commands? Perhaps I missed something. Thanks.
However, this arg is listed below (related to #77):
Let's leave it out of the command line (related to #75):