Closed c4derpillar closed 5 years ago
I've attached a patched file that should fix this, but, unfortunately, I have no way to test. Uncompress it in src/pipelines/canu/, then 'make' in src/ to install it.
To restart, remove the correction/0-mercounts directory (there are only some shell scripts in there right now) and then rerun the same canu command.
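The restart step can be sketched in shell as below. This is only a sketch: reset_mercounts is a hypothetical helper name (it is not part of canu), and its argument is assumed to be the assembly directory that contains correction/.

```shell
# Hypothetical helper (not part of canu): remove the stale mercounts stage
# so that the next canu run regenerates its shell scripts.
reset_mercounts() {
    asm_dir=$1    # assumed: the assembly directory that contains correction/
    stage="$asm_dir/correction/0-mercounts"
    if [ ! -d "$stage" ]; then
        echo "nothing to remove: $stage" >&2
        return 1
    fi
    rm -r -- "$stage"
}
```

After calling it on the assembly directory, rerun the same canu command as before.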
Hi! I have also gotten the same error as c4derpillar ("failed to find the number of jobs in 'correction/0-mercounts/meryl-count.sh'") with v1.8, but I don't get the error with v1.7.1 using the exact same canu commands. I tried to use the Execution.pm you provided above (thank you!), but unfortunately, when I try to install v1.8 from source I get the error "make: *** No rule to make target `install'.", and it creates the Linux-amd64/bin folder but it is empty; I don't get this error when I install v1.7.1 from source though. If this is an unrelated issue and should be listed as a new issue, please let me know.
@manabanana does this occur when doing just 'make'? I think you're doing 'make install', which isn't needed (or supported). If it's still failing, yes, please do make a new issue. But if it works, and the patch works, hooray!
@brianwalenz Unfortunately, the installation does not occur when I just use 'make' either. It creates the Linux-amd64/bin folder but the bin folder is empty. I will start a new issue.
Thanks for your help Brian, it is now submitting the jobs, but still fails at a later stage of Meryl.
It is creating two output files now: meryl-count.jobSubmit-01.out and ecoli.ms16.config.01.out
EDIT:
It looks like this is the issue now; this is after deleting the 'ecoli-oxford' folder every time I re-run the command. The meryl-count.sh file is there, and the config file has detected the working directory correctly:
########################### Execution Started #############################
JobId:66424.flm1
UserName:taylorwass
GroupName:qris-uq
ExecutionHost:fl018
###############################################################################
pbs_mom, exec of ./meryl-count.sh failed with error: No such file or directory
########################### Job Execution History #############################
JobId:66424.flm1
UserName:taylorwass
GroupName:qris-uq
JobName:meryl_ecoli
SessionId:18926
ResourcesRequested:mem=100gb,ncpus=4,place=free,walltime=20:00:00
ResourcesUsed:cpupercent=0,cput=00:00:00,mem=0kb,ncpus=4,vmem=0kb,walltime=00:00:02
QueueUsed:Short
AccountString:UQ-AIBN
ExitStatus:254
###############################################################################
As far as I can tell, everything is fine in ecoli.ms16.config.01.out, and a third job is started; however, it ends after 2 seconds and I am unable to find its output/error files. After waiting an hour, no new files have appeared in canu-logs or in 0-mercounts.
Unfortunately our PBS setup does not allow for searching active/historical jobs by username, so it is a bit hard to keep track of what is running/what stage it is failing at.
Thank you!
@brianwalenz
bf5a93b fixed correction/0-mercounts/meryl-configure.sh, but broke other commands, including the following correction/0-mercounts/meryl-count.sh and the master script canu-scripts/canu-01.sh.
It failed to cd to the working directory, and
rm -f canu.out
ln -s canu-scripts/canu.01.out canu.out
created the empty link under my $HOME. I guess that $PBS_ARRAY_INDEX cannot suit every command.
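The stray link in $HOME is what happens when the rm and ln commands keep running after a failed cd. A minimal defensive sketch follows; relink_canu_out is a hypothetical name, this is not what canu's generated script does, and the working directory is assumed to be passed in explicitly.

```shell
# Hypothetical wrapper: only touch canu.out if the cd actually succeeded.
relink_canu_out() {
    workdir=$1
    (
        # Stop here instead of continuing in whatever directory the job started in.
        cd "$workdir" || exit 1
        rm -f canu.out
        ln -s canu-scripts/canu.01.out canu.out
    )
}
```

Under PBS, the directory from which qsub was run is available to the job as $PBS_O_WORKDIR, so a call such as relink_canu_out "$PBS_O_WORKDIR" would be one way to use it. The subshell keeps the caller's own working directory unchanged.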
I have the same
ABORT: failed to find the number of jobs in 'unitigging/0-mercounts/meryl-count.sh'.
problem with PBSpro here. I start the job from within the working directory so IMO the issue is elsewhere.
In the cwd I have many tt_16D1C3L12.ms22.config.*.out files, each ending with:
Don't know what to do with '../../tt_16D1C3L12.seqStore'.
The files contain just the general help text on how to call meryl, but NOT the actual (broken) command with its arguments, so I am blind.
Also, the ./unitigging/0-mercounts/meryl-count.sh file contains no value for the memory= argument:
# And compute.
/scratch/work/project/bio/canu-1.8/Linux-amd64/bin/meryl k=22 threads=8 memory= \
count \
segment=$jobid/ ../../tt_16D1C3L12.seqStore \
output ./tt_16D1C3L12.$jobid.meryl.WORKING \
&& \
mv -f ./tt_16D1C3L12.$jobid.meryl.WORKING ./tt_16D1C3L12.$jobid.meryl
exit 0
I used gridEngine="pbspro" gridOptions="-A xx-xx -q qlong" gridEngineThreadsOption="-l select=1:ncpus=THREADS,walltime=144:00:00" useGrid=True
as options to canu-1.8.
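The empty memory= argument in the generated script can be checked for mechanically before any job is submitted. The sketch below assumes only that the option is written as memory= followed by whitespace or end of line, as in the listing above; has_empty_memory is a hypothetical helper name.

```shell
# Hypothetical check: succeed if the given script passes memory= with no value.
has_empty_memory() {
    # $1: path to a generated script such as meryl-count.sh
    grep -Eq '(^|[[:space:]])memory=([[:space:]]|$)' "$1"
}
```

For example, has_empty_memory unitigging/0-mercounts/meryl-count.sh && echo "memory= is empty" would flag the script shown above.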
BTW, the documentation is insufficient. I allocated 10 nodes with 24 CPUs each. Canu top-level process properly recorded:
-- Detected 24 CPUs and 126 gigabytes of memory.
-- Detecting PBSPro resources.
--
-- Found 1 host with 8 cores and 123 GB memory under PBSPro control.
-- Found 2 hosts with 28 cores and 504 GB memory under PBSPro control.
-- Found 12 hosts with 8 cores and 247 GB memory under PBSPro control.
-- Found 1007 hosts with 24 cores and 125 GB memory under PBSPro control.
But should the gridEngineThreadsOption option define values for executing a "task" on a single node or on all 10 of those nodes? How does that correlate with the resources actually assigned to my PBS job?
$ qstat -f 8864969.isrv5
...
Resource_List.ncpus = 240
Resource_List.nodect = 10
Resource_List.select = 10:ncpus=24
Resource_List.walltime = 144:00:00
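For reference, the arithmetic behind the qstat output above can be sketched as follows. It assumes the PBS Pro form select=N:ncpus=M with an explicit leading chunk count N, where each chunk gets M CPUs; select_total_ncpus is a hypothetical helper. It says nothing about how canu schedules its tasks. It only shows that 10:ncpus=24 accounts for ncpus = 240, whereas a request of the form select=1:ncpus=THREADS covers a single chunk.

```shell
# Hypothetical helper: total CPUs requested by a single-chunk-type select spec.
# Assumes an explicit leading count, e.g. "10:ncpus=24".
select_total_ncpus() {
    spec=$1
    chunks=${spec%%:*}          # text before the first ':' is the chunk count
    ncpus=${spec##*ncpus=}      # text after 'ncpus='
    ncpus=${ncpus%%[:+]*}       # drop any further resources or chunk types
    echo $(( chunks * ncpus ))
}
```

Calling select_total_ncpus 10:ncpus=24 prints 240, matching Resource_List.ncpus.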
Same here. Traced the problem back to Meryl.pm, where in this line
print STDERR "-- Will count kmers using $merylSegments jobs, each using $merylMemory GB and $thr threads.\n";
both $merylMemory and $merylSegments are still undefined.
This leads to memory= without a value when meryl-count.sh is generated, probably from here:
print F "$bin/meryl -C k=$merSize threads=$thr memory=$mem \\\n";
Will give 1.7 a try, thanks for mentioning that it works.
Potentially fixed in e13467a0ada171b9f70dd9ea615452cd707ea0ac and df2dbf1df0fa8fd0a98a6f281dafcb131f57dc64.
Hi there,
Canu runs on one node of my HPC; however, it fails when used with the PBSPro grid. Jobs seem to start, and then Meryl errors out, being unable to 'find the number of jobs'.
Any idea what is happening? There is a strange line in ecoli5.ms16.config.01.out:
My canu.out looks like this:
Thanks!