marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

unknown error #1376

Closed apoosakkannu closed 5 years ago

apoosakkannu commented 5 years ago

Hi, I got an unknown error while running Canu. I have attached the error in a text file. Could you please tell me the possible reason for it? Thanks in advance. CANU_error_2905209.txt

skoren commented 5 years ago

The error message is in the log:

qsub: Old syntax rejected. Please use 'select' syntax.

Your grid doesn't like Canu's submit command. Make sure you're using the latest code from the repo and update gridEngineResourceOption to match what your grid expects. Based on other PBS systems, something like gridEngineResourceOption="-l select=1:ncpus=THREADS:mem=MEMORY" should work, but you should adjust it as needed to conform to your grid. THREADS and MEMORY are replaced automatically by Canu at runtime.
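To make the placeholder substitution described above concrete, here is a minimal sketch. It only mimics the idea with `sed`; `expand_resource_option` is a hypothetical helper and is not how Canu implements the replacement internally. The thread count and memory value are arbitrary.

```shell
# Illustration only: mimics the THREADS/MEMORY substitution described above.
# expand_resource_option is a hypothetical helper, not part of Canu.
expand_resource_option() {
  # $1 = option template, $2 = thread count, $3 = memory value
  printf '%s\n' "$1" | sed -e "s/THREADS/$2/g" -e "s/MEMORY/$3/g"
}

# Example with arbitrary values: 4 threads, memory value 16.
expand_resource_option "-l select=1:ncpus=THREADS:mem=MEMORY" 4 16
```

The template stays the same for every job; only the two placeholders change per job, which is why the option is written once with THREADS and MEMORY rather than fixed numbers.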

You've also got a lot of non-default options set. I wouldn't recommend running the first assembly that way; I would stick with the defaults or one of the recommendations in the FAQ.

apoosakkannu commented 5 years ago

Thanks. May I use the following simple command?

    canu \
      -p assembly \
      -d results \
      -nanopore-raw nanopore_raw.fastq \
      genomeSize=108m \
      gridEngineResourceOption="-l nodes=1:ppn=THREADS:mem=MEMORY"

Is it OK?

skoren commented 5 years ago

Yes, though you may need to change nodes to select and ppn to ncpus, to match however you'd normally submit a job to your grid.

apoosakkannu commented 5 years ago

    (STRETCH)apoosakkannu@skirit:~$ canu \
      -p assembly \
      -d results \
      -nanopore-raw nanopore_raw.fastq \
      -genomeSize=108m \
      gridEngineresourceOption="-l node=select:ncpus=THREADS:mem=MEMORY

I did something like the above, but I am not able to submit the command. I am sorry, I am new to this kind of analysis. Could you give me some more information?

skoren commented 5 years ago

You'll have to ask your grid administrator how to submit jobs; I don't know the configuration of your grid and don't have access to a PBS Pro system.

You could also run with useGrid=false on a reserved node but that will be slower since you'll be limited to one machine.

apoosakkannu commented 5 years ago

Hi, the following is an example of a job submission on my server. Could you please tell me what modifications are required for Canu?

    #!/bin/bash
    #PBS -N myFirstJob
    #PBS -l select=1:ncpus=4:mem=4gb:scratch_local=10gb
    #PBS -l walltime=1:00:00
    # Options above are for the scheduling system: the job will run for 1 hour at most; 1 machine with 4 processors + 4 GB RAM + 10 GB scratch space is requested

    module add g03                                         # loads the Gaussian application modules, version 03
    trap 'clean_scratch' TERM EXIT                         # sets up SCRATCH cleaning in case of an error
    cd $SCRATCHDIR || exit 1                               # enters the user's scratch directory
    cp /software/testData/gaussian_test.com $SCRATCHDIR    # gets the job's input data

    g03 <gaussian_test.com >results.out                    # starts the Gaussian application and saves output into results.out

    cp results.out /home/$USER || export CLEAN_SCRATCH=false    # moves the output data to the user's home directory, or leaves it in SCRATCH if an error occurred

skoren commented 5 years ago

So in that case, what I was suggesting should work: gridEngineResourceOption="-l select=1:ncpus=THREADS:mem=MEMORYgb". You may also want to add gridOptions="-l walltime=72:00:00" to increase the default runtime given to the jobs. If that gives you an error, post the error message.
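Putting the two options from this reply together with the command discussed earlier in the thread, the full invocation might look like the sketch below. It is a dry run that only prints the command; the prefix, directory, genome size, and read file are the values used earlier in this thread, and `print_canu_cmd` is a hypothetical wrapper added for illustration.

```shell
# Dry-run sketch: prints the Canu command instead of running it.
# Prefix, directory, genome size and read file come from the commands
# earlier in this thread; adjust them for your own data and grid.
print_canu_cmd() {
  cat <<'EOF'
canu \
  -p assembly \
  -d results \
  genomeSize=108m \
  gridEngineResourceOption="-l select=1:ncpus=THREADS:mem=MEMORYgb" \
  gridOptions="-l walltime=72:00:00" \
  -nanopore-raw nanopore_raw.fastq
EOF
}

print_canu_cmd
```

Once the printed command looks right for your grid, it can be pasted into the login-node shell. The quotes around each option value matter, since the values contain spaces.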

apoosakkannu commented 5 years ago

I managed to submit my job on the server, but got an error (attached). Could you please give me some information and a possible way to overcome it? CANU_error_3005209.txt

skoren commented 5 years ago

This looks like you have an older version of Canu, perhaps the 1.8 release, which had some bugs in PBS support (see #1138). You want to get the latest code from the tip of the repository, not a release, and compile it as I mentioned above, since there have been several PBS fixes since the 1.8 release.

apoosakkannu commented 5 years ago

Hi, I ran Canu and it completed very quickly. I have attached the log file for your consideration. Could you please check whether everything seems OK? CANU_log_06062019.txt

skoren commented 5 years ago

Yes, Canu ran part of the pipeline and submitted the next steps to the grid. You can see this at the end of the output:

-- Running jobs.  First attempt out of 2.
--
-- 'meryl-count.jobSubmit-01.sh' -> job 2742099[].wagap-pro.cerit-sc.cz tasks 1-4.
--
----------------------------------------
-- Starting command on Thu Jun  6 15:36:09 2019 with 37226.208 GB free disk space

    cd /mnt/storage-brno2/home/apoosakkannu/polyplaxresults
    qsub \
      -j oe \
      -W depend=afterany:2742099[].wagap-pro.cerit-sc.cz \
      -l select=1:ncpus=1:mem=4g   \
      -N 'canu_polyplaxassembly_canu' \
      -o canu-scripts/canu.01.out  canu-scripts/canu.01.sh
2742100.wagap-pro.cerit-sc.cz

-- Finished on Thu Jun  6 15:36:09 2019 (fast as lightning) with 37226.208 GB free disk space

The assembly is not yet done. This is the way Canu runs: it submits computation jobs to the grid, followed by the executive script that checks the computation results. The latest iteration's output is always named canu.out in your run folder. Once it reports "Bye", the assembly is complete. Since the initial PBS issue is resolved, I'll close this issue. If you encounter other errors with this assembly, open a new issue.
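A quick way to check for the completion marker described above is sketched here. `canu_finished` is a hypothetical helper, not something Canu ships; it assumes only what is stated in this thread, namely that the latest output is in canu.out inside the run folder and that it contains "Bye" when the assembly is done.

```shell
# Hypothetical helper: reports whether a Canu run directory looks finished,
# based only on the "Bye" marker in canu.out mentioned above.
canu_finished() {
  # $1 = run directory (the -d argument given to canu)
  [ -f "$1/canu.out" ] && grep -q "Bye" "$1/canu.out"
}

# Example usage; "results" is the run directory used earlier in this thread.
if canu_finished results; then
  echo "assembly complete"
else
  echo "still running (or canu.out not found)"
fi
```

A negative result does not distinguish a run that is still in progress from one that stopped with an error, so it is still worth reading canu.out itself if the jobs have left the queue.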