PacificBiosciences / FALCON

FALCON: experimental PacBio diploid assembler -- Out-of-date -- Please use a binary release: https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries

use FALCON in PBS grid #622

Closed tangerzhang closed 6 years ago

tangerzhang commented 6 years ago

Hi, I used the config file you suggested (attached below) on a PBS system, but FALCON only runs locally and does not submit any jobs to the PBS grid. Any suggestions? Thanks!

[General]
use_tmpdir = ./tmp
job_type = PBS
pwatcher_type = blocking
job_type = string
job_queue = bash -C ${CMD}
job_queue = bash -C ${CMD} > ${STDOUT_FILE} 2> ${STDERR_FILE}

input_fofn = input.fofn
#input_fofn = preads.fofn

input_type = raw
#input_type = preads

length_cutoff = 12000

length_cutoff_pr = 12000

jobqueue = high
sge_option_da = -l nodes=1:ppn=8 -q %(jobqueue)s
sge_option_la = -l nodes=1:ppn=2 -q %(jobqueue)s
sge_option_pda = -l nodes=1:ppn=8 -q %(jobqueue)s
sge_option_pla = -l nodes=1:ppn=8 -q %(jobqueue)s
sge_option_fc = -l nodes=1:ppn=24 -q %(jobqueue)s
sge_option_cns = -l nodes=1:ppn=8 -q %(jobqueue)s

pa_concurrent_jobs = 32
ovlp_concurrent_jobs = 32
pa_concurrent_jobs = 6
ovlp_concurrent_jobs = 6

pa_HPCdaligner_option =  -v -B4 -t16 -e.70 -l1000 -s1000
ovlp_HPCdaligner_option = -v -B4 -t32 -h60 -e.96 -l500 -s1000

pa_DBsplit_option = -x500 -s50
ovlp_DBsplit_option = -x500 -s50

falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 6

overlap_filtering_setting = --max_diff 100 --max_cov 100 --min_cov 20 --bestn 10 --n_core 24

pb-cdunn commented 6 years ago

You can try:

[General]
pwatcher_type = fs_based
job_type = sge

The default might work for you.

But if you know how to submit jobs to your PBS, you can specify that explicitly, something like:

[job.defaults]
submit = qsub -S /bin/bash -V -q myqueue \
  -N ${JOB_NAME}        \
  -o "${JOB_STDOUT}" \
  -e "${JOB_STDERR}" \
  -pe smp ${NPROC}    \
  "${JOB_SCRIPT}"
kill = qdel ${JOB_NAME}

[General]
pwatcher_type = fs_based
job_type = pbs

Or, more simply, as "blocking" calls with -W block=true, so that each qsub call returns only when its job has finished:

[job.defaults]
submit = qsub -S /bin/bash -W block=true -V -q myqueue \
  -N ${JOB_NAME}        \
  -o "${JOB_STDOUT}" \
  -e "${JOB_STDERR}" \
  -pe smp ${NPROC}    \
  "${JOB_SCRIPT}"

[General]
pwatcher_type = blocking
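
To make the placeholders in the submit string concrete, here is a minimal Python sketch of how such a template could be expanded into an actual qsub command line. It only illustrates the substitution idea and is not FALCON's own implementation; the job name, file names, and core count are made-up values, and `render_submit` is a hypothetical helper.

```python
import shlex
from string import Template

# The qsub command from the blocking example above, joined onto one line.
SUBMIT = Template(
    'qsub -S /bin/bash -W block=true -V -q myqueue '
    '-N ${JOB_NAME} -o "${JOB_STDOUT}" -e "${JOB_STDERR}" '
    '-pe smp ${NPROC} "${JOB_SCRIPT}"'
)


def render_submit(job_name, stdout, stderr, nproc, script):
    """Fill in the ${...} placeholders and return the command string."""
    return SUBMIT.substitute(
        JOB_NAME=job_name,
        JOB_STDOUT=stdout,
        JOB_STDERR=stderr,
        NPROC=nproc,
        JOB_SCRIPT=script,
    )


# Hypothetical values, for illustration only.
cmd = render_submit("falcon_da_0001", "run.out", "run.err", 8, "run.sh")
print(cmd)

# Split the way a shell would, to inspect the individual arguments.
argv = shlex.split(cmd)
print(argv)
```

Printing the rendered command and running it by hand is a quick way to check that your scheduler accepts every flag before FALCON uses the template.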

Feel free to update the PBS section here:

akaraw commented 5 years ago

Hi everyone,

Could someone please give me some insight into allocating job times in test.cfg for a run on a PBS cluster? My estimated genome size is 1.1GB and I have raw read coverage of 70x. I selected 30x seed read coverage.

Thank you in advance,

AJ

pb-cdunn commented 5 years ago

The time for each job is controlled by the block size (the -s value, in Mbp, in pa_DBsplit_option and ovlp_DBsplit_option). Smaller blocks => more but quicker jobs.
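
As a rough illustration of that trade-off, the Python sketch below estimates how many blocks a data set of the size mentioned above would be split into for a few block sizes. It rests on several assumptions: "1.1GB" is read as 1.1 Gbp, all raw reads are kept (in practice the -x length cutoff discards some), and the block-pair count is treated as an upper bound in which each block is compared with itself and every earlier block. The helper names are hypothetical.

```python
import math


def estimate_blocks(total_bases, block_size_mbp):
    """Blocks needed if each block holds about block_size_mbp Mbp."""
    return math.ceil(total_bases / (block_size_mbp * 1_000_000))


def estimate_block_pairs(n_blocks):
    """Upper bound on block-vs-block comparisons (assumed all-pairs)."""
    return n_blocks * (n_blocks + 1) // 2


genome_size = 1_100_000_000   # assumed: 1.1 Gbp genome
coverage = 70                 # raw read coverage from the question
total_bases = genome_size * coverage

for block_size in (50, 200, 400):  # candidate -s values, in Mbp
    n_blocks = estimate_blocks(total_bases, block_size)
    pairs = estimate_block_pairs(n_blocks)
    print(f"-s{block_size}: ~{n_blocks} blocks, up to {pairs} block pairs")
```

Each block pair covers less sequence when blocks are small, so individual jobs finish sooner, but the number of jobs the scheduler has to handle grows quickly.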