PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains a list of PacBio packages available via conda.
BSD 3-Clause Clear License

[pb-assembly] I have a problem while running falcon locally #154

Closed changhan1110 closed 5 years ago

changhan1110 commented 5 years ago

Hi,

I am trying to run falcon locally to assemble the E. coli genome using the provided reads (https://pb-falcon.readthedocs.io/en/latest/tutorial.html).

I am struggling with the falcon configuration file: my job keeps stopping at the 0-rawreads/tan-split stage. I don't think it is an environment issue, because the tutorial (https://github.com/PacificBiosciences/pb-assembly#tutorial) worked when I ran it on SGE.

Could you please give me some idea of how to make this work?

Thanks, Changhan

falcon-kit 1.4.2
pypeflow 2.3.0
[General]
#job_type = SGE
job_type=local
# list of fasta files
input_fofn = input.fofn

# input type, raw or pre-assembled reads (preads, error corrected reads)
input_type = raw
#input_type = preads

# The length cutoff for seed reads used in error correction.
# "-1" indicates FALCON should calculate the cutoff using
# the user-defined genome length and coverage cutoff;
# otherwise, the user can specify a length cutoff in bp (e.g. 2000)
length_cutoff = 15000
genome_size = 4652500
#seed_coverage = 30

# The length cutoff used for overlapping the pre-assembled reads
length_cutoff_pr = 12000

## resource usage ##
jobqueue = bigmem
# grid settings for...
# daligner step of raw reads
sge_option_da = -pe smp 5 -q %(jobqueue)s
# las-merging of raw reads
sge_option_la = -pe smp 20 -q %(jobqueue)s
# consensus calling for preads
sge_option_cns = -pe smp 12 -q %(jobqueue)s
# daligner on preads
sge_option_pda = -pe smp 6 -q %(jobqueue)s
# las-merging on preads
sge_option_pla = -pe smp 16 -q %(jobqueue)s
# final overlap/assembly 
sge_option_fc = -pe smp 24 -q %(jobqueue)s

# job concurrency settings for...
# preassembly
pa_concurrent_jobs = 48
# consensus calling of preads
cns_concurrent_jobs = 48
# overlap detection
ovlp_concurrent_jobs = 48
# daligner parameter options for...
# https://dazzlerblog.wordpress.com/command-guides/daligner-command-reference-guide/
# initial overlap of raw reads
pa_daligner_option = -v -B4 -t16 -e.70 -l1000 -s1000
# overlap of preads
ovlp_daligner_option = -v -B4 -t32 -h60 -e.96 -l500 -s1000

# parameters for creation of dazzler database of...
# https://dazzlerblog.wordpress.com/command-guides/dazz_db-command-guide/
# raw reads
pa_DBsplit_option = -x500 -s50
# preads
ovlp_DBsplit_option = -x500 -s50

# settings for consensus calling for preads
falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 6

# setting for filtering of final overlap of preads
overlap_filtering_setting = --max_diff 100 --max_cov 100 --min_cov 20 --bestn 10 --n_core 24
2019-05-29 12:15:38,227 - pwatcher.fs_based:614 - ERROR - Failed to kill job for heartbeat 'heartbeat-P9d82dd8145deda' (which might mean it was already gone): FileNotFoundError(2, 'No such file or directory')
Traceback (most recent call last):
  File "/BiO/Access/changhan/miniconda3/lib/python3.6/site-packages/pypeflow/simple_pwatcher_bridge.py", line 278, in refreshTargets
    self._refreshTargets(updateFreq, exitOnFailure)
  File "/BiO/Access/changhan/miniconda3/lib/python3.6/site-packages/pypeflow/simple_pwatcher_bridge.py", line 362, in _refreshTargets
    raise Exception(msg)
Exception: Some tasks are recently_done but not satisfied: {Node(0-rawreads/tan-split)}

During handling of the above exception, another exception occurred:
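
For reference, input.fofn in the config above is a plain file-of-filenames, one FASTA path per line, and with job_type = local the jobqueue and sge_option_* grid settings should not be submitted anywhere. A minimal sketch of the inputs and launch command, assuming the config is saved as fc_run.cfg and using hypothetical read paths:

# input.fofn -- one FASTA path per line (example paths, not real)
/data/ecoli/ecoli.1.subreads.fasta
/data/ecoli/ecoli.2.subreads.fasta

# launch FALCON from the directory containing the config
fc_run fc_run.cfg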
gbdias commented 5 years ago

Hi @changhan1110, not sure if you already figured this out, but here are some tips.

## resource usage ##

job_type = local

pwatcher_type = blocking
job_type = string
submit = bash -C ${CMD} >| ${STDOUT_FILE} 2>| ${STDERR_FILE}
NPROC = 48
njobs = 1
MB = 50000
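
Putting those tips together, a minimal sketch of the resource section for a local run, assuming falcon-kit 1.4.x / pypeflow 2.x where these keys live under a [job.defaults] section (the NPROC, MB, and njobs values are placeholders to adapt to your machine):

[job.defaults]
# run each task synchronously through the blocking process watcher,
# bypassing the fs_based heartbeat files that failed in the log above
pwatcher_type = blocking
job_type = string
submit = bash -C ${CMD} >| ${STDOUT_FILE} 2>| ${STDERR_FILE}
# placeholder resource limits for a 48-core machine with ~50 GB RAM
NPROC = 48
MB = 50000
njobs = 1

With the blocking watcher, each job is executed directly via the submit line rather than tracked through heartbeat files, which avoids the FileNotFoundError seen in the traceback.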
changhan1110 commented 5 years ago

Hi @gbdias, I had already figured it out, and it worked well. Thank you for your kindness!