PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

The difference between unzip.py and the output shell command in 0-phasing/blasr #87

Closed jiabeiphy closed 7 years ago

jiabeiphy commented 7 years ago

Hi, I have a question about using falcon_unzip-0.4.0.

There are some differences between falcon_unzip.unzip.task_run_blasr and the corresponding generated shell script, aln_{ctg_id}.sh.

At line 79 of falcon_unzip.unzip.task_run_blasr, I see:

    script = """\
    set -vex
    trap 'touch {job_done}.exit' EXIT
    cd {wd}
    hostname
    date
    cd {wd}
    time {blasr} {read_fasta} {ref_fasta} --noSplitSubreads --clipping subread\
     --hitPolicy randombest --randomSeed 42 --bestn 1 --minPctIdentity 70.0\
     --minMatch 12 --nproc 24 --bam --out tmp_aln.bam
    #{samtools} view -bS tmp_aln.sam | {samtools} sort - {ctg_id}_sorted
    {samtools} sort tmp_aln.bam -o {ctg_id}_sorted.bam
    {samtools} index {ctg_id}_sorted.bam
    rm tmp_aln.bam
    date
    touch {job_done}
    """.format(**locals())

But in aln_{ctg_id}.sh I can't find the corresponding command. Instead, it shows:

    time blasr 000188F_reads.fa 000188F_ref.fa -noSplitSubreads -clipping subread -hitPolicy randombest -randomSeed 42 -bestn 1 -minPctIdentity 70.0 -minMatch 12 -nproc 24 -sam -out tmp_aln.sam

Could you tell me the reason? Thank you.
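To make the mismatch concrete, the blasr line from the template quoted above can be rendered by hand and compared token by token with the line found in the generated script. This is only a sketch: the template string is copied from the snippet in this comment, the values substituted for {blasr}, {read_fasta} and {ref_fasta} are assumptions taken from the observed command, and token_differences is a hypothetical helper, not part of falcon_unzip.

```python
import shlex

# blasr line as quoted from task_run_blasr above (placeholders kept).
TEMPLATE = (
    "{blasr} {read_fasta} {ref_fasta} --noSplitSubreads --clipping subread"
    " --hitPolicy randombest --randomSeed 42 --bestn 1 --minPctIdentity 70.0"
    " --minMatch 12 --nproc 24 --bam --out tmp_aln.bam"
)

# blasr line actually found in the generated shell script.
OBSERVED = (
    "blasr 000188F_reads.fa 000188F_ref.fa -noSplitSubreads -clipping subread"
    " -hitPolicy randombest -randomSeed 42 -bestn 1 -minPctIdentity 70.0"
    " -minMatch 12 -nproc 24 -sam -out tmp_aln.sam"
)


def token_differences(expected, observed):
    """Return (expected_token, observed_token) pairs that differ by position."""
    exp, obs = shlex.split(expected), shlex.split(observed)
    return [(e, o) for e, o in zip(exp, obs) if e != o]


# Assumed substitutions, chosen to match the observed command.
rendered = TEMPLATE.format(
    blasr="blasr",
    read_fasta="000188F_reads.fa",
    ref_fasta="000188F_ref.fa",
)

diffs = token_differences(rendered, OBSERVED)
for expected_token, observed_token in diffs:
    print(f"{expected_token!r} != {observed_token!r}")
```

Every option differs in its dash style, and the output format and file name differ as well (--bam / tmp_aln.bam versus -sam / tmp_aln.sam), so the observed line cannot have come from this template as quoted.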

pb-cdunn commented 7 years ago

I'm pretty sure you are not looking at the code that generated your .sh file. If you are running with recent pypeFLOW, you can repeat the do_task.py call via task.sh (which reads task.json). From there, you can step into the code which actually generates the shell script and debug. Please re-open if you learn of a bug.
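Before stepping through task.sh, a quick way to check whether the source file being read is the one Python actually imports is to ask the import system where it would load the module from. The following is a minimal sketch, assuming it is run with the same interpreter and environment as the workflow; module_origin is a hypothetical helper name, not part of falcon_unzip or pypeFLOW.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a spec is malformed.
        return None
    return spec.origin if spec is not None else None


# Prints the path of the installed falcon_unzip.unzip, or None if the
# package is not importable from this environment.
print(module_origin("falcon_unzip.unzip"))
```

If the printed path differs from the file that was inspected, the generated .sh most likely came from a different installed version.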