ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

wtpoa-cns polishing step #98

Closed crysclitheroe closed 5 years ago

crysclitheroe commented 5 years ago

Dear @ruanjue !

Thank you for a great tool! I am trying to run the recommended polishing step with short-read data, and I'm wondering whether the polishing step in the README could be clarified.

When I ran samtools as recommended and wtpoa-cns with the --x and --d parameters, the programs quit and suggested looking at the options.

I got them both to run by removing those options and trying:

samtools sort -T tmp_polish -o sr.srt.bam sr.bam
samtools view sr.srt.bam | ./wtpoa-cns -t 8 -i prefix.ctg.fa -C 1 -fo prefix_polish.ctg.fa

But then nothing is in prefix_polish.ctg.fa.

Can you help with this?

Thanks and kind regards Crystal
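
A quick way to confirm the symptom described above is to check whether the output FASTA contains any records. The sketch below assumes a POSIX shell; `fasta_summary` is a hypothetical helper written for illustration and is not part of wtdbg2.

```shell
# Hypothetical helper (not part of wtdbg2): report how many FASTA records a file holds.
fasta_summary() {
    f="$1"
    # -s is true only if the file exists and is larger than zero bytes.
    if [ ! -s "$f" ]; then
        echo "$f: empty or missing"
        return 1
    fi
    # Every FASTA record starts with a '>' header line; grep -c counts them.
    n=$(grep -c '^>' "$f" || true)
    echo "$f: $n sequences"
}

# File name taken from the command above; prints "empty or missing" if nothing was written.
fasta_summary prefix_polish.ctg.fa || true
```

An empty file here only confirms that no consensus was written; it does not say why.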

ruanjue commented 5 years ago

Please use -x and -d instead of --x and --d.

crysclitheroe commented 5 years ago

Dear ruanjue,

My apologies, that is what I meant: -x and -d. I am still getting the error:

/work/BourguignonU/crystal/softwares/wtdbg2/wtpoa-cns: invalid option -- 'x'

I am missing something here. In the command

./wtpoa-cns -t 16 -x sam-sr -d prefix.ctg.fa -i - -fo prefix.ctg.3rd.fa

what are sam-sr and -d, and why does the input flag come after the input in this case? Also, what does the solitary "-" do?

Thanks and kind regards Crystal

ruanjue commented 5 years ago

To avoid discussing different wtdbg2 versions, please download the latest commit "https://github.com/ruanjue/wtdbg2/commit/3549831acf94382379e7855b8c7185b5643347b1".

-x sam-sr selects a preset, and -d prefix.ctg.fa gives the reference sequences. -i - means -i STDIN; - is often used for STDIN.

Jue
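
The "-" convention mentioned above can be tried with any standard tool that follows it. The sketch below uses `cat` to show it, then repeats the polishing command from this thread as a comment. The commented pipeline is assembled from commands quoted in this thread, is not run here, and assumes an up-to-date wtpoa-cns and a sorted `sr.srt.bam`.

```shell
# "-" in place of a file name conventionally means "read standard input".
# cat follows the same convention, so it echoes whatever is piped into it:
printf 'r1\nr2\n' | cat -

# The same pattern in the polishing step (shown as a comment, not executed):
#   samtools view sr.srt.bam | ./wtpoa-cns -t 16 -x sam-sr -d prefix.ctg.fa -i - -fo prefix.ctg.3rd.fa
# Here samtools writes SAM records to the pipe, and "-i -" tells wtpoa-cns
# to read them from standard input instead of from a named file.
```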

crysclitheroe commented 5 years ago

Thanks for the explanation. I realize my version is out of date, so I will update. But I am still confused about what "sam-sr" is: a file produced by a previous step, or a parameter argument?

ruanjue commented 5 years ago

$> wtpoa-cns -h
....
 -x <string> Presets, []
             sam-sr: polishs contigs from short reads mapping, accepts sorted SAM files
                     shorted for '-w 200 -j 150 -R 0 -b 1 -c 1 -N 50 -rS 2'

crysclitheroe commented 5 years ago

Thanks for freely sharing incredible software! I just finished my assemblies and they look amazing, even though you said your polishing tool is not as accurate. What other polishing tools would you suggest for higher accuracy?

ruanjue commented 5 years ago

Good polishers include (but are not limited to) arrow/quiver, pilon, and nanopolish. I will find more time to improve wtpoa-cns, but I am not sure it will become as good as them.

mooym001 commented 3 years ago

I was wondering: has the -d option of wtpoa-cns been removed? In the manual I read that polishing can be performed with

samtools view -F0x900 dbg.bam | ./wtpoa-cns -t 16 -d dbg.raw.fa -i - -fo dbg.cns.fa

but using the -d option gives an error for me, and the wtpoa-cns -h page only shows:

WTPOA-CNS: Consensuser for wtdbg using PO-MSA
Author: Jue Ruan <ruanjue@gmail.com>
Version: 1.1
Usage: wtpoa-cns [options]
Options:
 -t Number of threads, [4]
 -i Input file(s) .ctg.lay from wtdbg, +, [STDIN]
 -o Output files, [STDOUT]
 -f Force overwrite
 -j Expected max length of node, or say the overlap length of two adjacent units in layout file, [1500] bp
 -M Match score, [2]
 -X Mismatch score, [-5]
 -I Insertion score, [-2]
 -D Deletion score, [-4]
 -B Bandwidth, [96]
 -W Window size in the middle of the first read for fast align remaining reads, [200]
    If $W is negative, will disable fast align, but use the abs($W) as Band align score cutoff
 -w Min size of aligned size in window, [$W 0.5]
 -A Abort TriPOA when any read cannot be fast aligned, then try POA
 -R Realignment bandwidth, 0: disable, [16]
 -C Min count of bases to call a consensus base, [3]
 -F Min frequency of non-gap bases to call a consensus base, [0.5]
 -N Max number of reads in PO-MSA [20]
    Keep in mind that I am not going to generate high accurate consensus sequences here
 -v Verbose

ruanjue commented 3 years ago

Maybe you are invoking a very old version of wtpoa-cns (wrong PATH)?

WTPOA-CNS: Consensuser for wtdbg using PO-MSA
Author: Jue Ruan <ruanjue@gmail.com>
Version: 2.5 (20190621)
Usage: wtpoa-cns [options]
Options:
 -t <int>    Number of threads, [4]
 -d <string> Reference sequences for SAM input, will invoke sorted-SAM input mode
 -u <int>    XORed flags to handle SAM input. [0]
             0x1: Only process reference regions present in/between SAM alignments
             0x2: Don't fileter secondary/supplementary SAM records with flag (0x100 | 0x800)
...
...
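
To check the PATH question raised above, it can help to see which binary the shell actually resolves and which version it reports. This is a minimal sketch, assuming a POSIX shell and that the help text contains a line starting with "Version:", as in both help outputs quoted in this thread. `check_tool` is a hypothetical helper, not part of wtdbg2.

```shell
# Hypothetical helper (not part of wtdbg2): show which executable a name
# resolves to on PATH and the "Version:" line of its -h output.
check_tool() {
    tool="$1"
    path=$(command -v "$tool") || { echo "$tool: not found on PATH"; return 1; }
    echo "resolved to: $path"
    # Help text may go to stdout or stderr, so merge both before searching.
    "$tool" -h 2>&1 | grep -m1 '^Version:' || echo "no Version: line in help output"
}

check_tool wtpoa-cns || true
```

If the reported version is 1.1 rather than 2.5, an older copy is probably earlier on PATH than the freshly built one. Calling the new binary by its full path avoids the ambiguity.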