Closed bitcometz closed 6 years ago
This is caused by wtpoa-cns failing to build a consensus sequence in low-coverage regions. I checked one contig shorter than 5 kb: its original layout in prefix.ctg.lay is fine, but the consensus step in wtpoa-cns failed.
I am revising wtpoa-cns so that it generates a consensus even when the read coverage is below 3.
Jue
Thanks !!!
Hello,
I am using Wtdbg2.5 to assemble a worm genome of about 130 Mb with ~1% heterozygosity.
I tried several times and hit the same problem with --ctg-min-length 5000: I can still find some short contigs (about 400-1000 bp) in prefix.raw.fa and prefix.cns.fa.
I also checked prefix.ctg.lay.gz, and no contig there is shorter than 5000 bp, so I think the cause is the wtpoa-cns step.
Can you help me solve it? Thank you!
Best, Bo
Yes. A contig whose layout length is >= 5000 bp may still come out shorter than 5 kb after wtpoa-cns. Reasons: 1. the contig length in the layout is only an estimate and can differ from the final sequence length; 2. some layouts fail to call a consensus over their full length. Please filter out the shorter contigs after wtpoa-cns.
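The filtering step suggested above could look roughly like the sketch below. It is a minimal, standard-library-only example and not part of wtdbg2; the function names `read_fasta` and `filter_contigs` and the output file name in the usage comment are assumptions made for illustration, and the 5000 bp default simply mirrors the --ctg-min-length value discussed in this thread.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(in_handle, out_handle, min_len=5000, width=60):
    """Write only contigs with at least min_len bases.

    Returns (kept, dropped) so the caller can see how many were removed.
    """
    kept = dropped = 0
    for header, seq in read_fasta(in_handle):
        if len(seq) < min_len:
            dropped += 1
            continue
        kept += 1
        out_handle.write(f">{header}\n")
        # Wrap the sequence at a fixed line width.
        for i in range(0, len(seq), width):
            out_handle.write(seq[i:i + width] + "\n")
    return kept, dropped


# Hypothetical usage on the consensus output mentioned in this thread:
# with open("prefix.cns.fa") as fin, open("prefix.cns.min5k.fa", "w") as fout:
#     kept, dropped = filter_contigs(fin, fout, min_len=5000)
#     print(f"kept {kept} contigs, dropped {dropped}")
```

Filtering on the final sequence length rather than the layout length means the cutoff is applied to what wtpoa-cns actually produced.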
Thank you, I will try to do it the way you said.
Another question: when I use SMARTdenovo to assemble different read coverages (30x, 50x, 80x, 100x and all reads), I get assemblies of 123 Mb, 139 Mb, 158 Mb, 170 Mb and 181 Mb respectively.
By comparison, when I use Wtdbg2 to assemble 30x, 50x, 80x, 100x and all reads, I get 133 Mb, 136 Mb, 136 Mb, 136 Mb and 141 Mb.
Taking the heterozygosity of the genome into account, I think this genome might be bigger than the expected size, so which dataset and which software would you prefer?
PS: I tried Canu, Falcon, Flye, Mecat2 ..., and only your software gives the best result, at least in terms of genome size and N50.
BTW, SMARTdenovo usually assembles 400-800 contigs, but Wtdbg2 assembles 1700-2200 contigs, of which the N90-N100 range contains about 1000 contigs. These small contigs cause some difficulty in Hi-C scaffolding, so can I simply trust SMARTdenovo to get longer contigs, or should I use Wtdbg2 and filter out the small contigs?
Sorry for bothering :)
Best, Bo
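For comparing the assemblies mentioned above, statistics such as N50, N90 and the number of contigs in the N90-N100 tail can be computed from the contig lengths alone. The sketch below is only an illustration; the function name `nx` and the example lengths are made up and do not come from either assembler.

```python
def nx(lengths, x=50):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the shortest contig in the smallest set of longest
    contigs that together cover at least x% of the total assembly size.
    Lx is the number of contigs in that set.
    """
    ordered = sorted(lengths, reverse=True)
    target = sum(ordered) * x / 100
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    return 0, 0  # empty input


# Made-up contig lengths, for illustration only.
example = [10, 8, 6, 4, 2]
n50, l50 = nx(example, 50)
n90, l90 = nx(example, 90)
# Contigs beyond the N90 point, i.e. the N90-N100 tail.
tail_contigs = len(example) - l90
```

Running this on the contig lengths before and after filtering shows how much of the total assembly size the small contigs actually account for.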
Hello, in each of my projects I always find one single sequence in the assembly that is shorter than the minimum contig length (5k), such as 1.7k or 4k.
--ctg-min-length
Min length of contigs to be output, 5000
Thanks