Closed: javiercguard closed this issue 2 years ago
This is a multiple sequence alignment problem; `bsalign poa` should help you.
https://github.com/ruanjue/bsalign
I've been trying bsalign, but the consensus was longer than I wanted, so I decided to use wtdbg2 and pass it longer sequences instead. Thanks!
Hi, I'm not sure whether this is possible, but I'd like to assemble short sequences (extracted from nanopore reads, so they are noisy) into a consensus. The sequences are about 150 bp long, for example 8 sequences in the 145-175 bp range. I've set -L to 100. wtdbg2 loads the "reads", but it generates no k-mers:
Log:
```
-- Starting program: wtdbg2 -i fasta.fasta -f -o wtest/test -x ont -L 100 -g 166
-- pid 130567
-- date Thu Feb 24 18:34:51 2022
--
[Thu Feb 24 18:34:51 2022] loading reads
15 reads
[Thu Feb 24 18:34:51 2022] Done, 15 reads (>=100 bp), 2502 bp, 0 bins
** PROC_STAT(0) **: real 0.009 sec, user 0.000 sec, sys 0.000 sec, maxrss 1040.0 kB, maxvsize 86220.0 kB
[Thu Feb 24 18:34:51 2022] Set --edge-cov to 2
KEY PARAMETERS: -k 15 -p 0 -K 1000.049988 -A -S 2.000000 -s 0.050000 -g 166 -X 50.000000 -e 2 -L 100
[Thu Feb 24 18:34:51 2022] generating nodes, 4 threads
[Thu Feb 24 18:34:51 2022] indexing bins[(0,0)/0] (0/0 bp), 4 threads
[Thu Feb 24 18:34:51 2022] - scanning kmers (K15P0S2.00) from 0 bins
0 bins
********************** Kmer Frequency **********************
********************** 1 - 201 **********************
Quatiles:
10% 20% 30% 40% 50% 60% 70% 80% 90% 95%
0 0 0 0 0 0 0 0 0 0
** PROC_STAT(0) **: real 0.009 sec, user 0.000 sec, sys 0.000 sec, maxrss 1040.0 kB, maxvsize 86220.0 kB
[Thu Feb 24 18:34:51 2022] - high frequency kmer depth is set to 65535
[Thu Feb 24 18:34:51 2022] - Total kmers = 0
[Thu Feb 24 18:34:51 2022] - average kmer depth = 0
[Thu Feb 24 18:34:51 2022] - 0 low frequency kmers (<2)
[Thu Feb 24 18:34:51 2022] - 0 high frequency kmers (>65535)
[Thu Feb 24 18:34:51 2022] - indexing 0 kmers, 0 instances (at most)
0 bins
[Thu Feb 24 18:34:51 2022] - indexed 0 kmers, 0 instances
[Thu Feb 24 18:34:51 2022] - masked 0 bins as closed
[Thu Feb 24 18:34:51 2022] - sorting
** PROC_STAT(0) **: real 0.009 sec, user 0.000 sec, sys 0.000 sec, maxrss 1040.0 kB, maxvsize 86220.0 kB
[Thu Feb 24 18:34:51 2022] Done
0 reads|total hits 0
** PROC_STAT(0) **: real 0.009 sec, user 0.000 sec, sys 0.000 sec, maxrss 1040.0 kB, maxvsize 86220.0 kB
[Thu Feb 24 18:34:51 2022] sorting rdhits ... Done
[Thu Feb 24 18:34:51 2022] clipping ... -nan% bases
[Thu Feb 24 18:34:51 2022] generating regs ... 0
[Thu Feb 24 18:34:51 2022] sorting regs ... Done
[Thu Feb 24 18:34:51 2022] generating intervals ... 0 intervals
[Thu Feb 24 18:34:51 2022] selecting important intervals from 0 intervals
[Thu Feb 24 18:34:51 2022] Intervals: kept 0, discarded 0
** PROC_STAT(0) **: real 0.009 sec, user 0.000 sec, sys 0.000 sec, maxrss 1040.0 kB, maxvsize 86220.0 kB
[Thu Feb 24 18:34:51 2022] Done, 0 nodes
[Thu Feb 24 18:34:51 2022] output "wtest/test.1.nodes". Done.
[Thu Feb 24 18:34:51 2022] median node depth = 0
[Thu Feb 24 18:34:51 2022] masked 0 high coverage nodes (>200 or <2)
[Thu Feb 24 18:34:51 2022] masked 0 repeat-like nodes by local subgraph analysis
[Thu Feb 24 18:34:51 2022] generating edges
[Thu Feb 24 18:34:51 2022] Done, 1 edges
[Thu Feb 24 18:34:51 2022] output "wtest/test.1.reads". Done.
[Thu Feb 24 18:34:51 2022] output "wtest/test.1.dot.gz". Done.
[Thu Feb 24 18:34:51 2022] graph clean
[Thu Feb 24 18:34:51 2022] rescued 0 low cov edges
[Thu Feb 24 18:34:51 2022] deleted 0 binary edges
[Thu Feb 24 18:34:51 2022] deleted 0 isolated nodes
[Thu Feb 24 18:34:51 2022] cut 0 transitive edges
[Thu Feb 24 18:34:51 2022] output "wtest/test.2.dot.gz". Done.
[Thu Feb 24 18:34:51 2022] deleted 0 isolated nodes
[Thu Feb 24 18:34:51 2022] output "wtest/test.3.dot.gz". Done.
[Thu Feb 24 18:34:51 2022] cut 0 branching nodes
[Thu Feb 24 18:34:51 2022] deleted 0 isolated nodes
[Thu Feb 24 18:34:51 2022] building unitigs
[Thu Feb 24 18:34:51 2022]
[Thu Feb 24 18:34:51 2022] output "wtest/test.frg.nodes". Done.
[Thu Feb 24 18:34:51 2022] generating links
[Thu Feb 24 18:34:51 2022] generated 1 links
[Thu Feb 24 18:34:51 2022] output "wtest/test.frg.dot.gz". Done.
[Thu Feb 24 18:34:51 2022] rescue 0 weak links
[Thu Feb 24 18:34:51 2022] deleted 0 binary links
[Thu Feb 24 18:34:51 2022] cut 0 transitive links
[Thu Feb 24 18:34:51 2022] remove 0 boomerangs
[Thu Feb 24 18:34:51 2022] remove 0 weak branches
[Thu Feb 24 18:34:51 2022] cut 0 tips
[Thu Feb 24 18:34:51 2022] pop 0 bubbles
[Thu Feb 24 18:34:51 2022] detached 0 repeat-associated paths
[Thu Feb 24 18:34:51 2022] cut 0 tips
[Thu Feb 24 18:34:51 2022] output "wtest/test.ctg.dot.gz". Done.
[Thu Feb 24 18:34:51 2022] building contigs
[Thu Feb 24 18:34:51 2022] searched 0 contigs
[Thu Feb 24 18:34:51 2022] Estimated:
[Thu Feb 24 18:34:51 2022] output 0 contigs
[Thu Feb 24 18:34:51 2022] Program Done
** PROC_STAT(TOTAL) **: real 0.109 sec, user 0.050 sec, sys 0.050 sec, maxrss 43888.0 kB, maxvsize 422720.0 kB
---
```

I've tried using low values for -K, -e, omitting -g, to no avail. Is it possible to generate a small assembly on purpose?
Thanks!
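A possible explanation for the `0 bins` line in the log, offered as a sketch rather than a confirmed diagnosis: as far as I understand, wtdbg2 indexes reads in fixed-size bins of 256 bp, and a read shorter than one bin would then contribute nothing to the k-mer index. The Python snippet below only illustrates that arithmetic. `BIN_SIZE`, `whole_bins` and the example read lengths are hypothetical names and values chosen for illustration; they are not taken from wtdbg2's code or from the actual input file.

```python
# Assumption (not verified against the wtdbg2 source): each read is split
# into whole, non-overlapping 256 bp bins, and any remainder shorter than
# a bin is ignored.
BIN_SIZE = 256


def whole_bins(read_length, bin_size=BIN_SIZE):
    """Return how many complete bins fit in a read of the given length."""
    return read_length // bin_size


# Hypothetical read lengths in the 145-175 bp range described in the question.
short_reads = [145, 150, 158, 162, 166, 170, 175, 175]

# Hypothetical longer reads, for comparison.
long_reads = [600, 1024, 2500]

print(sum(whole_bins(n) for n in short_reads))  # 0
print(sum(whole_bins(n) for n in long_reads))   # 15
```

If that assumption holds, no choice of -K, -e or -g would change the result for ~150 bp inputs, which would be consistent with the follow-up in this thread where passing longer sequences to wtdbg2 was the workaround.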