Closed chklopp closed 5 years ago
There are three files named <prefix>.1/2/3.dot.gz. 1 is the initial graph, 2 is the graph after transitive edges have been reduced (clean enough to view), and 3 is the final graph. There are two scripts: scripts/dbm_index_dot.pl and scripts/dbm_read_dot.pl. <prefix>.frg.nodes gives the node names of the unitigs. At the end of <prefix>.events, you will see how contigs are built from unitigs (named F\d+).
In short: contig -> unitigs -> nodes -> dot graph.
pgzf -fd dbg.*.dot.gz
dbm_index_dot.pl dbg.3.dot
head -1 dbg.frg.nodes # suppose the end node is 'N111'
dbm_read_dot.pl -l 20 dbg.3.dot N111 >1.dot && dot -Tpdf -O 1.dot # see 1.dot.pdf
Jue
Thank you. I succeeded in drawing the contig node graph, but it does not really answer my question. How do I know why the last nodes do not find a (or the next) neighbor?
The events file is cryptic:
F18[-:1] -> F48[+:0] = 11008, cov=1
F18[-:1] -> F48[+:0] = 11264, cov=1
F18[-:1] -> F48[+:0] = 11264, cov=1
F18[-:0] -> F48[+:0] = 24576, cov=1
F18[-:1] -> F48[+:0] = 24832, cov=1
F35[-:0] -> F55[+:0] = 11008, cov=1
F35[-:0] -> F55[+:0] = 11264, cov=1
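The link records quoted here have a regular shape, so one way to make them less cryptic is to group them by unitig pair. The following is only a rough sketch: the field names (src_dir, src_idx, value and so on) are my own labels, since the thread does not say what the bracketed fields or the number after "=" mean, and the helper functions are hypothetical, not part of wtdbg2.

```python
import re
from collections import defaultdict

# Matches one link record such as "F18[-:1] -> F48[+:0] = 11008, cov=1".
# Group names are illustrative guesses; the thread does not define the fields.
LINK_RE = re.compile(
    r"(?P<src>F\d+)\[(?P<src_dir>[+-]):(?P<src_idx>\d+)\]\s*->\s*"
    r"(?P<dst>F\d+)\[(?P<dst_dir>[+-]):(?P<dst_idx>\d+)\]\s*=\s*"
    r"(?P<value>\d+),\s*cov=(?P<cov>\d+)"
)


def parse_links(text):
    """Return one dict per link record found anywhere in text."""
    records = []
    for match in LINK_RE.finditer(text):
        rec = match.groupdict()
        for key in ("src_idx", "dst_idx", "value", "cov"):
            rec[key] = int(rec[key])
        records.append(rec)
    return records


def summarize_links(records):
    """Group records by (src, dst): record count, summed cov, list of values."""
    summary = defaultdict(lambda: {"count": 0, "cov": 0, "values": []})
    for rec in records:
        entry = summary[(rec["src"], rec["dst"])]
        entry["count"] += 1
        entry["cov"] += rec["cov"]
        entry["values"].append(rec["value"])
    return dict(summary)


# Sample taken from the records quoted in this thread.
sample = (
    "F18[-:1] -> F48[+:0] = 11008, cov=1 "
    "F18[-:1] -> F48[+:0] = 11264, cov=1 "
    "F35[-:0] -> F55[+:0] = 11008, cov=1"
)

for pair, info in summarize_links(parse_links(sample)).items():
    print(pair, info)
```

Because the parser uses finditer, it works whether the records sit one per line or run together on a single line. A unitig that never appears as src in any record might be a place to start looking at why a contig ends there, but that is a guess on my part.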
Please pay attention to the lines at the end of the file.
ctg0 F0 - 0
ctg1 F1 - 0
OUTPUT_CTG ctg0 -> ctg1 nodes=4958 len=5219328
OUTPUT_CTG ctg1 -> ctg2 nodes=101 len=148992
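To follow the contig -> unitigs step programmatically, the trailing lines can be collected into a small table. This is a minimal sketch that assumes the two line shapes shown here are the only ones of interest. Reading the four columns as contig, unitig, orientation and offset is my assumption, and parse_tail is a hypothetical helper, not a wtdbg2 script.

```python
import re

# "ctg0 F0 - 0": assumed to be contig, unitig, orientation, offset.
PLACEMENT_RE = re.compile(r"^(ctg\d+)\s+(F\d+)\s+([+-])\s+(\d+)$")
# "OUTPUT_CTG ctg0 -> ctg1 nodes=4958 len=5219328"
OUTPUT_RE = re.compile(
    r"^OUTPUT_CTG\s+(ctg\d+)\s*->\s*(ctg\d+)\s+nodes=(\d+)\s+len=(\d+)$"
)


def parse_tail(lines):
    """Collect unitig placements per contig and the OUTPUT_CTG summaries."""
    placements = {}  # contig name -> list of (unitig, orientation, offset)
    outputs = []     # (left name, right name, nodes, len) as printed
    for raw in lines:
        line = raw.strip()
        match = PLACEMENT_RE.match(line)
        if match:
            ctg, unitig, orient, offset = match.groups()
            placements.setdefault(ctg, []).append((unitig, orient, int(offset)))
            continue
        match = OUTPUT_RE.match(line)
        if match:
            left, right, nodes, length = match.groups()
            outputs.append((left, right, int(nodes), int(length)))
    return placements, outputs


# Lines copied from the end of the events file quoted in this thread.
tail = [
    "ctg0 F0 - 0",
    "ctg1 F1 - 0",
    "OUTPUT_CTG ctg0 -> ctg1 nodes=4958 len=5219328",
    "OUTPUT_CTG ctg1 -> ctg2 nodes=101 len=148992",
]

placements, outputs = parse_tail(tail)
print(placements)
print(outputs)
```

On a real run you would pass the lines of <prefix>.events instead of the tail list. Lines that match neither pattern are skipped.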
Is there a way to know why the assembler decided to end a contig? There are several possible reasons:
Can this information be retrieved from wtdbg2 files?