ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

Reg: Colors in *.dot.gz and other queries #136

Closed harish0201 closed 5 years ago

harish0201 commented 5 years ago

Hi!

I recently finished assemblies for a couple of large genomes. While looking through the generated files, I became curious about the .dot files written at each step (frg, ctg, etc.).

Can you explain the color codes in these files? I tried converting one of them with dot, but it seems to hang.

Edit:

I have a few other questions:

In the log, there are multiple graph cleaning and rescuing steps. Is it possible to dump out those nodes/edges in a separate file or retrieve them for other analyses?

Thank you!

ruanjue commented 5 years ago
static const char *colors[2][2] = {{"blue", "green"}, {"red", "gray"}};

++ => blue
+- => green
-+ => red
-- => gray
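
For anyone post-processing the graph in a script, the table above can be mirrored as a small lookup. This is only a sketch in Python, not code from wtdbg2: the assumption that index 0 stands for '+' and index 1 for '-' is inferred from the four mappings listed, and `orientation_color` is a hypothetical helper name.

```python
# Mirror of the colors[2][2] table quoted above.
# Assumption: index 0 stands for '+' and index 1 for '-', as implied by the
# four listed mappings; the first sign selects the row, the second the column.
COLORS = (("blue", "green"), ("red", "gray"))

def orientation_color(orientation: str) -> str:
    """Return the color for a two-sign code such as '+-'."""
    if len(orientation) != 2 or any(c not in "+-" for c in orientation):
        raise ValueError(f"unexpected orientation: {orientation!r}")
    first, second = (0 if c == "+" else 1 for c in orientation)
    return COLORS[first][second]
```

With this table, `orientation_color("+-")` gives "green", matching the list above.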

I am not sure I understand your second question.

Jue

harish0201 commented 5 years ago

Thanks for the response, but I meant: what do the colors stand for?

Is blue for nodes or for edges, and so on?

For the second question:

[Thu Jun 27 16:46:31 2019] rescued 7761 low cov edges
[Thu Jun 27 16:46:33 2019] deleted 2848 binary edges
[Thu Jun 27 16:46:34 2019] deleted 94004 isolated nodes
[Thu Jun 27 16:46:41 2019] cut 347896 transitive edges
[Thu Jun 27 16:46:41 2019] output "prefix.2.dot.gz". Done.
[Thu Jun 27 16:53:02 2019] 80627 bubbles; 62872 tips; 364 yarns;
[Thu Jun 27 16:53:03 2019] deleted 110709 isolated nodes
[Thu Jun 27 16:53:03 2019] output "prefix.3.dot.gz". Done.
[Thu Jun 27 16:53:11 2019] cut 48825 branching nodes
[Thu Jun 27 16:53:11 2019] deleted 11326 isolated nodes
[Thu Jun 27 16:53:11 2019] building unitigs
[Thu Jun 27 16:53:12 2019] TOT 770460160, CNT 39074, AVG 19718, MAX 1184512, N50 44288, L50 3400, N90 8192, L90 19034, Min 512

Is it possible to know where the edges and nodes are deleted or rescued from?

ruanjue commented 5 years ago

The colors represent orientation of edges.

N12692 -> N24294 [label="+-:35:3" color=green]

It means that the forward strand of N12692 connects to the reverse strand of N24294. To continue this traversal, the next out-edge of N24294 should have an orientation of -+ or --.
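
The chaining rule described above can be checked mechanically. The following is a minimal Python sketch, not part of wtdbg2: it parses an edge line of the form shown above and tests whether a second edge can continue the path. The regular expression assumes only the layout of that one example line, the two numbers after the orientation are kept as an opaque string because their meaning is not stated in this thread, and `parse_edge` and `can_follow` are hypothetical helper names.

```python
import re
from typing import NamedTuple, Optional

# Matches lines shaped like: N12692 -> N24294 [label="+-:35:3" color=green]
EDGE_RE = re.compile(
    r'^\s*(\S+)\s*->\s*(\S+)\s*\[label="([+-])([+-]):([^"]*)"\s+color=(\w+)\]'
)

class Edge(NamedTuple):
    src: str
    dst: str
    src_dir: str   # strand on which the edge leaves src
    dst_dir: str   # strand on which the edge enters dst
    extra: str     # remaining label fields, left uninterpreted
    color: str

def parse_edge(line: str) -> Optional[Edge]:
    """Parse one edge line; return None for anything that does not match."""
    m = EDGE_RE.match(line)
    return Edge(*m.groups()) if m else None

def can_follow(first: Edge, second: Edge) -> bool:
    """True if `second` leaves the node and strand that `first` arrives on."""
    return second.src == first.dst and second.src_dir == first.dst_dir
```

For the example edge, `parse_edge` yields `dst_dir == "-"`, so only an out-edge of N24294 whose label starts with "-" (that is, -+ or --) passes `can_follow`.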

As for the second question: it is possible to record all the actions performed on the graph, but it would be complicated. If you are interested, please have a look at cut_edge_core_graph, revive_edge_graph, and del_node_graph in wtdbg-graph.h. You can insert code there to capture those actions.
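
Short of instrumenting those functions, a coarser alternative is to compare the graph snapshots that the log already reports writing (for example "prefix.2.dot.gz" and "prefix.3.dot.gz") and list what disappeared between them. This is not the approach described above, only a hedged Python sketch: it assumes edge lines look like the example earlier in this thread and that node names are stable across snapshots, and `read_edges` and `diff_snapshots` are hypothetical names. The files are read line by line, so no graph layout is computed.

```python
import gzip
import re
from typing import Set, Tuple

# Assumed edge layout: N12692 -> N24294 [label="+-:35:3" color=green]
EDGE_RE = re.compile(r'^\s*(\S+)\s*->\s*(\S+)\s*\[label="([+-]{2}):')

EdgeKey = Tuple[str, str, str]  # (source node, target node, orientation)

def read_edges(path: str) -> Set[EdgeKey]:
    """Collect (src, dst, orientation) for every edge line in a .dot.gz file."""
    edges: Set[EdgeKey] = set()
    with gzip.open(path, "rt") as fh:
        for line in fh:
            m = EDGE_RE.match(line)
            if m:
                edges.add(m.groups())
    return edges

def nodes_of(edges: Set[EdgeKey]) -> Set[str]:
    """Names of all nodes that appear as an endpoint of some edge."""
    return {name for src, dst, _ in edges for name in (src, dst)}

def diff_snapshots(before: str, after: str):
    """Return (removed edges, added edges, nodes that no longer have edges)."""
    old, new = read_edges(before), read_edges(after)
    return old - new, new - old, nodes_of(old) - nodes_of(new)
```

Because only edge lines are parsed, nodes that never appear in any edge are invisible to this comparison, and the diff cannot tell which cleaning step removed a given edge when several steps run between two snapshots.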

Jue

harish0201 commented 5 years ago

Hi!

Sorry for the late reply. I'll look into this. In the meantime, I'll close the issue.