lmc297 / bactaxR

Bacterial taxonomy construction and evaluation in R

What do the lengths of the edges in the weighted undirected graphs mean? #6

Closed by kehletdelgado 11 months ago

kehletdelgado commented 12 months ago

Does it represent the ANI of that pair? Or does edge weight represent the ANI?

Thank you!

lmc297 commented 11 months ago

Hello! I'm assuming you're referring to the edges in the graph produced by ANI.graph? If so, the edges simply connect two nodes (genomes) that share an ANI value at or above your threshold of choice. You can set this using ANI_threshold. By default, ANI_threshold = 95, so two genomes with ANI >= 95 will be connected by an edge, and two genomes with ANI < 95 will not be connected by an edge.
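To make the thresholding rule concrete, here is a minimal sketch in plain Python. It is not bactaxR's code and does not call ANI.graph; the genome names, the ANI values, and the helper `threshold_edges` are all hypothetical. It only illustrates the rule described above: a pair gets an edge when its ANI is at or above the threshold, and the edge itself records nothing more than that.

```python
# Hypothetical pairwise ANI values (percent); genome names are made up.
ani = {
    ("genome_A", "genome_B"): 98.7,
    ("genome_A", "genome_C"): 95.0,
    ("genome_B", "genome_C"): 94.9,
    ("genome_A", "genome_D"): 80.2,
    ("genome_B", "genome_D"): 79.8,
    ("genome_C", "genome_D"): 81.1,
}


def threshold_edges(ani_values, ani_threshold=95.0):
    """Return sorted node pairs whose ANI is >= ani_threshold.

    The result is a plain list of pairs: an edge is either present or
    absent, and the ANI value is not carried along with it.
    """
    return sorted(
        pair for pair, value in ani_values.items() if value >= ani_threshold
    )


edges = threshold_edges(ani)
print(edges)
```

With these made-up values, genome_A is linked to genome_B and genome_C, genome_B and genome_C are not linked to each other (94.9 falls just below 95), and genome_D stays unconnected.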

kehletdelgado commented 11 months ago

wgs_fix_95_composite.pdf

Thank you for the reply! I have found your R package very useful. Is there any meaning to the lengths of the edges? Why are some edges longer (and their nodes further apart) than others in the attached graph?

lmc297 commented 11 months ago

I'm so glad to hear that you're finding the package useful! The edge lengths don't really have any biological meaning; the layout is just a way to visualize pairwise ANI values in two dimensions that looks "nice" (it uses the graphopt layout algorithm: https://igraph.org/r/doc/layout_with_graphopt.html).
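A small sketch of why drawn edge length is a property of the layout and not of the data. This uses plain Python and a simple circular layout as a stand-in; it is not graphopt and not what bactaxR runs, and the node names and the helpers `circle_layout` and `edge_lengths` are hypothetical. The same graph is drawn twice with the nodes placed in a different order, and the drawn length of one edge changes even though nothing about the genomes has changed.

```python
import math

# Hypothetical graph: both pairs passed the ANI threshold; node D has no edges.
edges = [("A", "B"), ("B", "C")]


def circle_layout(order):
    """Place nodes evenly on a unit circle in the given order."""
    n = len(order)
    return {
        name: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, name in enumerate(order)
    }


def edge_lengths(layout, edge_list):
    """Euclidean length of each edge as drawn in this layout."""
    return {
        (u, v): round(math.dist(layout[u], layout[v]), 3) for u, v in edge_list
    }


# Same nodes and edges, two different node placements.
lengths_1 = edge_lengths(circle_layout(["A", "B", "C", "D"]), edges)
lengths_2 = edge_lengths(circle_layout(["A", "C", "B", "D"]), edges)

print(lengths_1)
print(lengths_2)
```

The A-B edge is drawn at a different length in the two placements, while the set of edges is identical. A force-directed layout such as graphopt chooses positions differently, but the point is the same: distances on the page come from the drawing procedure.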

kehletdelgado commented 11 months ago

That makes sense. Thank you for the information!