lch14forever / microbiomeViz

Visualize microbiome data with black magic ggtree

clade.anno() doesn't annotate correctly #6

Closed zhuchcn closed 6 years ago

zhuchcn commented 6 years ago

Hi, thanks for making this package! And the new version is much better! I still have two issues using the clade.anno function. I customized it myself, but maybe you should consider fixing them in the future.

The first issue is that when I try to annotate the tree with a data frame, the clade.anno function does not label the nodes correctly. The reason is in visualizer.R, line 29: the node_ids are not in the same order as the nodes in the anno.data passed in. So I just sorted them and it worked out fine.

And also, the layout of the annotation text on the side is very awkward. If I increase the anno.y, part of the text will disappear because of the nature of geom_text (I think).

Plus, it would also be cool to add legends for the colors.

lch14forever commented 6 years ago

Thanks for the feedback. I will take a look and fix this.

lch14forever commented 6 years ago

Hi,

There doesn't seem to be a problem for me. Do you mind posting your fixed lines here?

zhuchcn commented 6 years ago

See below:

get_angle <- function(node){...}
anno.data = arrange(anno.data, node)       # newly added: sort annotations by node label
hilight.color <- anno.data$color
node_list <- anno.data$node
node_ids <- (gtree$data %>%
             filter(label %in% node_list) %>%
             arrange(label)                # newly added: sort tree rows by label to match
             )$node
anno <- rep('white', nrow(gtree$data))

I just added the 2 lines marked above and then it worked. Maybe there is something particular in my data, but it's probably good to make sure the order is correct anyway.

lch14forever commented 6 years ago

I am just guessing. Is this because the node names were parsed as factors in anno.data? Will it work if you create the data frame with stringsAsFactors = FALSE? For example,

anno.data <- data.frame(node=c("g__Roseburia", "c__Clostridia", "s__Bacteroides_ovatus"),
                       color='red', stringsAsFactors = FALSE)

zhuchcn commented 6 years ago

I actually do have stringsAsFactors = FALSE, and I really don't think that's the reason. When you filter gtree$data, it only searches through gtree$data$label and finds the labels that are also present in node_list, but it doesn't guarantee they come back in the same order as node_list. If you compare filter(gtree$data, label %in% node_list)$label to node_list, they aren't the same.
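
To make the ordering point concrete, here is a minimal sketch in base R with made-up data. The tree.data frame is a hypothetical stand-in for gtree$data (the real object has more columns), and the node ids and colors are invented for illustration. It shows that %in% keeps rows in tree order, and that match() is one possible alternative to sorting both sides, since it looks up each annotation in the order given.

```r
# Hypothetical stand-in for gtree$data: labels in tree order with their node ids.
tree.data <- data.frame(
  node  = 1:5,
  label = c("c__Clostridia", "g__Roseburia", "p__Firmicutes",
            "s__Bacteroides_ovatus", "k__Bacteria"),
  stringsAsFactors = FALSE
)

# Annotations in the order the user supplied them (not tree order).
anno.data <- data.frame(
  node  = c("s__Bacteroides_ovatus", "c__Clostridia", "g__Roseburia"),
  color = c("red", "blue", "green"),
  stringsAsFactors = FALSE
)
node_list <- anno.data$node

# Filtering with %in% keeps the rows in tree order, not in node_list order.
filtered <- tree.data[tree.data$label %in% node_list, ]
order_is_preserved <- identical(filtered$label, node_list)

# match() returns, for each entry of node_list, its row in tree.data,
# so node_ids lines up element by element with anno.data$color.
node_ids <- tree.data$node[match(node_list, tree.data$label)]
```

With this data, order_is_preserved is FALSE, while node_ids pairs each node id with the color on the same row of anno.data. Sorting both anno.data and the filtered tree rows by label, as in the two added lines, gives the same pairing as long as every label is unique and present in the tree.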

lch14forever commented 6 years ago

Ok. I will correct it as you suggested. Thanks for the feedback again.

zhuchcn commented 6 years ago

This is a very nice package! I'm glad to help! I just have one more suggestion. I have a very long list to annotate, so I tried but failed to move the annotations (with level <= 3) to the right side of the plot. The left side of the plot also ends up with an ugly big blank space. I found that using legends instead of geom_text() is a lot more flexible, because the legends are plotted in the margin area, and you can then use theme to adjust the margins. You can cut off the left margin but keep the right margin big enough to fit the annotations.
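
A rough sketch of the general idea in plain ggplot2, not the actual implementation: the anno.legend data frame, its short keys, and the point layer standing in for the highlighted clades are all assumptions made up for illustration. The annotation text lives in a manual fill legend on the right, and theme() adjusts the plot margins.

```r
library(ggplot2)

# Hypothetical annotation table: each row is one highlighted clade.
# x and y are made-up positions; in a real tree plot they would come from the layout.
anno.legend <- data.frame(
  label = c("a: c__Clostridia", "b: g__Roseburia", "c: s__Bacteroides_ovatus"),
  color = c("red", "blue", "green"),
  x     = c(1, 2, 3),
  y     = c(1, 2, 1),
  stringsAsFactors = FALSE
)

p <- ggplot(anno.legend, aes(x = x, y = y, fill = label)) +
  # Filled points stand in for the highlighted clades.
  geom_point(shape = 21, size = 5) +
  # Map each legend entry to its annotation color; no legend title.
  scale_fill_manual(name = NULL,
                    values = setNames(anno.legend$color, anno.legend$label)) +
  theme_void() +
  theme(
    # The legend is drawn outside the panel, so long labels are not clipped.
    legend.position = "right",
    # Trim the left margin and leave extra room on the right (values are arbitrary).
    plot.margin = margin(t = 5, r = 40, b = 5, l = -20, unit = "pt")
  )
```

Because the legend sits outside the panel, widening the right margin does not shift or hide the annotation text the way moving geom_text() labels can.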

lch14forever commented 6 years ago

That is exactly what I wanted to do at the beginning. You might want to open a pull request if you already have the implementation. Also feel free to add yourself to the author list :-).

zhuchcn commented 6 years ago

Great idea! I do have the implementation already. I can start doing that when I come back from this conference that I'm going to next week.

lch14forever commented 6 years ago

Hi Chenghao,

Please let me know your thoughts on this: https://github.com/lch14forever/microbiomeViz/issues/8#issuecomment-416089273

Regards, Chenhao.