Closed SilasK closed 4 years ago
Fixed - thanks! A work-around can use color.fcn
, but I'll push the revised version shortly.
Pushed - you can now input your own colors (will throw a right now probably ambiguous error if input colors
aren't of length equal to pf$nfactors
Thank you very much.
My goal is to emulate the output of Lefse. I think the output of phylofactor are more robust than the LDA score.
For this I would like to summarize the effect of each factor to one value, e.g. the difference between the group mean.
I managed now to plot the factors in the color of the coefficient of the pf.model Phylofactor.pdf
However, I'm not sure what I should do if factors are overlapping. The changes are additive, right?
Should I rather take the output of predict ?
My code so far:
coefficients = vector(length = 10)
for (fofi in 1:length(coefficients))
{
coefficients[fofi] = summary( pf$models[[fofi]])$coefficients['SourceFeces',"Estimate"] #"Pr(>|t|)"
}
map2colors = function(values, endpoints=c("darkblue","white", "darkred"))
{
limit_value= max(abs(values))
## Use n equally spaced breaks to assign each value to n-1 equal sized bins
ii <- cut(values, breaks = seq(-limit_value, limit_value, len = 100),
include.lowest = TRUE)
## Use bin indices, ii, to select color from vector of n-1 equally spaced colors
colors <- colorRampPalette(endpoints)(99)[ii]
## This call then also produces the plot below
#image(seq_along(values), 1, as.matrix(seq_along(values)), col = colors,
# axes = F)
return(colors)
}
factor_colors= map2colors(coefficients)
pf.tree(pf,colors = factor_colors)
I don't fully understand the underlying data & questions, but I think we can make progress without that -
You have a lot of options, and the choice is yours depending on what you're hoping to illustrate. Plotting the raw data with factors highlighted could let the data visualize itself, with clades showing that phylofactor has partitioned groups appropriately. You can also use the pf.predict
output for a cleaner version, and even a side-by-side to see how phylofactor has classified vs. how the data look. However, pf.predict can't use phyca or method='max.var' objects - the predict function uses a glm format. One possibility for your own phyca.predict function is to just take the means within groups summarized (go through your pf.tree()$legend
, pick the factors-x-groups, and take means, geometric, arithmetic, mode or however else you'd like to summarize/aggregate).
Does this help?
I assume this is also a small internal error