joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/
586 stars 186 forks source link

Heatmap color fade #1756

Closed goodguynickpt closed 3 months ago

goodguynickpt commented 5 months ago

Hello, everyone!

I hope everyone's having a good day!

I am currently attempting to obtain a heatmap of my data. Even though I have (somewhat) suceeded, it is somewhat unpleasant to look at. I would like to have the colors fade from blue to red (depending on the abundance), but I have not been able to get it to work.

As of right now, I have these two hedious heatmaps: Heatmap_top30_families heatmap_top50_families

They are not easy to read with these many families.

I am currently using this code:

##################################

11. Heatmap

##################################

Family Only

ps_family <- tax_glom(ps, taxrank = "Family"); ps_family #swap family for whichever taxa is needed family_abundance <- taxa_sums(ps_family) # Calculate the total abundance of each family top_families <- names(sort(family_abundance, decreasing = TRUE))[1:50] # Select the top 100 families based on their abundance ps_top100 <- prune_taxa(top_families, ps_family) # Subset the phyloseq object to include only these top 100 families

Plotting using the phyloseq built-in plotting function

plot_heatmap(ps_top100, method = "NMDS", distance ="bray", sample.label = "Codes", # X axis taxa.label = "Family", # Y axis sample.order = (1:12)) # orders the samples

taxa.order = "Family"

         # )

I have also tried the following script to make it fade from color to color but to no avail (I get the exact same map):

##################################

11. Heatmap

##################################

Ensure correct packages are installed and loaded

if (!requireNamespace("BiocManager", quietly = TRUE)) { install.packages("BiocManager") } if (!requireNamespace("phyloseq", quietly = TRUE)) { BiocManager::install("phyloseq") } if (!requireNamespace("ggplot2", quietly = TRUE)) { install.packages("ggplot2") }

Load the necessary packages

library(phyloseq) library(ggplot2)

Family Only

ps_family <- tax_glom(ps, taxrank = "Family") # Aggregate at the Family level family_abundance <- taxa_sums(ps_family) # Calculate the total abundance of each family top_families <- names(sort(family_abundance, decreasing = TRUE))[1:50] # Select the top 50 families based on their abundance ps_top100 <- prune_taxa(top_families, ps_family) # Subset the phyloseq object to include only these top 50 families

Correct labels

correct_labels <- paste0(metadata$Sample, "-", metadata$Tissue, "-", metadata$Infection)

Ensure that the correct labels are assigned to the sample data

sample_data(ps_top100)$correct_labels <- correct_labels

Plotting using the phyloseq built-in plotting function and modify the ggplot object

p <- plot_heatmap(ps_top100, method = "NMDS", distance = "bray", sample.label = "correct_labels", # Use the custom labels column taxa.label = "Family", # Y axis sample.order = sample_data(ps_top100)$correct_labels) # Use custom labels for ordering

Modify the color gradient in the ggplot object

p <- p + scale_fill_gradient(low = "blue", high = "red", na.value = "white") # Set NA values to white if needed

Display the plot

print(p)

Does anyone know how to fix my code in order to achieve this? Thank you so much!

Cheers, João Lucas