joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Calculate and plot 10 most abundant taxa? #1575

Open LaFra7 opened 2 years ago

LaFra7 commented 2 years ago

Hi all,

I have a phyloseq object and I would like to know and then plot the 10 most abundant taxa at family level for each species.

To give you a better idea of my data: I have the relative abundance of each OTU found in my samples, and I have more than 500 samples divided among three mammal species. So I did:

dr_fam <- tax_glom(dr, taxrank = "Rank1", NArm = FALSE)

Now I would like to have a bar plot (maybe using the plot_bar function?) that shows the 10 most abundant taxa for each of my mammal species (not per sample). Is it possible?

I found various issues here, but none of them seems to work for me.

Thank you,

LaFra7 commented 2 years ago

I think I found the code to do that:

ps0 <- names(sort(taxa_sums(dr_fam), TRUE)[1:10]) # get most abundant ones
ps1 <- prune_taxa(ps0, dr_fam)
ps2 <- transform_sample_counts(ps1, function(x) x / sum(x))
ps3 <- merge_samples(ps2, "Species")
ps4 <- transform_sample_counts(ps3, function(x) x / sum(x))
plot_bar(ps4, fill="Rank1")

Is that correct?

And how could I get the exact percentage of each family?

Thank you,

abossers-uu commented 2 years ago

A correct approach that should work (pseudocode):

1. Remove zero taxa (subset_taxa).
2. Make relative (transform_sample_counts).
3. tax_glom at taxrank.
4. Order the OTU table high to low (over columns if taxa are columns) and get the first 10 colnames.
5. Subset taxa using these colnames.
6. Plot...
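The steps above can be sketched in R roughly as follows. This is only an illustration under stated assumptions, not the commenter's own code: it uses the GlobalPatterns example data with "SampleType" standing in for the mammal species column and "Family" standing in for "Rank1", it removes zero-sum taxa with prune_taxa rather than subset_taxa, it keeps the default NArm = TRUE so families without a name are dropped, and it summarises each group as the mean of the per-sample relative abundances (one reasonable choice among several).

```r
library(phyloseq)
library(ggplot2)

data(GlobalPatterns)
ps <- GlobalPatterns  # assumption: replace with your own phyloseq object

# 1. Remove taxa that have no counts in any sample
ps <- prune_taxa(taxa_sums(ps) > 0, ps)

# 2. Convert counts to relative abundance within each sample
ps_rel <- transform_sample_counts(ps, function(x) x / sum(x))

# 3. Agglomerate at the rank of interest (default NArm = TRUE drops NA families)
ps_fam <- tax_glom(ps_rel, taxrank = "Family")

# 4. Rank families by summed relative abundance and keep the 10 largest
top10 <- names(sort(taxa_sums(ps_fam), decreasing = TRUE))[1:10]

# 5. Subset to those families
ps_top <- prune_taxa(top10, ps_fam)

# 6. Long-format table, then mean relative abundance per group and family
df <- psmelt(ps_top)
pct <- aggregate(Abundance ~ SampleType + Family, data = df, FUN = mean)
pct$Percent <- 100 * pct$Abundance

# One stacked bar per group; bars stay below 100% because only 10 families are kept
p <- ggplot(pct, aes(x = SampleType, y = Percent, fill = Family)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(p)
```

The pct table holds the percentage of each family in each group, expressed relative to the whole community rather than to the top 10 only, because the relative abundances are computed before the subsetting step.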

gmteunisse commented 1 year ago

I think fantaxtic does what you are looking for; try running the example below. If it is what you need, replace GlobalPatterns with your phyloseq object, and change the grouping factor to the appropriate column for mammal species in your data.

devtools::install_github("gmteunisse/fantaxtic")
require("fantaxtic")
require("phyloseq")
data(GlobalPatterns)
top <- top_taxa(GlobalPatterns,
                tax_level = "Family",
                n_taxa = 10,
                grouping = "SampleType")
plot_nested_bar(top$ps_obj, top_level = "Phylum", nested_level = "Family")

slambrechts commented 6 months ago

@gmteunisse I tried your fantaxtic code, but I get:

Error in dimnames(x) <- dn : 
  length of 'dimnames' [1] not equal to array extent

Any idea what might be the cause?

gmteunisse commented 6 months ago

I can try to figure it out. Can you post a reproducible example of your error in an issue on the fantaxtic GitHub page?

