gmteunisse / fantaxtic

Fantaxtic - Nested Bar Plots for Phyloseq Data

Extract data that went into the bar plot #6

Closed pauGuas closed 2 years ago

pauGuas commented 2 years ago

Hi, I need a table showing the relative abundance of each taxon in each sample. Is there a way to extract that data from the fantaxtic bar plot?

I also want to double check.... "Other" just means taxa that weren't in the top most abundant, right? And not that it's just a taxon that isn't identified?

Thank you!

gmteunisse commented 2 years ago

Hi pauGuas,

Thanks for using fantaxtic.

The get_top_taxa function outputs a phyloseq object with that information if relative = T. This does have the non-top taxa collapsed into "Other". To access this table, run the following:

# Get the top 10 taxa and collapse all others into "Other". Make abundances relative.
ps_tmp <- get_top_taxa(physeq_obj = physeq_obj, n = 10, relative = TRUE,
                       discard_other = FALSE, other_label = "Other")
rel_abun <- phyloseq::otu_table(ps_tmp)

# The taxon names are stored in the tax table
tax_names <- tax_table(ps_tmp)

# Bind the two together
final_table <- cbind(tax_names, rel_abun)

If you don't want to collapse the taxa into "Other" and want information for all taxa instead, you'll need to calculate relative abundances yourself. Use the following code in base R:

#Get the tax and OTU tables
otu_tbl <- as.data.frame(phyloseq::otu_table(physeq_obj))

#Check the orientation of the otu_tbl and change if required
if (!taxa_are_rows(phyloseq::otu_table(physeq_obj))){
  otu_tbl <- as.data.frame(t(otu_tbl))
}

#Transform absolute taxon counts to relative values
otu_tbl <- apply(otu_tbl, 2, function(x){
      x / sum (x)
})

# The taxon names are stored in the tax table
tax_names <- tax_table(physeq_obj)

# Bind the two together
final_table <- cbind(tax_names, rel_abun)

You are correct that "Other" just means all the taxa that weren't in the top most abundant.

Hope this helps you.

pauGuas commented 2 years ago

@gmteunisse thank you so much for your answer! I am running into an error "Error in cbind2(argl[[i]], r) : number of rows of matrices must match (see arg 2)"

gmteunisse commented 2 years ago

No worries @pauGuas. It's an error in the cbind command, which is telling you that the abundance table and the tax table have a different number of rows. Check whether the row counts are what you expect (nrow(rel_abun); nrow(otu_tbl); nrow(tax_names)).

Which of the two solutions are you using? I see that I made an error in the second code block, the last line should be final_table <- cbind(tax_names, otu_tbl).

pauGuas commented 2 years ago

@gmteunisse I am getting 11 for tax_names and 17 for rel_abun (I am using the top code). I have 17 samples. Not sure why it says 11 for tax_names since I am looking for only the top 10.

gmteunisse commented 2 years ago

Thanks pauGuas. It's 11 because that is the top 10 + "Other". It sounds like the rel_abun table has samples as rows and needs to be transposed. Try final_table <- cbind(tax_names, t(rel_abun)). Let me know if that works!
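
The mismatch and the transpose fix can be reproduced with plain matrices. This is a minimal sketch on made-up toy data (11 taxa, 17 samples) standing in for the real phyloseq tables; the names and values are assumptions for illustration only:

```r
# Toy stand-ins (hypothetical data): 11 taxa (top 10 + "Other") and 17 samples
tax_names <- matrix(paste0("taxon_", 1:11), nrow = 11,
                    dimnames = list(paste0("otu", 1:11), "Genus"))

# Samples in rows, taxa in columns: the orientation that triggers the cbind error
rel_abun <- matrix(runif(17 * 11), nrow = 17,
                   dimnames = list(paste0("sample", 1:17), paste0("otu", 1:11)))

nrow(tax_names)  # 11
nrow(rel_abun)   # 17

# cbind() needs equal row counts, so transpose the abundance matrix first
final_table <- cbind(tax_names, t(rel_abun))
dim(final_table)
```

After transposing, both tables have one row per taxon, so the result has 11 rows and one column per rank plus one per sample.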

pauGuas commented 2 years ago

@gmteunisse success!! thank you so much

gmteunisse commented 2 years ago

Glad it worked! Good luck with your research.

pauGuas commented 2 years ago

Sorry, one more question. In the table, there is a point where the proportion suddenly becomes whole numbers. See screenshot:

image

gmteunisse commented 2 years ago

Did you cbind multiple times by any chance? I think the table should only be 7 taxonomic ranks + 17 samples = 24 columns, which means that from column X on, samples may be repeated.
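
One way to check for this is to compare the column count with the expected total and look for repeated column names. A small sketch on a hypothetical toy table (7 rank columns, 3 samples, with the samples deliberately bound twice):

```r
# Hypothetical toy data: 7 taxonomic ranks and 3 samples
ranks <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")
tax_names <- matrix("x", nrow = 4, ncol = 7, dimnames = list(NULL, ranks))
rel_abun <- matrix(0.25, nrow = 4, ncol = 3,
                   dimnames = list(NULL, c("s1", "s2", "s3")))

final_table <- cbind(tax_names, rel_abun)
final_table <- cbind(final_table, rel_abun)  # the accidental second cbind

# Expected width is ranks + samples; anything larger suggests repeated columns
expected_cols <- ncol(tax_names) + ncol(rel_abun)
ncol(final_table) == expected_cols

# Names of the columns that appear more than once
dup_cols <- colnames(final_table)[duplicated(colnames(final_table))]
dup_cols
```

If dup_cols is empty and the column count matches, repeated binding is not the cause.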



pauGuas commented 2 years ago

I tried again from the start and still the same issue.


gmteunisse commented 2 years ago

That's no good. I had another look at the output from get_top_taxa, and it indeed does not return proportions - it only selects by relative abundance, but returns counts. My bad!

The code chunk below should give you the correct table, or at least, it does for me with the GlobalPatterns data. Swap in your own data and it should be good.

# phyloseq provides GlobalPatterns, tax_table and taxa_are_rows
library(phyloseq)
library(fantaxtic)

# Load test data
data(GlobalPatterns)

# Get the top taxa and extract the otu_tbl
ps_tmp <- get_top_taxa(physeq_obj = GlobalPatterns, n = 10, relative = TRUE,
                       discard_other = FALSE, other_label = "Other")
otu_tbl <- phyloseq::otu_table(ps_tmp)

# Check the orientation of the otu_tbl and transpose if required
if (!phyloseq::taxa_are_rows(otu_tbl)){
  otu_tbl <- as.data.frame(t(otu_tbl))
}

# Calculate relative abundances
rel_abun <- apply(otu_tbl, 2, function(x){
  x / sum (x)
})

# Sanity check: all columns sum to 1?
apply(rel_abun, 2, sum)

# Extract taxon names
tax_names <- tax_table(ps_tmp)

# Combine
final_table <- cbind(tax_names, rel_abun)
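
For reference, the normalisation step on its own can be tried on a small count matrix in base R. This is only a sketch with made-up numbers; it uses prop.table with margin = 2, which divides each column by its total in the same way as the apply() call above:

```r
# Toy count matrix (hypothetical data): taxa in rows, samples in columns
counts <- matrix(c(10, 30, 60,
                    5,  5, 10),
                 nrow = 3,
                 dimnames = list(c("taxonA", "taxonB", "Other"), c("s1", "s2")))

# Divide every column by its column total
rel_abun <- prop.table(counts, margin = 2)

# Sanity check: each sample should now sum to 1
colSums(rel_abun)
```

If the table still shows whole numbers after this step, the counts rather than rel_abun were probably bound to the tax table.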