vmikk / metagMisc

Miscellaneous functions for metagenomic analysis.
MIT License

Error in validObject(.Object) : invalid class “otu_table” object: OTU abundance data must have non-zero dimensions. #23

Closed jvoneggers closed 1 year ago

jvoneggers commented 1 year ago

Hi Vladimir,

I am trying to use your two functions to extract shared or non-shared OTUs, but I am getting an error saying that the OTU abundance data must have non-zero dimensions.

To make my phyloseq object, I did the following:

taxa_tab <- tax_table(tax_tab)
samp_dat <- sample_data(metadata)
otu_tab_norm <- otu_table(otu_tab_10knorm, taxa_are_rows = TRUE)
ps <- phyloseq(otu_tab_norm, samp_dat, taxa_tab)

Then I ran your shared otu function and received the error:

shared_esvs <- phyloseq_extract_shared_otus(ps)

"Error in validObject(.Object) : invalid class “otu_table” object: OTU abundance data must have non-zero dimensions."

I also double-checked that the OTU table contains no NAs and that it has reads:

 table(is.na(otu_table(ps)))

   FALSE 
43955446

sum(colSums(otu_table(ps)))
[1] 4916230

I have used this phyloseq object for my other analyses with no problem, so I don't think the error is caused by data manipulation, as it was in issue #19. Any thoughts would be helpful, thank you!

Jordan

vmikk commented 1 year ago

Hello Jordan! Thank you for reporting this issue!

Could you please check how many shared OTUs are in your data? You may run the following command:

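# Extract the OTU table (taxa as rows); it should match `otu_tab_10knorm` from the source data
otu_tab <- as.data.frame(otu_table(ps))
# Count OTUs that are present (abundance > 0) in every sample
sum(rowSums(otu_tab > 0) == ncol(otu_tab))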

metagMisc has not been updated for a while, and I think these functions do not handle the case when there are no OTUs shared across all samples.

If the result of the previous command is greater than zero, could you please send me a minimal subset of your data (without metadata), so I can diagnose and fix the problem?
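For illustration, the same check can be tried on a small toy table. This is a minimal sketch in base R; the matrix `toy_otus` and its values are hypothetical and only show what the command counts, namely OTUs with non-zero abundance in every sample.

```r
# Toy OTU table (hypothetical data): rows are OTUs, columns are samples
toy_otus <- matrix(
  c(5, 0, 2,
    1, 3, 4,
    0, 0, 7,
    2, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:3))
)

# Number of samples in which each OTU is present (abundance > 0)
n_present <- rowSums(toy_otus > 0)

# An OTU is shared across all samples if it is present in every column
shared_all <- n_present == ncol(toy_otus)

n_shared     <- sum(shared_all)                 # how many OTUs are shared
shared_names <- rownames(toy_otus)[shared_all]  # which OTUs are shared
```

If `n_shared` is zero, subsetting the table to the shared OTUs leaves a table with zero rows, which is consistent with the "non-zero dimensions" error reported above.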

With kind regards, Vladimir

jvoneggers commented 1 year ago

Hi Vladimir,

You were right: I ran the code you provided, and there are no ESVs that are shared across all samples. I must have misunderstood the purpose of the function, my apologies! I am looking for a function that calculates the number of shared ESVs for all pairwise comparisons between samples (resulting in a distance matrix). Thank you for your quick response and time!

Best, Jordan

vmikk commented 1 year ago

function that calculates the number of shared ESVs between all pairwise comparisons (resulting in a distance matrix)

I think that it is doable. Currently, the phyloseq_extract_shared_otus function has a samp_names argument, so it is possible to provide the names of just two samples and get the number of ESVs shared by that pair. This way, you may loop through all pairwise combinations of samples. However, this may not be very efficient if the number of samples is large.

But first, I will need to fix the function to return zero instead of an error. I'll do it tomorrow, as it is a bit late now.
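The pairwise loop described above can be sketched in base R on a plain presence/absence matrix. This is only an illustration of the idea: the matrix `pa` and its values are hypothetical, and the sketch does not call phyloseq_extract_shared_otus, so it says nothing about that function's return value.

```r
# Hypothetical presence/absence data: rows are ESVs, columns are samples
pa <- matrix(
  c(1, 0, 1,
    1, 1, 1,
    0, 0, 1,
    1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ESV", 1:4), c("A", "B", "C"))
) > 0

samples <- colnames(pa)

# Square sample-by-sample matrix to hold the number of shared ESVs
shared <- matrix(0L, nrow = length(samples), ncol = length(samples),
                 dimnames = list(samples, samples))

# Loop over all unordered pairs of samples
pairs <- combn(samples, 2)
for (k in seq_len(ncol(pairs))) {
  i <- pairs[1, k]
  j <- pairs[2, k]
  n <- sum(pa[, i] & pa[, j])  # ESVs present in both samples
  shared[i, j] <- n
  shared[j, i] <- n
}

# The diagonal holds the number of ESVs observed in each sample
diag(shared) <- colSums(pa)
```

The number of pairs grows quadratically with the number of samples, which is why an explicit loop like this becomes slow for large data sets.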

vmikk commented 1 year ago

Hello Jordan!

I've added a new function to the package for estimation of the number of shared or non-shared OTUs (for all pairwise comparisons between samples). To run it, use:

shared_esvs <- phyloseq_num_shared_otus(ps)

Please update the package to the latest version with:

remotes::install_github(repo = "vmikk/metagMisc")

If your data set is large, please install the Matrix package beforehand: it makes the estimation much faster and reduces the memory footprint. Alternatively, you may use a base R matrix, in which case you need to call phyloseq_num_shared_otus(ps, use_Matrix = FALSE).
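As an illustration of why a matrix-based approach can be fast, all pairwise counts can be obtained with a single cross-product of a presence/absence matrix. This is a sketch in base R on hypothetical data (`otu` and its values are made up), and it is not necessarily how phyloseq_num_shared_otus is implemented.

```r
# Hypothetical abundance table: rows are ESVs, columns are samples
otu <- matrix(
  c(10, 0, 3,
     4, 2, 8,
     0, 0, 5,
     7, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ESV", 1:4), c("A", "B", "C"))
)

# Convert abundances to numeric 0/1 presence/absence
pa <- (otu > 0) * 1

# t(pa) %*% pa: entry [i, j] counts ESVs present in both sample i and sample j
shared_mat <- crossprod(pa)

# The diagonal is the number of ESVs observed in each sample
richness <- diag(shared_mat)

# ESVs found in exactly one sample of each pair: n_i + n_j - 2 * shared
nonshared_mat <- outer(richness, richness, "+") - 2 * shared_mat
```

With many samples and a table that is mostly zeros, the same cross-product on a sparse matrix from the Matrix package avoids storing and multiplying the zero entries.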

HTH, Vladimir

jvoneggers commented 1 year ago

Hi Vladimir,

Thank you for adding this! It worked great and SO fast. I appreciate your time and help!

Jordan

vmikk commented 1 year ago

That's great! I'm glad that it works for you!

I will close this issue for now.