joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

tax_glom, invalid archetype provided #1704

Open Thyreus opened 1 year ago

Thyreus commented 1 year ago

I have the following phyloseq object:

    physeq3
    phyloseq-class experiment-level object
    otu_table()   OTU Table:         [ 323 taxa and 95 samples ]
    sample_data() Sample Data:       [ 95 samples by 4 sample variables ]
    tax_table()   Taxonomy Table:    [ 323 taxa by 7 taxonomic ranks ]

    tax_table(physeq3)
    Taxonomy Table:     [323 taxa by 7 taxonomic ranks]:
            Kingdom    Phylum       Class     Order        Family       Genus        Species
    Seq1016 "Animalia" "Arthropoda" "Insecta" "Coleoptera" "Histeridae" "Tomogenius" "Tomogenius incisus"
    Seq1071 "Animalia" "Arthropoda" "Insecta" "Coleoptera" "Histeridae" "Tomogenius" "Tomogenius incisus"
    Seq1171 "Animalia" "Arthropoda" "Insecta" "Coleoptera" "Histeridae" "Tomogenius" "Tomogenius incisus"

But when I try to use the function `tax_glom`, it returns this:

    glom <- tax_glom(physeq3, taxrank = 'Class', NArm = TRUE)
    Error in merge_taxa.indices.internal(x, eqtaxa, archetype) :
      invalid archetype provided.

I've read other issues such as #1598 and #1110, which attribute this error to NAs. However, I've used this function on other datasets that contain NAs, and it worked there. I've also tried it with the Kingdom rank, which has no NAs, and it still fails with the same error.

ycl6 commented 1 year ago

Hi @Thyreus

https://github.com/joey711/phyloseq/blob/c2605619682acb7167487f703d135d275fead748/R/merge-methods.R#L334-L337

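I suspect the NAs are in your count table (the OTU table) rather than in the taxonomy table. Try to identify where the NAs are and edit the count table so that they become 0. The example below reproduces the error by introducing NAs into the OTU table of an example dataset: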

    library(phyloseq)
    data(GlobalPatterns)

    ps = GlobalPatterns
    otu_table(ps)[1:10, 1:10] # Show first 10 rows, 10 columns
    #> OTU Table:          [10 taxa and 10 samples]
    #>                      taxa are rows
    #>        CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
    #> 549322   0   0   0       0       0       0       0       0       0       0
    #> 522457   0   0   0       0       0       0       0       0       0       0
    #> 951      0   0   0       0       0       0       1       0       0       0
    #> 244423   0   0   0       0       0       0       0       0       0       0
    #> 586076   0   0   0       0       0       0       0       0       0       0
    #> 246140   0   0   0       0       0       0       0       0       0       0
    #> 143239   7   1   0       0       0       0       3       0       0       0
    #> 244960   0   0   0       0       0       0       0       0       0       0
    #> 255340 153 194   0       0       0       0       0       0       0       0
    #> 144887   3   5   0       0       0       0       0       0       0       0

    # Introduce NAs to OTU table for illustration
    otu_table(ps)[, 3] = NA
    otu_table(ps)[1:10, 1:10]
    #> OTU Table:          [10 taxa and 10 samples]
    #>                      taxa are rows
    #>        CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
    #> 549322   0   0  NA       0       0       0       0       0       0       0
    #> 522457   0   0  NA       0       0       0       0       0       0       0
    #> 951      0   0  NA       0       0       0       1       0       0       0
    #> 244423   0   0  NA       0       0       0       0       0       0       0
    #> 586076   0   0  NA       0       0       0       0       0       0       0
    #> 246140   0   0  NA       0       0       0       0       0       0       0
    #> 143239   7   1  NA       0       0       0       3       0       0       0
    #> 244960   0   0  NA       0       0       0       0       0       0       0
    #> 255340 153 194  NA       0       0       0       0       0       0       0
    #> 144887   3   5  NA       0       0       0       0       0       0       0

    # Count number of NAs per column (sample)
    na = colSums(apply(otu_table(ps), FUN = is.na, 2))

    # Identify the sample with NA value
    sample_names(ps)[na > 0]
    #> [1] "SV1"

    # Try running tax_glom
    new = tax_glom(ps, "Phylum")
    #> Error in merge_taxa.indices.internal(x, eqtaxa, archetype): invalid archetype provided.

Created on 2023-09-29 with reprex v2.0.2
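
To follow up on the suggestion to set the NAs to 0, here is a minimal sketch of what that could look like on the same `GlobalPatterns` example. It is only an illustration, not a definitive fix: the variable names `counts` and `glom` are made up for this example, and whether replacing NAs with 0 is appropriate depends on why the counts are missing in your data.

```r
library(phyloseq)
data(GlobalPatterns)

ps <- GlobalPatterns
otu_table(ps)[, 3] <- NA  # reproduce the problem from the example above

# Extract the counts as a plain matrix and replace every NA with 0
# (assumption: a missing count can be treated as "not observed")
counts <- as(otu_table(ps), "matrix")
counts[is.na(counts)] <- 0

# Put the cleaned counts back, keeping the original orientation
otu_table(ps) <- otu_table(counts, taxa_are_rows = taxa_are_rows(ps))

# Confirm that no NAs remain before agglomerating
stopifnot(!anyNA(as(otu_table(ps), "matrix")))

glom <- tax_glom(ps, taxrank = "Phylum")
glom
```

For your own object, the same steps would apply with `physeq3` in place of `ps` and the rank you are interested in (for example `taxrank = "Class"`).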