joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Rename ASVs in otu_table as sub-types #1725

Open AngelicaMiraples opened 4 months ago

AngelicaMiraples commented 4 months ago

Hello!

I am working in RStudio and have found that a small number of my ASVs are likely subtypes: they have the same taxonomic assignment and are 100% exact matches when BLASTed, differing only in length by one or two nucleotides.

I would like to rename such ASVs with a suffix such as "-1", "-2", etc. or "a", "b", etc. to distinguish them, but I'm having a bit of trouble.

ps
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 1702 taxa and 172 samples ]
sample_data() Sample Data:       [ 172 samples by 7 sample variables ]
tax_table()   Taxonomy Table:    [ 1702 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 1702 tips and 1701 internal nodes ]
refseq()      DNAStringSet:      [ 1702 reference sequences ]

I first extract the OTU table from the ps object:
otu <- otu_table(ps) # taxa are rows = TRUE

Then I check to make sure the ASV is present as a row name:
otu["ASV1"]

Here is an example of what is returned for samples 10-20 out of 172:

otu["ASV1", 10:20]
OTU Table: [1 taxa and 11 samples]
taxa are rows
     ER5 ER6 ER7 ER8 ER9 LF1 LF10 LF2 LF4 LF5 LF6
ASV1   0   0   0   0   0  15    9   5   9 131  16

To rename ASV1 to ASV1-1 and check:
rownames(otu)[rownames(otu) == "ASV1"] <- "ASV1-1"
otu["ASV1-1"]

otu["ASV1-1", 10:20]
OTU Table: [1 taxa and 11 samples]
taxa are rows
       ER5 ER6 ER7 ER8 ER9 LF1 LF10 LF2 LF4 LF5 LF6
ASV1-1   0   0   0   0   0  15    9   5   9 131  16

I then replace the existing OTU table with the new table containing ASV1-1:
otu_table(ps) <- otu

When I check whether I can index ASV1-1, I get an error:

otu_table(ps)["ASV1-1"]
Error in as(x, "matrix")[i, j, drop = FALSE] : subscript out of bounds

And I am left with one taxon fewer than when I started:

ps
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 1701 taxa and 172 samples ]
sample_data() Sample Data:       [ 172 samples by 7 sample variables ]
tax_table()   Taxonomy Table:    [ 1701 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 1701 tips and 1700 internal nodes ]
refseq()      DNAStringSet:      [ 1701 reference sequences ]

Any idea why this ASV is being removed after making this modification? Any suggestions on how to fix this?

Best, Angelica

AngelicaMiraples commented 4 months ago

Just realized, I would probably need to change all the other components of the ps object in addition to the otu_table, such as the tax_table, phy_tree, and refseq. Is there perhaps an easier way to do this?

benjjneb commented 4 months ago

@AngelicaMiraples I think you want to use the taxa_names accessor to get/set taxa names robustly, across all the relevant components of the phyloseq object. When you modify just the otu_table, it doesn't affect the other components, so either the phyloseq object breaks or the taxa whose names no longer match are removed. This seems to work for me:

taxa_names(ps)[[X]] <- "foo"
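
For anyone landing here later, a minimal sketch of both the pitfall and the fix on a toy object may help. Everything below other than the phyloseq functions themselves is made up for illustration: the counts, the sample names S1/S2, the genus labels, and the assumption that ASV2 is the second subtype of ASV1. The toy object has only an OTU table and a taxonomy table, but the same taxa_names assignment should carry over to objects that also hold a tree and reference sequences.

```r
library(phyloseq)

# Toy data (hypothetical): 3 ASVs x 2 samples
counts <- matrix(
  c(15, 9, 0, 4, 7, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2"))
)
taxa <- matrix(
  c("Bacteria", "Bacteria", "Bacteria", "GenusA", "GenusA", "GenusB"),
  nrow = 3,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("Kingdom", "Genus"))
)
ps_toy <- phyloseq(otu_table(counts, taxa_are_rows = TRUE), tax_table(taxa))

# Pitfall: rename the ASV in the OTU table only, then assign the table back.
# The taxonomy table still says "ASV1", so the names no longer agree and the
# mismatched taxon is expected to be dropped, as in the report above.
otu <- otu_table(ps_toy)
rownames(otu)[rownames(otu) == "ASV1"] <- "ASV1-1"
ps_broken <- ps_toy
otu_table(ps_broken) <- otu
ntaxa(ps_broken)

# Fix: rename through taxa_names(), which updates every component together.
ps_fixed <- ps_toy
taxa_names(ps_fixed)[taxa_names(ps_fixed) == "ASV1"] <- "ASV1-1"

# Several renames at once via a lookup vector (old name = new name).
new_names <- c(ASV2 = "ASV1-2")
idx <- match(names(new_names), taxa_names(ps_fixed))
taxa_names(ps_fixed)[idx] <- unname(new_names)

ntaxa(ps_fixed)
taxa_names(ps_fixed)
```

Matching on the old name, rather than on a numeric position, avoids renaming the wrong taxon if the order of the taxa changes between runs.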

AngelicaMiraples commented 4 months ago

Hi Benji, thanks for your reply! Will try this out :)

Best, Angelica