ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

Errors in matching the OTU names in the OTU table and the tree #306

Closed haitilin closed 5 months ago

haitilin commented 5 months ago

Hi Chi,

I created a microtable object as shown below, but errors occurred when matching the ASV names in the otu_table, tax_table and the tree. I suspect the issue is due to the ASVs being in a different order in these tables, but I'm unsure how to resolve it.

rm(list = ls())
pacman::p_load(tidyverse, microeco, magrittr)

feature_table <- read.csv('/Users/haiti/Downloads/asv.csv', check.names = FALSE, row.names = 1)
sample_table <- read.csv('/Users/haiti/Downloads/metadata.csv', check.names = FALSE, row.names = 1)
tax_table <- read.csv('/Users/haiti/Downloads/tax_table.csv', check.names = FALSE, row.names = 1)
# read.tree() comes from the ape package, which p_load() above does not attach
tree <- ape::read.tree("/Users/haiti/Downloads/tree.nwk")

as.numeric(colnames(feature_table))
  [1] 36811 36841 36861 36871 36874 36921 36941 36951 36961 36981 36991 37011 37021 37031 37041 37051 37061 37071 37081
 [20] 37101 37121 37131 37141 37161 37171 37191 37201 37211 37221 37231 37261 37291 37301 37311 37331 37361 37371 37381
 [39] 37391 37411 37414 37421 37431 37451 37461 37481 37491 37511 37521 37531 37561 37601 37621 37641 37651 37671 37681
 [58] 37711 37731 37734 37751 37761 37771 37781 37791 37801 37811 37881 58211 58321 58371 58374 58391 58394 58461 58471
 [77] 58491 58521 58524 58531 58541 58544 58551 58554 58591 58594 58611 58614 58621 58634 58635 58636 58664 58731 58734
 [96] 58761 58824 58841 58844 58871 58874 58875 58891 58921 58924 58931 58951 58971 58991 59001 59004 59011 59021 59071
[115] 59111 59131 59134 59135 59136 59141 59151 59154 59161 59164 59531 59781 59791 59821

dataset <- microtable$new(sample_table = sample_table,
                          otu_table = feature_table,
                          tax_table = tax_table,
                          phylo_tree = tree)
dataset
microtable-class object:
sample_table have 128 rows and 4 columns
otu_table have 15753 rows and 128 columns
tax_table have 15753 rows and 7 columns
phylo_tree have 15753 tips

dataset$tidy_dataset()
Error in dataset$tidy_dataset() : No same feature name found among otu_table, tax_table and phylo_tree! Please check feature names in those objects!

dataset$cal_betadiv(unifrac = TRUE)
Error in GUniFrac::GUniFrac(eco_table, phylo_tree, alpha = c(0, 0.5, 1)) : The OTU table contains unknown OTUs! OTU names in the OTU table and the tree should match!

feature_table[1:5, 1:5]
      36811 36841 36861 36871 36874
ASV_1     0     0  1364   218   216
ASV_2     0     0     0     0     0
ASV_3    57   120   527    69   361
ASV_4  1245     0     0   315   860
ASV_5     0     0   287     0     0

tax_table[1:5, 1:3]
      Kingdom     Phylum             Class
ASV_1 k__Bacteria p__Actinobacteria  c__Actinomycetia
ASV_2 k__Bacteria p__Bacteroidetes   c__Bacteroidia
ASV_3 k__Bacteria p__Actinobacteria  c__Actinomycetia
ASV_4 k__Bacteria p__Bacteroidetes   c__Bacteroidia
ASV_5 k__Bacteria p__Verrucomicrobia c__Verrucomicrobiae

tree

Phylogenetic tree with 15753 tips and 15617 internal nodes.

Tip labels:
  'ASV_14978', 'ASV_15158', 'ASV_15110', 'ASV_14475', 'ASV_14360', 'ASV_15741', ...
Node labels:
  root, 0.110, 1.000, 0.994, 0.070, 1.000, ...

Rooted; includes branch lengths.

ChiLiubio commented 5 months ago

Hi. It looks like the tip labels of the phylogenetic tree contain quotation marks ('), so they differ from the feature names in the other tables. To test this, first create the microtable object again without the phylo tree. If that works, the problem is easy to solve by removing those characters from the tip labels:

tree$tip.label <- gsub("'", "", tree$tip.label, fixed = TRUE)
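To see why the quotation marks matter, here is a minimal sketch in base R that compares feature names before and after the gsub() call above. The vectors otu_names and tip_labels are hypothetical stand-ins for rownames(feature_table) and tree$tip.label, not the data from this issue.

```r
# Toy stand-ins for the real objects (hypothetical values, not the poster's data)
otu_names  <- c("ASV_1", "ASV_2", "ASV_3")        # like rownames(feature_table)
tip_labels <- c("'ASV_3'", "'ASV_1'", "'ASV_2'")  # quoted, and in a different order

# Before cleaning: the quoted labels share no names with the OTU table
shared_before <- intersect(otu_names, tip_labels)
length(shared_before)

# Remove the literal single quotes, as in the fix above
clean_labels <- gsub("'", "", tip_labels, fixed = TRUE)

# After cleaning: the set comparison succeeds regardless of order
shared_after <- intersect(otu_names, clean_labels)
missing_from_tree <- setdiff(otu_names, clean_labels)
length(missing_from_tree)
```

With real data, running the same intersect() and setdiff() checks on rownames(feature_table), rownames(tax_table) and tree$tip.label is a quick way to find which object carries the mismatched names. Because these are set comparisons, a different order of ASVs alone does not produce a mismatch.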
haitilin commented 5 months ago

Hi, it worked perfectly, thank you very much!