ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

Inconsistent results from workflow in tutorial 4.2 trans_venn class #409

Closed (jarrodscott closed this issue 1 month ago)

jarrodscott commented 1 month ago

Hello,

I am following the tutorial in Chapter 4 (Composition-based class) using the mock dataset from the microeco package. Everything looks consistent in section 4.1, but once I get to section 4.2 (trans_venn class), the results look very different from the tutorial. I have tested both the stable version of microeco (1.9.1) and the dev version (1.9.2), and the outcome is the same.

For example, the first code block:

# merge samples as one community for each group
dataset1 <- dataset$merge_samples("Group")
# dataset1 is a new microtable object
# create trans_venn object
t1 <- trans_venn$new(dataset1, ratio = NULL)
t1$plot_venn()

[attached image: Venn diagram produced by the code above]

or this...

# use "Type" column in sample_table
dataset1 <- dataset$merge_samples("Type")
t1 <- trans_venn$new(dataset1)
t1$plot_venn(petal_plot = TRUE, petal_color = RColorBrewer::brewer.pal(8, "Dark2"))
t1$plot_venn(petal_plot = TRUE, petal_center_size = 50, petal_r = 1.5, petal_a = 3, petal_move_xy = 3.8, petal_color_center = "#BEBADA")

[attached image: petal plots produced by the code above]

And so on. All of the figures I get for this part of the tutorial look very different from the ones shown online, and the results do not make sense. I also tried this on my own data, but the results do not look correct either.

It seems to me that something is happening to the Type samples when merge_samples is run. When I run t1$data_summary, the individual samples (NE, NW, NC, YML, SC, QTP) all have zeros for count and abundance.
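To make the zeros concrete, here is a toy base-R sketch of how merging samples by group and then counting taxa per Venn region can leave a single-group region empty. It uses made-up counts and hypothetical names (`otu`, `group`, `membership`), and it assumes that a region's count is the number of taxa found only in that combination of groups. It is an illustration of the idea, not microeco's implementation.

```r
# Toy OTU table: rows are taxa, columns are samples (made-up counts)
otu <- matrix(
  c(5, 0, 3, 0,
    0, 2, 0, 4,
    1, 1, 1, 1,
    0, 0, 0, 7),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:4))
)

# Hypothetical grouping of the four samples into two groups
group <- c(S1 = "A", S2 = "A", S3 = "B", S4 = "B")

# Merge samples: sum the counts of all samples that belong to the same group
merged <- t(rowsum(t(otu), group = group[colnames(otu)]))

# Presence/absence of each taxon in each merged group
present <- merged > 0

# Label each taxon with the combination of groups in which it occurs
membership <- apply(present, 1, function(x) {
  paste(colnames(present)[x], collapse = "&")
})

# Number of taxa per Venn region; regions with no taxa do not appear
table(membership)
```

In this toy table every taxon present in group A also occurs in group B, so no taxon is unique to A, and the "A" region would be reported as zero even though nothing went wrong in the merge.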

I hope I am not missing something obvious :)

sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-apple-darwin20
Running under: macOS Ventura 13.6.9
microeco_1.9.1 & microeco_1.9.2

ChiLiubio commented 1 month ago

Hi, Jarrod. Glad to see you again! I guess the dataset in your attached example is the dataset inside the package. The dataset in the trans_venn part comes from the previous chapter, "microtable". They are different data, as the following example shows.

# the dataset inside the package
library(microeco)
data(dataset)
dataset1 <- dataset$merge_samples("Type")
t1 <- trans_venn$new(dataset1)
t1$plot_venn(petal_plot = TRUE, petal_color = RColorBrewer::brewer.pal(8, "Dark2"))

# the dataset from the microtable chapter
library(microeco)
# magrittr provides the %<>% operator used below
library(magrittr)
set.seed(123)
dataset <- microtable$new(sample_table = sample_info_16S, otu_table = otu_table_16S, tax_table = taxonomy_table_16S, phylo_tree = phylo_tree_16S)
dataset$tax_table %<>% base::subset(Kingdom == "k__Archaea" | Kingdom == "k__Bacteria")
dataset$filter_pollution(taxa = c("mitochondria", "chloroplast"))
dataset$tidy_dataset()
dataset$rarefy_samples(sample.size = 10000)
dataset1 <- dataset$merge_samples("Type")
t1 <- trans_venn$new(dataset1)
t1$plot_venn(petal_plot = TRUE, petal_color = RColorBrewer::brewer.pal(8, "Dark2"))

jarrodscott commented 1 month ago

Hello Chi Liu!

Great to hear from you and it is nice to be back. Thank you for your insight. After I wrote this post, I went through the dataset in the package and realized the results were consistent with that dataset. I will go back to the part of the tutorial you referenced and use that dataset instead. I will also close this issue.

It is really great to see this package grow and I am really loving all of the new tools! Great work and many thanks for this amazing toolset. Best, Jarrod

ChiLiubio commented 1 month ago

Thank you very much for your appreciation! I realized that the tutorial dataset should be renamed to avoid confusion with the dataset inside the package when running the code, and I will update the tutorial. Best, Chi
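The kind of name clash described here can be reproduced in base R with any packaged dataset. The sketch below uses `iris` from the `datasets` package purely as a stand-in for microeco's `dataset`. It is an analogy under that assumption, not microeco code, and the variable names (`n_custom`, `n_packaged`, `n_after_rm`) are hypothetical.

```r
# A user-created object that reuses the name of a packaged dataset
iris <- head(datasets::iris, 10)

# The bare name now refers to the user-created 10-row object
n_custom <- nrow(iris)

# The packaged copy is still reachable through its namespace
n_packaged <- nrow(datasets::iris)

# Removing the user-created object makes the bare name
# resolve to the packaged copy again
rm(iris)
n_after_rm <- nrow(iris)

c(custom = n_custom, packaged = n_packaged, after_rm = n_after_rm)
```

Giving the tutorial object its own name (for example, a hypothetical `dataset_16S`) would avoid this ambiguity, because the bare name `dataset` would then always mean the packaged object.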