Closed hkaspersen closed 2 months ago
Hi Hakon,
thanks for reporting this, it is indeed a bug!
The min_sample_abundance argument is not treating the 0.001 as a proportion. ~I'll fix this in the next version, but~ for now, here is a workaround that hopefully meets your needs.
library(microViz)
#> microViz version 0.11.0 - Copyright (C) 2023 David Barnett
#> ! Website: https://david-barnett.github.io/microViz
#> ✔ Useful? For citation details, run: `citation("microViz")`
#> ✖ Silence? `suppressPackageStartupMessages(library(microViz))`
# built-in example phyloseq data
data("shao19")
# Bug: values between 0 and 1 are expected to be treated as a proportion of the counts in each sample,
# but that conversion is not done. Instead, all non-absent taxa pass the threshold.
shao19 %>% tax_filter(min_sample_abundance = 0.01, tax_level = "genus")
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 819 taxa and 1644 samples ]
#> sample_data() Sample Data: [ 1644 samples by 11 sample variables ]
#> tax_table() Taxonomy Table: [ 819 taxa by 6 taxonomic ranks ]
#> phy_tree() Phylogenetic Tree: [ 819 tips and 818 internal nodes ]
# Workaround: transform to compositional, then filter,
# then retrieve the stored counts (if you don't want to continue with proportions)
shao19 %>%
  tax_transform("compositional") %>%
  tax_filter(
    min_sample_abundance = 0.01, tax_level = "genus", use_counts = FALSE,
    prev_detection_threshold = 0 # default of 1 expects counts
  ) %>%
  ps_get(counts = TRUE)
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 651 taxa and 1644 samples ]
#> sample_data() Sample Data: [ 1644 samples by 11 sample variables ]
#> tax_table() Taxonomy Table: [ 651 taxa by 7 taxonomic ranks ]
#> phy_tree() Phylogenetic Tree: [ 651 tips and 650 internal nodes ]
Created on 2023-10-16 with reprex v2.0.2
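To make the difference between the two calls above concrete, here is a minimal base R sketch on a made-up count matrix. It is not how tax_filter is implemented internally; it only assumes the threshold means "the taxon reaches this abundance in at least one sample", and the matrix and the names `counts`, `props`, `keep_counts` and `keep_props` are hypothetical.

```r
# Toy count matrix (made-up values): taxa as rows, samples as columns.
counts <- matrix(
  c(990, 500,
      9, 499,
      1,   1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("genusA", "genusB", "genusC"), c("s1", "s2"))
)

# Comparing raw counts against 0.01: any taxon with at least one read passes.
keep_counts <- apply(counts, 1, max) >= 0.01

# Convert each sample (column) to proportions first, then apply the threshold.
props <- sweep(counts, 2, colSums(counts), "/")
keep_props <- apply(props, 1, max) >= 0.01

names(keep_counts)[keep_counts] # all three genera are kept
names(keep_props)[keep_props]   # genusC (0.1% in both samples) is dropped
```

This mirrors why the workaround transforms to compositional data before filtering: the threshold only behaves as a proportion once each sample sums to 1.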
I am changing the documentation in microViz version 0.12.0 to remove the (erroneous) suggestion that min_sample_abundance could handle proportions. For now, I don't have time to actually add this feature.
For now I'll close this.
I hope to rebuild a better tax_filter in future projects, but this is no longer a bug, as the current docs no longer erroneously indicate that min_sample_abundance can handle proportions.
Hello, and thank you for an excellent R package! I have a dataset and I want to prune taxa that have a relative abundance of less than 0.1% at the genus level (Rank6).
However, when I do this, the physeq object is exactly identical after filtering. What am I doing wrong here? The input data is untransformed counts.
physeq object:
microViz package version 0.10.10, R version 4.3.0