david-barnett / microViz

R package for microbiome data visualization and statistics. Uses phyloseq, vegan and the tidyverse. Docker image available.
https://david-barnett.github.io/microViz/
GNU General Public License v3.0

ps_filter by group? #170

Open antoine4ucsd opened 2 weeks ago

antoine4ucsd commented 2 weeks ago

Hello, and first of all, thank you for this impressive suite of tools! I am trying to clean up my gut microbiome dataset, which includes multiple subjects and multiple tissues.

For filtering, I am considering two approaches, and I am not sure which one is optimal.

For (1):

library(microViz)

# Prevalence threshold: taxon must be detected in at least 5% of samples
prevalence_threshold <- 0.05
# Detection threshold: reads needed for a taxon to count as present in a sample
detection_threshold <- 10
# Total abundance threshold: at least 100 reads summed across all samples
total_abundance_threshold <- 100
# Sample abundance threshold: at least 10 reads in at least one sample
sample_abundance_threshold <- 10

data |>
  tax_filter(
    min_prevalence = prevalence_threshold,
    prev_detection_threshold = detection_threshold,
    min_total_abundance = total_abundance_threshold,
    min_sample_abundance = sample_abundance_threshold
  )

Is there a way to do this for (2), for example with some kind of group-by?

Would you have any advice on the best approach, or on optimal filtering criteria? I assume (2) will likely be affected by the prevalence_threshold ...

Thank you.
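
To make the "group by" idea concrete, here is a minimal base R sketch of what per-group prevalence filtering could look like. It assumes that (2) means applying the prevalence and detection thresholds within each group (for example each tissue) and keeping any taxon that passes in at least one group. The helper name `taxa_passing_in_any_group`, the toy count matrix, and the `tissue` labels are all hypothetical and not part of microViz.

```r
# Hypothetical helper (not a microViz function): keep taxa that pass a
# prevalence filter within at least one group of samples.
# counts: taxa-by-samples count matrix; group: one label per sample (column).
taxa_passing_in_any_group <- function(counts, group,
                                      min_prevalence = 0.05,
                                      detection = 10) {
  stopifnot(ncol(counts) == length(group))
  keep <- rep(FALSE, nrow(counts))
  for (g in unique(group)) {
    sub <- counts[, group == g, drop = FALSE]
    # Proportion of this group's samples where the taxon has >= `detection` reads
    prevalence <- rowMeans(sub >= detection)
    keep <- keep | (prevalence >= min_prevalence)
  }
  rownames(counts)[keep]
}

# Toy data (made up): 3 taxa, 4 samples from two tissues
counts <- matrix(
  c(50, 40,  0,  0,   # taxonA: abundant in gut only
     0,  0,  0,  0,   # taxonB: never detected
     2,  3, 30,  1),  # taxonC: detected in one skin sample only
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxonA", "taxonB", "taxonC"), paste0("s", 1:4))
)
tissue <- c("gut", "gut", "skin", "skin")

kept <- taxa_passing_in_any_group(counts, tissue,
                                  min_prevalence = 0.5, detection = 10)
print(kept)
```

In this toy example taxonC is detected in only 1 of 4 samples overall, so a pooled 50% prevalence filter would drop it, while the per-group version keeps it because it reaches 50% prevalence within the skin samples. With a phyloseq object, the returned names could presumably be passed to `phyloseq::prune_taxa()`, and each group's subset could be built with `ps_filter()`, but that wiring is an assumption and is not shown here.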