Hello,
First, thank you for this impressive suite of tools!
I am trying to clean up a gut microbiome dataset that includes multiple subjects and multiple tissues.
For filtering, I am considering two approaches and am not sure which one is better:
1. apply the filtering criteria to all samples together
2. split the data by tissue and then filter each subset
For (1):
library(microViz)

# Prevalence threshold: taxon must be detected in at least 5% of samples
prevalence_threshold <- 0.05
# Detection threshold for prevalence: reads needed to count as detected in a sample
detection_threshold <- 10
# Total abundance threshold: 100 total reads across all samples
total_abundance_threshold <- 100
# Sample abundance threshold: 10 reads in at least one sample
sample_abundance_threshold <- 10

# `data` is my phyloseq object
data |>
  tax_filter(
    min_prevalence = prevalence_threshold,
    prev_detection_threshold = detection_threshold,
    min_total_abundance = total_abundance_threshold,
    min_sample_abundance = sample_abundance_threshold
  )
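Is there a way to do this for (2), for example with some kind of group-by?
Would you have any advice on which approach is better, and on sensible filtering criteria? I assume (2) will likely be affected by the prevalence_threshold, since 5% of the samples within one tissue is fewer samples than 5% of the whole dataset.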
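For (2), this is roughly what I have in mind, as an untested sketch rather than a recommended method. It assumes the sample data has a column identifying the tissue (I call it "tissue" here, which is just a placeholder name), and `taxa_passing_by_group` is a hypothetical helper of my own, not a microViz function. It subsets the samples per tissue with phyloseq's `prune_samples()` and asks `tax_filter()` for the names of the passing taxa with `names_only = TRUE`.

```r
library(phyloseq)
library(microViz)

# Hypothetical helper (not part of microViz): apply tax_filter() separately
# within each level of a sample_data variable and return, per level,
# the names of the taxa that pass the thresholds.
taxa_passing_by_group <- function(ps, group_var, ...) {
  # Group label for every sample, in the same order as sample_names(ps)
  groups <- as.character(data.frame(sample_data(ps))[[group_var]])
  lapply(
    split(sample_names(ps), groups),
    function(ids) {
      # Keep only the samples belonging to this group
      ps_group <- prune_samples(ids, ps)
      # names_only = TRUE returns taxa names instead of a filtered phyloseq
      tax_filter(ps_group, ..., names_only = TRUE)
    }
  )
}

# Intended usage (assumes `data` and the thresholds defined for (1)):
# passing <- taxa_passing_by_group(
#   data, group_var = "tissue",
#   min_prevalence = prevalence_threshold,
#   prev_detection_threshold = detection_threshold,
#   min_total_abundance = total_abundance_threshold,
#   min_sample_abundance = sample_abundance_threshold
# )
# keep <- Reduce(union, passing)          # taxa passing in at least one tissue
# data_filtered <- prune_taxa(keep, data) # same taxa kept across all tissues
```

Taking the union keeps a taxon everywhere if it passes in any single tissue; using `intersect` instead would keep only taxa that pass in every tissue. I am not sure which of the two makes more sense here.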
Thank you!