joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

can I add a condition to merge_samples? #1690

Open lauraDRH opened 1 year ago

lauraDRH commented 1 year ago

I am studying the microbiome of an organism and I have sequenced 3 PCR replicates for each sample.

I want to merge my ESV reads, but only when they appear in two out of three replicates, and I have no idea how to do it. In a way, I would like to use the function merge_samples() to merge all the replicates, but with the added condition of 2/3 replicates.

The df looks something like this:

 >df
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 28974 taxa and 112 samples ]
sample_data() Sample Data:       [ 112 samples by 2 sample variables ]
tax_table()   Taxonomy Table:    [ 28961 taxa by 10 taxonomic ranks ]

>sample_data(df)

Sample Data:        [112 samples by 2 sample variables]:                
            sample   replicate
s1_A        sample1       A
s1_B        sample1       B
s1_C        sample1       C
s2_A        sample2       A
s2_B        sample2       B
s2_C        sample2       C

is there any way to do this? thanks so much!

p.s. thanks for the phyloseq package, it's great!

cresil commented 1 year ago

The following might do what you want. It prunes every taxon that does not occur in more than one replicate of at least one sample, and then merges the replicates of each sample.

subsets <- @._data$sample), function(x) @._data$sample %in% x, df)) # split phyloseq
taxa <- unique(unlist(lapply(subsets, function(x) @.**@.!=0)>1))))) # check if taxa_are_rows==FALSE else rowSums
df.subset <- prune_taxa(taxa, merge_samples(x = df, group = "sample", fun = sum))
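
Parts of the snippet above did not survive posting, so what follows is not a reconstruction of it, only a minimal sketch of the approach it describes: keep the taxa that are present in at least two replicates of at least one sample, then merge. The toy object `df`, its counts, and the names `reps_per_sample` and `keep` are assumptions made for illustration; the sketch also assumes the grouping column is called `sample`, as in the question.

```r
library(phyloseq)

# Toy data (assumed, for illustration only): 3 ESVs, 2 samples x 3 replicates.
counts <- matrix(
  c(5, 3, 0,  0, 0, 2,   # ESV1: 2 replicates in sample1, 1 in sample2
    0, 0, 4,  1, 0, 0,   # ESV2: 1 replicate in each sample
    7, 0, 0,  6, 2, 9),  # ESV3: 1 replicate in sample1, 3 in sample2
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("ESV", 1:3),
                  c("s1_A", "s1_B", "s1_C", "s2_A", "s2_B", "s2_C"))
)
meta <- data.frame(
  sample    = rep(c("sample1", "sample2"), each = 3),
  replicate = rep(c("A", "B", "C"), times = 2),
  row.names = colnames(counts)
)
df <- phyloseq(otu_table(counts, taxa_are_rows = TRUE), sample_data(meta))

# Work on a taxa x replicates matrix, whatever the orientation of the object.
otu <- as(otu_table(df), "matrix")
if (!taxa_are_rows(df)) otu <- t(otu)
groups <- as.character(get_variable(df, "sample"))

# Number of replicates with non-zero reads, per taxon and per sample.
reps_per_sample <- t(rowsum(t(otu > 0) * 1L, group = groups))

# Taxa seen in at least 2 replicates of at least one sample.
keep <- rownames(otu)[rowSums(reps_per_sample >= 2) > 0]

# Merge replicates, then drop the taxa that never passed the threshold.
# merge_samples() may warn about coercing non-numeric sample variables.
df.subset <- prune_taxa(keep, merge_samples(df, group = "sample", fun = sum))
```

On the toy data this keeps ESV1 and ESV3 and drops ESV2. A taxon that passes the threshold in one sample keeps its reads in every sample, so ESV3 still contributes its single-replicate reads to sample1 after merging.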


lauraDRH commented 1 year ago

hi @cresil!

I think something happened with your code but it looks like it is what I am looking for!
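
If the 2-out-of-3 rule is meant to hold within each sample separately, one option is to apply it to the count matrix before summing, so that reads from a taxon seen in only one replicate of a sample are set to zero for that sample. The following is a minimal base R sketch under that assumption; `merge_with_min_reps` is a hypothetical helper, not a phyloseq function, and the toy counts are made up for illustration.

```r
# Hypothetical helper: sum replicate counts per sample, but only for taxa
# detected in at least `min_reps` replicates of that sample.
# `counts` is a taxa x replicates matrix; `groups` names the sample of each column.
merge_with_min_reps <- function(counts, groups, min_reps = 2) {
  stopifnot(ncol(counts) == length(groups))
  groups <- as.character(groups)
  # Replicates with non-zero reads, per taxon and sample (taxa x samples).
  n_present <- t(rowsum(t(counts > 0) * 1L, group = groups))
  # Summed reads, per taxon and sample (taxa x samples, same ordering).
  summed <- t(rowsum(t(counts), group = groups))
  # Zero out taxon/sample cells that fail the replicate threshold.
  summed[n_present < min_reps] <- 0
  summed
}

# Toy data (assumed): 3 ESVs, 2 samples x 3 replicates.
counts <- matrix(
  c(5, 3, 0,  0, 0, 2,
    0, 0, 4,  1, 0, 0,
    7, 0, 0,  6, 2, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("ESV", 1:3),
                  c("s1_A", "s1_B", "s1_C", "s2_A", "s2_B", "s2_C"))
)
groups <- rep(c("sample1", "sample2"), each = 3)

merged <- merge_with_min_reps(counts, groups, min_reps = 2)
```

With a phyloseq object, the input matrix could be taken from `as(otu_table(df), "matrix")` (transposed when `taxa_are_rows(df)` is FALSE) and the result wrapped again with `otu_table(merged, taxa_are_rows = TRUE)`; the sample data would then need one row per merged sample.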