joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

can I add a condition to merge_samples? #1690

Open lauraDRH opened 11 months ago

lauraDRH commented 11 months ago

I am studying the microbiome of an organism and I have sequenced 3 PCR replicates for each sample.

I want to merge my ESV reads, but only when they appear in two out of three replicates, and I have no idea how to do it. In a way, I would like to use the function merge_samples() to merge all the replicates, but with the added condition that an ESV is present in 2/3 replicates.

The df looks something like this:

 >df
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 28974 taxa and 112 samples ]
sample_data() Sample Data:       [ 112 samples by 2 sample variables ]
tax_table()   Taxonomy Table:    [ 28961 taxa by 10 taxonomic ranks ]

>sample_data(df)

Sample Data:        [112 samples by 2 sample variables]:                
            sample   replicate
s1_A        sample1       A
s1_B        sample1       B
s1_C        sample1       C
s2_A        sample2       A
s2_B        sample2       B
s2_C        sample2       C

Is there any way to do this? Thanks so much!

P.S. Thanks for the phyloseq package, it's great!

cresil commented 11 months ago

The following might do what you want. It prunes every taxon that is detected in no more than one replicate of each sample, so a taxon is kept if it appears in at least two replicates of any sample.

subsets <- @._data$sample), function(x) @._data$sample %in% x, df)) # split phyloseq
taxa <- unique(unlist(lapply(subsets, function(x) @.**@.!=0)>1))))) # check if taxa_are_rows==FALSE else rowSums
df.subset <- prune_taxa(taxa, merge_samples(x = df, group = "sample", fun = sum))
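
The snippet above seems to have lost text wherever an `@` appeared (the `@.` fragments look like email-address redaction), so it will not run as posted. Judging only from the surviving fragments and comments, the idea appears to be: split the object by `sample`, find taxa that are non-zero in more than one replicate, then prune the merged object to those taxa. Below is a minimal sketch of that idea, not the original code. The helper name `replicated_taxa` and its arguments `group_var` and `min_reps` are hypothetical; the phyloseq calls used (`get_variable`, `prune_samples`, `otu_table`, `taxa_are_rows`) are existing accessors.

```r
library(phyloseq)

# Hypothetical helper (not part of phyloseq): names of taxa detected in at
# least `min_reps` replicates of at least one group.
replicated_taxa <- function(ps, group_var = "sample", min_reps = 2) {
  groups <- as.character(get_variable(ps, group_var))
  # Split the object into one phyloseq object per group.
  subsets <- lapply(unique(groups), function(g) prune_samples(groups == g, ps))
  keep <- lapply(subsets, function(x) {
    otu <- as(otu_table(x), "matrix")
    if (!taxa_are_rows(x)) otu <- t(otu)  # force taxa in rows
    rownames(otu)[rowSums(otu != 0) >= min_reps]
  })
  unique(unlist(keep))
}
```

With the object from the question, the final step would then presumably be `prune_taxa(replicated_taxa(df), merge_samples(x = df, group = "sample", fun = sum))`, matching the last statement of the snippet above.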


lauraDRH commented 11 months ago

hi @cresil!

I think something happened to your code, but it looks like it is what I am looking for!
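
One caveat about the approach cresil describes: pruning taxa from the merged object keeps a taxon in every merged sample as soon as it passes the replicate check in any one sample. If the 2-of-3 rule should instead apply separately within each sample, the counts can be filtered per group before summing. The following is a rough base R sketch under that assumption; `merge_if_replicated`, `counts`, `groups` and `min_reps` are illustrative names, not phyloseq functions.

```r
# Merge replicate columns per group, counting a taxon in a group only if it is
# detected (count > 0) in at least `min_reps` of that group's replicates.
# `counts`: numeric matrix, taxa in rows, replicates in columns.
# `groups`: one label per column, e.g. the `sample` column of the sample data.
merge_if_replicated <- function(counts, groups, min_reps = 2) {
  stopifnot(is.matrix(counts), length(groups) == ncol(counts))
  groups <- as.character(groups)
  merged <- sapply(unique(groups), function(g) {
    block <- counts[, groups == g, drop = FALSE]
    detected <- rowSums(block > 0)  # number of replicates with the taxon
    ifelse(detected >= min_reps, rowSums(block), 0)
  })
  # sapply() returns a plain vector when there is a single taxon; restore shape.
  merged <- matrix(merged, nrow = nrow(counts),
                   dimnames = list(rownames(counts), unique(groups)))
  # Drop taxa left without reads in every merged sample.
  merged[rowSums(merged) > 0, , drop = FALSE]
}
```

To use this with a phyloseq object, the count matrix could be extracted with `as(otu_table(df), "matrix")` (transposed first if `taxa_are_rows(df)` is `FALSE`), and the result wrapped again with `otu_table(merged, taxa_are_rows = TRUE)`.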