mothur / mothur.github.io

wiki for the mothur software package
https://mothur.github.io
Creative Commons Attribution 4.0 International

Question on: merge.groups #119

Open FrancoMagurno opened 10 months ago

FrancoMagurno commented 10 months ago

Hi there, I would like to compare treatments that have different numbers of samples, where each sample has a different number of sequences. First, I used normalize.shared to get a similar number of sequences per sample. Then I wanted to combine the samples of each treatment using the merge.groups command with method=median. Because the number of samples differs between treatments, it seems the only option is method=sum. That way, I would have to normalize the shared file again to get the same number of sequences in each treatment. Am I missing something? Wouldn't it be easier if the "median" option could be used regardless of the number of sequences per treatment?

Out of curiosity, I tested this as the shared file:

label  Group  numOtus  Otu01  Otu02  Otu03  Otu04  Otu05
0.02   A      5        1      2      3      4      5
0.02   B      5        5      4      3      2      1
0.02   C      3        2      4      2      3      2
0.02   D      3        2      1      1      2      2
0.02   E      4        2      1      3      1      2

and a design file that assigns A-B and C-D-E to two treatments. Each treatment has 30 sequences. Still, the merge command with the median option reports: [ERROR]: The median and average methods require you to have the same number of sequences in each treatment, quitting.
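
For reference, the totals in this example can be recomputed with a minimal Python sketch. This is not mothur code: the treatment names T1 and T2 and the helper functions are hypothetical, and the numOtus column is left out, so only the five OTU counts per sample are used.

```python
# OTU counts per sample (Otu01..Otu05), copied from the example shared file.
# The numOtus column is omitted here.
shared = {
    "A": [1, 2, 3, 4, 5],
    "B": [5, 4, 3, 2, 1],
    "C": [2, 4, 2, 3, 2],
    "D": [2, 1, 1, 2, 2],
    "E": [2, 1, 3, 1, 2],
}

# Hypothetical treatment names; the design file assigns A-B and C-D-E
# to two treatments.
design = {"A": "T1", "B": "T1", "C": "T2", "D": "T2", "E": "T2"}


def sample_totals(shared):
    """Total number of sequences in each sample."""
    return {sample: sum(counts) for sample, counts in shared.items()}


def treatment_totals(shared, design):
    """Total number of sequences in each treatment (sum over its samples)."""
    totals = {}
    for sample, counts in shared.items():
        treatment = design[sample]
        totals[treatment] = totals.get(treatment, 0) + sum(counts)
    return totals


if __name__ == "__main__":
    print(sample_totals(shared))
    print(treatment_totals(shared, design))
```

The treatment totals match (30 and 30), while the per-sample totals differ (15, 15, 13, 8 and 9). The sketch only restates the arithmetic of the example; it does not show how mothur performs its check.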