Normalization of abundances

Hi! These files contain raw abundances, so they are not normalized. The co-assembly will have additional results with different normalization methods.

The best way to normalize depends on what you want to do. Several statistical packages (e.g. DESeq2 for differential abundance analysis) expect raw abundances as input, since they apply their own normalization. For ordinations I've recently been exploring gemelli (https://github.com/biocore/gemelli), which takes raw abundances, normalizes them with a robust CLR transform and then performs a PCA. For general plotting of abundances I would just use percentages. When working with functions I normally try to use copy numbers (those will not be available when analyzing individual reads). Z-scores could also be valid, particularly for visualization purposes, but it may be tricky to run statistics on them.

In general, data normalization in microbiome analysis is not a trivial topic.
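To make the simpler options concrete, here is a minimal NumPy sketch of percentages, a centered log-ratio (CLR) transform and z-scores. The count table is made up, the function names are hypothetical, and the CLR shown is the plain version with a pseudocount, not the robust CLR that gemelli implements, so treat it as an illustration rather than a replacement for those tools.

```python
import numpy as np

# Hypothetical raw count table: rows are features (e.g. taxa), columns are samples.
counts = np.array([
    [10.0, 200.0, 0.0],
    [30.0, 100.0, 50.0],
    [60.0, 700.0, 50.0],
])

def to_percentages(table):
    """Scale each sample (column) so that its abundances sum to 100."""
    return 100.0 * table / table.sum(axis=0, keepdims=True)

def clr(table, pseudocount=1.0):
    """Plain centered log-ratio per sample.

    The pseudocount is an assumption made here to avoid log(0);
    it is not how a robust CLR handles zeros.
    """
    logged = np.log(table + pseudocount)
    return logged - logged.mean(axis=0, keepdims=True)

def zscores(table):
    """Z-score each feature (row) across samples."""
    mean = table.mean(axis=1, keepdims=True)
    std = table.std(axis=1, keepdims=True)
    # Features with zero variance are left at 0 instead of dividing by zero.
    safe_std = np.where(std == 0, 1.0, std)
    return (table - mean) / safe_std

percentages = to_percentages(counts)
clr_values = clr(counts)
z_values = zscores(percentages)
```

Percentages and CLR are computed per sample, while the z-scores here are computed per feature across samples, which is the usual choice for heatmaps. For tools such as DESeq2 you would still pass the raw `counts`, not any of these transformed tables.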
