joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Normalization for alpha diversity analyses #1687

Open otaviolovison opened 1 year ago

otaviolovison commented 1 year ago

Hello!

While reading several papers about microbiome analysis, I keep finding the recommendation to normalize the data before analysis. This seems strange to me for alpha diversity, because estimate_richness accepts only integer counts, not decimal values, and every normalization I tried produced decimals (as expected). My data have a substantial batch effect (I applied a batch-effect correction, which also requires prior normalization) and large differences in library size. Which normalization strategy should I use for alpha diversity, given that the only normalization that does not produce decimals is rarefy_even_depth, which I understand is no longer recommended?

Thanks in advance for your attention.

ecastron commented 1 year ago

Hi Otavio,

I just came across your message. I guess you have already solved this, but you can normalize with DESeq, for instance, and then simply round the normalized read counts to get integers. Rounding is a very minor change, so it won't affect the overall shape of the taxa distributions.
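To make the suggestion concrete, here is a minimal, package-free sketch in Python of the arithmetic involved: median-of-ratios size factors (the scaling idea behind DESeq-style normalization), division by those factors, and rounding back to integers. The function names (`size_factors`, `normalize_and_round`, `shannon`, `observed`) and the two toy samples are made up for illustration. This is not the phyloseq or DESeq2 API, and real data would go through those packages instead.

```python
import math
import statistics


def size_factors(samples):
    """Median-of-ratios size factors, one per sample.

    samples: list of samples, each a list of integer counts per taxon.
    Only taxa with a nonzero count in every sample define the reference.
    """
    n_taxa = len(samples[0])
    usable = [j for j in range(n_taxa) if all(s[j] > 0 for s in samples)]
    if not usable:
        raise ValueError("no taxon is present in every sample")
    # Log of the geometric mean of each usable taxon across samples.
    log_geo = {
        j: sum(math.log(s[j]) for s in samples) / len(samples) for j in usable
    }
    factors = []
    for s in samples:
        log_ratios = [math.log(s[j]) - log_geo[j] for j in usable]
        factors.append(math.exp(statistics.median(log_ratios)))
    return factors


def normalize_and_round(samples, factors):
    """Divide each sample by its size factor, then round to integers."""
    return [[round(c / f) for c in s] for s, f in zip(samples, factors)]


def shannon(counts):
    """Shannon index (natural log) from a vector of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def observed(counts):
    """Observed richness: number of taxa with a nonzero count."""
    return sum(1 for c in counts if c > 0)


# Toy data (hypothetical): sample_b is sequenced about 9x deeper than sample_a.
sample_a = [100, 50, 10, 2, 0]
sample_b = [900, 450, 90, 18, 1]

factors = size_factors([sample_a, sample_b])
rounded = normalize_and_round([sample_a, sample_b], factors)

for name, raw, norm in zip(("a", "b"), (sample_a, sample_b), rounded):
    print(name, "raw:", observed(raw), round(shannon(raw), 4),
          "rounded:", observed(norm), round(shannon(norm), 4))
```

On these toy numbers the abundant taxa are unaffected by rounding, but the taxon with a single read in the deeper sample is scaled below 0.5 and rounds to zero, so observed richness drops by one while the Shannon index barely moves. It may be worth running the same before-and-after comparison on your own table if the measures you care about are sensitive to rare taxa.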

Cheers,

Eduardo

otaviolovison commented 1 year ago

Thanks!
