borenstein-lab / fishtaco

FishTaco (Functional Shifts Taxonomic Contributors) is a metagenomic computational framework that aims to identify the driver taxa of microbiome functional shifts

Data normalization #5

Open davidoctaviobotero opened 4 years ago

davidoctaviobotero commented 4 years ago

Hi,

Thank you for an excellent approach to functional metagenomics. I have a major concern with the normalization method based on relative abundance: it has been shown to be one of the worst approaches. Can you suggest other normalization methods compatible with MUSiCC and FishTaco? For differential abundance analysis I usually use the centered log-ratio transform (CoDa; Gloor, 2017) or the variance-stabilizing transformation (DESeq2), but neither of them gives values in the range 0 to 1.

Thank you,

engal commented 4 years ago

Hi,

Glad to hear you're interested in MUSiCC and FishTaco! I'm not entirely sure I understand your question, as MUSiCC itself is a normalization method for metagenomic functional profiles. MUSiCC uses the relative abundance of universal single-copy genes to convert functional profiles from relative abundance to units of average copy number per genome. This means that the data are no longer compositional and thus better suited for analyses such as differential abundance testing.
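To make the idea concrete, here is a minimal sketch of this kind of normalization. It is not MUSiCC's actual code. It assumes the scaling factor is the median abundance of the universal single-copy genes in a sample, and the function name and gene IDs are hypothetical placeholders.

```python
import numpy as np

def normalize_by_single_copy_genes(abundance, gene_ids, uscg_ids):
    """Rescale one sample's gene abundances so that the universal
    single-copy genes (USCGs) have a median of 1, i.e. express every
    gene in units of average copies per genome.

    Assumption: the median USCG abundance is used as the scaling factor.
    """
    abundance = np.asarray(abundance, dtype=float)
    is_uscg = np.isin(gene_ids, list(uscg_ids))
    if not is_uscg.any():
        raise ValueError("no universal single-copy genes found in gene_ids")
    scale = np.median(abundance[is_uscg])
    if scale <= 0:
        raise ValueError("median USCG abundance must be positive")
    return abundance / scale

# Hypothetical gene IDs and relative abundances for a single sample.
gene_ids = ["uscg_1", "uscg_2", "uscg_3", "gene_a", "gene_b"]
relative_abundance = [0.10, 0.12, 0.08, 0.30, 0.40]
uscg_ids = {"uscg_1", "uscg_2", "uscg_3"}

copies_per_genome = normalize_by_single_copy_genes(
    relative_abundance, gene_ids, uscg_ids
)
print(dict(zip(gene_ids, copies_per_genome.round(2))))
```

In this toy sample the rescaled values no longer sum to 1, which illustrates why the output is not constrained like a relative abundance profile.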

I should also note that FishTaco, by default, incorporates MUSiCC as part of its data processing, so any data you analyze with FishTaco will be corrected via MUSiCC.

Hope that helps!

davidoctaviobotero commented 4 years ago

Got it, you are right! MUSiCC does a great job of normalizing functional profiles. Now, what about the OTU abundance file that is used in FishTaco? May I just calculate its relative abundance, or can you suggest another normalization method suitable for FishTaco?

Thank you again,

David Octavio Botero Rozo, PhD


engal commented 4 years ago

Oh, I see now. Sorry for my confusion, I did misunderstand your question. FishTaco does assume that you'll be providing relative abundance taxonomic data.
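For reference, converting an OTU count table to relative abundances could look like the sketch below. It assumes a table with taxa as rows and samples as columns; the function name and the counts are made up for illustration and are not part of FishTaco.

```python
import numpy as np

def to_relative_abundance(counts):
    """Divide each sample (column) by its total so that the taxa in
    every sample sum to 1.

    Assumption: rows are taxa and columns are samples.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=0, keepdims=True)
    if np.any(totals == 0):
        raise ValueError("every sample needs at least one non-zero count")
    return counts / totals

# Hypothetical OTU counts: 3 taxa (rows) x 2 samples (columns).
otu_counts = [
    [10, 0],
    [30, 50],
    [60, 50],
]
otu_relative = to_relative_abundance(otu_counts)
print(otu_relative)
```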