biocore / songbird

Vanilla regression methods for microbiome differential abundance analysis
BSD 3-Clause "New" or "Revised" License

Songbird usage approach #143

Closed by mTangherlini 3 years ago

mTangherlini commented 3 years ago

Dear all, First of all, I am really fond of the approach developed by Songbird for differential abundance analysis: it would be perfect for my needs. I have a couple of questions about its correct usage, though:

  1. I have a bunch of samples which I need to compare for my research, but there's no real "reference" group of samples. How should I apply Songbird in this case? Should I run it multiple times setting each time a different group as reference?
  2. Are the differential rankings tested for FDR? One approach I found useful for re-testing the significance of the differences is the following: select a percentage of the differentially abundant features using Qurro, recover the differential rankings associated with these features, calculate the CLR between numerator and denominator features for each category in my model, and test for significance with an ANOVA/PERMANOVA test elsewhere. Would that be correct?

Best regards

mortonjt commented 3 years ago

Hi @mTangherlini, these are all good questions.

  1. You mean reference as in there is no control group? In that case, it doesn't really matter which group you choose as a reference -- the only thing is that the choice of reference will influence the interpretation of the coefficients.
  2. This one is tricky. One of the takeaways from the reference frames paper is that there is no well-defined null hypothesis at the per-microbe level. Without a reasonable null hypothesis, it doesn't make sense to talk about p-values. I typically recommend using differential abundance analysis only if there is a significant difference in beta diversity (which has a better-defined null hypothesis). If beta diversity is significant, then you know that at least one microbe is differential across your groups, in which case the rankings visualized in Qurro can help you identify the useful microbes. So yes, the procedure you described is mostly fine as long as you have a strong separation in your beta diversity analysis, but you would need to compute the CLR of the numerator and denominator and take the log ratio before computing statistical tests such as t-tests / ANOVA.
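To make the second point concrete, here is a minimal sketch of the "log ratio per sample, then a group test" step. It is not Songbird or Qurro code. The taxon names, counts, and group labels are made up, the log ratio is taken as the log of summed numerator counts over summed denominator counts (swapping the sums for geometric means would give a CLR-style balance instead), and a permutation test on the one-way ANOVA F statistic stands in for a parametric ANOVA p-value so that the example needs only the standard library.

```python
import math
import random

# Hypothetical toy data: (sample id, group label, feature counts).
samples = [
    ("s1", "g1", {"taxA": 120, "taxB": 30, "taxC": 10, "taxD": 40}),
    ("s2", "g1", {"taxA": 90, "taxB": 50, "taxC": 20, "taxD": 30}),
    ("s3", "g1", {"taxA": 100, "taxB": 20, "taxC": 15, "taxD": 25}),
    ("s4", "g2", {"taxA": 60, "taxB": 40, "taxC": 50, "taxD": 50}),
    ("s5", "g2", {"taxA": 70, "taxB": 20, "taxC": 40, "taxD": 60}),
    ("s6", "g2", {"taxA": 50, "taxB": 60, "taxC": 45, "taxD": 50}),
    ("s7", "g3", {"taxA": 20, "taxB": 10, "taxC": 80, "taxD": 70}),
    ("s8", "g3", {"taxA": 15, "taxB": 25, "taxC": 90, "taxD": 60}),
    ("s9", "g3", {"taxA": 30, "taxB": 10, "taxC": 70, "taxD": 50}),
]

# Hypothetical feature sets, e.g. picked from the top and bottom of the rankings.
numerator = ["taxA", "taxB"]
denominator = ["taxC", "taxD"]


def log_ratio(sample_counts, num, den):
    """Return ln(sum of numerator counts / sum of denominator counts).

    Returns None when either sum is zero, because the log ratio is undefined.
    """
    top = sum(sample_counts[f] for f in num)
    bottom = sum(sample_counts[f] for f in den)
    if top == 0 or bottom == 0:
        return None
    return math.log(top / bottom)


def f_statistic(values, labels):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    by_group = {}
    for value, label in zip(values, labels):
        by_group.setdefault(label, []).append(value)
    grand_mean = sum(values) / len(values)
    ss_between = 0.0
    ss_within = 0.0
    for group_values in by_group.values():
        group_mean = sum(group_values) / len(group_values)
        ss_between += len(group_values) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in group_values)
    df_between = len(by_group) - 1
    df_within = len(values) - len(by_group)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p_value(values, labels, n_perm=999, seed=0):
    """Share of label shufflings whose F statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = f_statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Compute one log ratio per sample, dropping samples where it is undefined.
values, labels = [], []
for sample_id, group, sample_counts in samples:
    lr = log_ratio(sample_counts, numerator, denominator)
    if lr is not None:
        values.append(lr)
        labels.append(group)

print("F statistic:", f_statistic(values, labels))
print("permutation p-value:", permutation_p_value(values, labels))
```

Because the numerator and denominator are chosen from the same data that is then tested, a p-value obtained this way is best read as descriptive rather than as a confirmatory result.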

These sorts of questions are best posted on the forums, since this issue tracker is reserved for bugs. CC me there and I will be happy to follow up.