statdivlab / corncob

Count Regression for Correlated Observations with the Beta-binomial

Many ASVs with a high absolute mu but insignificant #107

Closed roey-angel closed 3 years ago

roey-angel commented 3 years ago

In a recent analysis with corncob, I'm getting many ASVs modelled with a very low or high mu but with an insignificant P-value, while my significant ASVs have a relatively low mu. I run corncob as follows:

    differentialTest(formula = ~ get(var2test),
                     phi.formula = ~ get(var2test),
                     formula_null = ~ 1,
                     phi.formula_null = ~ get(var2test),
                     test = "Wald", boot = FALSE,
                     data = ps_obj,
                     fdr_cutoff = 0.05,
                     full_output = TRUE)

And I end up with something like this (see attached plot):

(I double-checked that this isn't some bug in the plotting code)

I'm sorry I don't have a minimal example to reproduce but have you encountered something like that or can think of where it comes from?

bryandmartin commented 3 years ago

Hi Roey,

Lots of possibilities here. My first question: is there a reason you expect more highly abundant taxa to be significant? Differential abundance is testing for a meaningful difference, whether that difference is between large absolute values or small ones. In fact, corncob is specifically designed to work with both highly-abundant and not highly-abundant taxa.

All this being said, if there's a specific reason you expect the highly-abundant taxa to be more differentially abundant in your application area, then perhaps this is something we can unpack further. As a next step, I would recommend choosing a few of the differentially abundant taxa that don't make sense to you and plotting the bbdml outputs for those taxa. That will give you a clearer picture of what patterns corncob is recognizing in the data.
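As a hedged sketch of the suggestion above (the taxon name "ASV1" is a placeholder; ps_obj and var2test are the objects from the original differentialTest() call), fitting and plotting a single taxon might look like:

```r
library(corncob)

# Hypothetical example: "ASV1" stands in for one of the surprising
# taxa; ps_obj is the phyloseq object and var2test the covariate
# from the original call.
fit <- bbdml(formula = ASV1 ~ var2test,
             phi.formula = ~ var2test,
             data = ps_obj)

# Plot the observed relative abundances alongside the fitted model;
# colouring by the covariate makes the group-wise pattern visible.
plot(fit, color = "var2test")
```

Examining a handful of these single-taxon fits usually makes clear why the model did or did not flag a taxon.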

Bryan

roey-angel commented 3 years ago

Hi Bryan,

To clarify: I don't expect more ASVs to be significant for any particular reason; I was mostly concerned by the existence of this "cloud" of highly DA ASVs, all of which are non-significant. Digging into this a little and plotting some of these non-significant ASVs as you suggested, it appears that in all cases these are ASVs that appear in one treatment but are completely absent from all the samples of the other treatment. Would that be an expected behaviour of the model (i.e., giving a high mu to such cases)?

bryandmartin commented 3 years ago

Thanks for clarifying @roey-angel ! I was misinterpreting your question entirely, apologies for that.

Your follow-up from digging into it would have been my first question. If the taxa are absent in all of the samples in a treatment group, certain hypothesis testing procedures become unreliable. I'd recommend switching to test = "LRT" rather than the default "Wald" in this setting. With completely absent taxa, the MLE of the parameter you are estimating lies on the boundary of the parameter space, invalidating Wald testing. The LRT is still valid in this setting, though just because a taxon is missing doesn't mean the test will identify it as differentially abundant. Speaking very broadly, there has to be enough information in the data that the model is "sure enough" the absence reflects differential abundance in the population, rather than just undersampling or something like that. For example, if you had 2 counts in one sample in one group, and all 0's in the other, the model likely would not identify this as differentially abundant. (Also, there is hardly enough data there to even justify hypothesis testing, but that's beside the point.)
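Concretely, this amounts to changing only the test argument in the original call; a sketch, assuming the same ps_obj and var2test objects:

```r
library(corncob)

# Same call as in the original post, but with the likelihood-ratio
# test, which remains valid when the MLE lies on the boundary of
# the parameter space.
da_lrt <- differentialTest(formula = ~ get(var2test),
                           phi.formula = ~ get(var2test),
                           formula_null = ~ 1,
                           phi.formula_null = ~ get(var2test),
                           test = "LRT", boot = FALSE,
                           data = ps_obj,
                           fdr_cutoff = 0.05,
                           full_output = TRUE)
```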

Does this all make sense? Sorry for misunderstanding before, but given the prevalence of 0's, this result doesn't surprise me either.

roey-angel commented 3 years ago

Thanks for the clarification, @bryandmartin. I misread the line in your paper, "...that a taxon is unobserved in certain experimental conditions, the default behaviour for our software in this setting is to return a test statistic of zero for Wald-type tests to indicate that inference is uninformative", and assumed that the statistic itself (and not just the test) would become zero.

It's also clear that a major challenge of such methods is to handle taxa with zero abundance.

However, trying my code with the LRT instead, I get many fewer significant ASVs, and none of those I complained about became significant...

bryandmartin commented 3 years ago

I agree, handling zero abundance is a major challenge, and something we thought about quite a lot when developing corncob.

I'm not entirely sure what you mean by the test becoming zero, but right now the software does what I still think it should in this setting: acknowledge that there isn't enough data here to construct a mathematically valid hypothesis test, and conclude that there is not sufficient evidence to reject any null hypothesis.

I'm a bit confused about your last sentence but happy to follow up! Are you saying that you expected it to keep the ASVs that were already significant, as well as to add more that you expect to be significant? What are the ASVs you expect to be significant that are not? If it's ASVs with zero-counts in one group, I'm not totally surprised that it didn't reject the null.

It might be helpful to see a plot of the bbdml() output for one of your taxa of interest. Whenever the model results don't match my scientific expectations, individually examining the results usually helps clarify for me what the model is capturing.

Does this make sense?

Midnighter commented 3 years ago

The following method has the concept of "structural zeros" for this situation. I'd actually be very curious to hear what you think about ANCOM-BC in comparison to corncob but that's probably more for an email conversation.

Lin, Huang, and Shyamal Das Peddada. 2020. “Analysis of Compositions of Microbiomes with Bias Correction.” Nature Communications 11 (1): 3514. https://doi.org/10.1038/s41467-020-17041-7.

roey-angel commented 3 years ago

Apologies for keeping quiet but I was working on a new grant proposal.

> I'm not entirely sure what you mean by the test becoming zero, but right now the software does what I still think it should in this setting: acknowledge that there isn't enough data here to construct a mathematically valid hypothesis test, and conclude that there is not sufficient evidence to reject any null hypothesis.

> I'm a bit confused about your last sentence but happy to follow up! Are you saying that you expected it to keep the ASVs that were already significant, as well as to add more that you expect to be significant? What are the ASVs you expect to be significant that are not? If it's ASVs with zero-counts in one group, I'm not totally surprised that it didn't reject the null.

Is there a step in which these ASVs (that appear in only one treatment) are singled out prior to downstream calculations? If yes, I think it would be good to also set mu to 0 (or NA) to indicate that one cannot make a reliable estimate. Referring to my figure above, I think it makes sense to remove all points >10 and <-10; otherwise, it's unclear why ASVs with such a dramatic change in abundance should be insignificant.

Would you agree?

bryandmartin commented 3 years ago

Hi @roey-angel ,

Yes, those ASVs are being singled out. They will be stored in either or both of discriminant_taxa_DA and discriminant_taxa_DV. It should also print a warning for those taxa if you choose to keep them. It's a bit tough to say what is "best" in this instance. In some sense, we have a perfectly valid estimate. However, because there is so little data, the standard error of that estimate is very large. Thus, even though there is a dramatic change in abundance, it isn't a statistically significant change.

This all being said, if you want to remove those points entirely, that is also an option built into corncob with the filter_discriminant parameter of differentialTest().
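A minimal sketch of both options (inspecting the discriminant taxa and filtering them out), again assuming the ps_obj and var2test objects from the original call:

```r
library(corncob)

# Drop taxa that are perfectly separated by group before testing,
# via the filter_discriminant argument mentioned above.
da <- differentialTest(formula = ~ get(var2test),
                       phi.formula = ~ get(var2test),
                       formula_null = ~ 1,
                       phi.formula_null = ~ get(var2test),
                       test = "LRT", boot = FALSE,
                       data = ps_obj,
                       fdr_cutoff = 0.05,
                       filter_discriminant = TRUE)

# The singled-out taxa are recorded on the result object:
da$discriminant_taxa_DA  # discriminant with respect to abundance
da$discriminant_taxa_DV  # discriminant with respect to variability
```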

roey-angel commented 3 years ago

Thanks, that's very helpful!