Not entirely clear what the recommended use is

teyden commented 5 years ago

Hi,

Thanks for creating this package. I'm currently seeking to apply it to my microbiome project. I'm a bit confused about the recommended use as there are functions within the separate R scripts as well as within the mbzinb package that seem to do the same/similar things (run the differential analysis test).

The R scripts I'm referring to are:

And the package:

https://github.com/jchen1981/MicrobiomeDDA/blob/master/mbzinb_0.2.tar.gz

I've installed the mbzinb package to run the differential abundance analysis for two group comparison using mbzinb.dataset() to set up the data, then mbzinb.test() to run the actual test. I'm interested in GMPR normalization and noticed that the ZISeq() method can also run a statistical test while GMPR normalizing the data beforehand. This function is available in an R script outside of the mbzinb package (https://github.com/jchen1981/MicrobiomeDDA/blob/master/zeroinfl.plus.daa.R).

So in summary, which workflow should I use? What are the differences between ZISeq() and mbzinb.test()?

Thanks.

jchen1981 commented 5 years ago

Hi Teyden,

The "mbzinb" package implements the omnibus test (i.e., a joint test of abundance, prevalence and dispersion) for two-sample comparisons, as in your case. It also uses GMPR normalization. However, it cannot adjust for covariates such as sex and age. Sometimes these covariates confound the two-sample comparison (e.g., cases are much older than controls), and in such cases you need to adjust for them. ZISeq offers the flexibility to adjust for covariates. Moreover, it allows you to select tests other than the omnibus test. For example, if the dispersion is not of interest, you can choose to test prevalence and abundance only, leaving out the dispersion parameter. This can be done by specifying "method = 'prev.abund1'".
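
For readers unfamiliar with GMPR (geometric mean of pairwise ratios), here is a minimal sketch of the idea. It is written in plain Python so it does not depend on the package's R API, and it is not the package's code. The function name `gmpr_size_factors` and the list-of-lists input layout are assumptions made for illustration, and the released implementation may differ in details such as a minimum number of shared taxa per sample pair.

```python
import math
from statistics import median


def gmpr_size_factors(counts):
    """Return one size factor per sample.

    counts: list of samples, each a list of taxon counts in the same taxon order.
    (Hypothetical layout chosen for this sketch.)
    """
    n = len(counts)
    factors = []
    for i in range(n):
        log_medians = []
        for j in range(n):
            if i == j:
                continue
            # Ratios are taken only over taxa that are non-zero in BOTH samples,
            # which is what makes the approach tolerant of zero inflation.
            ratios = [ci / cj
                      for ci, cj in zip(counts[i], counts[j])
                      if ci > 0 and cj > 0]
            if ratios:
                log_medians.append(math.log(median(ratios)))
        if log_medians:
            # Geometric mean of the pairwise median ratios.
            factors.append(math.exp(sum(log_medians) / len(log_medians)))
        else:
            # No taxa shared with any other sample: size factor undefined.
            factors.append(float("nan"))
    return factors


counts = [
    [10, 20, 0, 40],
    [20, 40, 8, 80],
    [5, 10, 2, 0],
]
size_factors = gmpr_size_factors(counts)
normalized = [[c / s for c in sample] for sample, s in zip(counts, size_factors)]
```

Dividing each sample's counts by its size factor puts the samples on a comparable scale before testing.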

Best, Jun

teyden commented 5 years ago

Hi Jun,

Thank you for the clarification. I had noticed when looking at the code that ZISeq() allows for covariate adjustment but mbzinb.test() does not. I have since tried adjusting for covariates and get fewer significant hits as a result, which is expected. However, the results have many NA values in specific columns such as "chi.stat" and "abund.LFC", even though p-values are reported for those taxa. An example screenshot is attached (I added the adjusted p-value column separately). Could you give me some insight into why this happens?

[image: Screen Shot 2019-03-29 at 11.12.17 AM.png]

Thanks, Teyden

jchen1981 commented 5 years ago

For some taxa that do not meet the distributional assumptions of our model, the algorithm may fail to converge. In such cases, the algorithm outputs NA. We provide an option to assess the significance of these problematic taxa through a permutation test based on a linear model, so you can still get a p-value but not the coefficient estimates.
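
To illustrate why such a fallback yields a p-value but no model coefficients, here is a minimal sketch of a permutation test built on a simple linear model. It is written in plain Python and is not the package's code. The function names, the use of squared correlation as the test statistic, and the default of 999 permutations are all assumptions made for this sketch, and the actual fallback may use a different statistic or transformation.

```python
import random


def r_squared(y, x):
    """R^2 of the simple linear regression of y on x (squared correlation)."""
    n = len(y)
    mean_y, mean_x = sum(y) / n, sum(x) / n
    sxy = sum((a - mean_y) * (b - mean_x) for a, b in zip(y, x))
    sxx = sum((b - mean_x) ** 2 for b in x)
    syy = sum((a - mean_y) ** 2 for a in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no association can be measured
    return sxy * sxy / (sxx * syy)


def permutation_pvalue(y, group, n_perm=999, seed=0):
    """Permutation p-value for association between y and a 0/1 group label.

    Only a p-value is returned: shuffling labels says how unusual the observed
    association is, but it estimates none of the count-model parameters.
    """
    rng = random.Random(seed)
    observed = r_squared(y, group)
    shuffled = list(group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if r_squared(y, shuffled) >= observed - 1e-12:
            hits += 1
    # The +1 counts the observed labelling itself, so p is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical transformed, normalized abundances for one taxon in 8 samples.
abundance = [0.1, 0.2, 0.15, 0.3, 2.1, 2.4, 1.9, 2.2]
group = [0, 0, 0, 0, 1, 1, 1, 1]
p_value = permutation_pvalue(abundance, group)
```

This is consistent with columns such as "chi.stat" and "abund.LFC" being NA while a p-value is still reported: those columns come from the fitted model, which the permutation route never produces.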

Jun
