adw96 / DivNet

diversity estimation under ecological networks

Verifying DivNet getting started vignette #50

Closed laracp closed 4 years ago

laracp commented 4 years ago

I am currently working through the DivNet getting started vignette and seem to be getting different p-values for hypothesis testing than those in the tutorial.

estimates <- divnet_phylum_char$shannon %>% summary %$% estimate
ses <- sqrt(divnet_phylum_char$`shannon-variance`)
X <- breakaway::make_design_matrix(lee_phylum, "char")
betta(estimates, ses, X)$table

Output:

                     Estimates Standard Errors p-values
(Intercept)             1.34699686      0.02031768    0.000
predictorsbiofilm      -0.91477148      0.08798129    0.000
predictorscarbonate    -0.06179152      0.11506678    0.591
predictorsglassy       -0.21181413      0.04741169    0.000
predictorswater         0.11174642      0.10319277    0.279

Based on the vignette, there should be no significant difference between altered and glassy (p-value of 0.53), but I am getting a p-value of 0. I'm not sure where I may be going wrong.

Many thanks!

Lara

adw96 commented 4 years ago

Hi Lara,

Thanks for getting in touch about this! This brings up a number of things that hadn't occurred to me before.

  1. DivNet's model fitting procedure has a Metropolis Hastings step, which estimates a complex integral via Monte Carlo and is therefore inherently random. I should be setting a seed so that everyone who runs the vignette gets the same results. I just opened issue #51 so I can fix this.
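     In the meantime, a minimal workaround is to set the RNG seed yourself before fitting. This is a sketch, not the vignette's exact code: the `divnet()` call and its arguments here are illustrative, and the seed value is arbitrary.

     ```r
     library(DivNet)

     # Setting a seed before the fit makes the Monte Carlo
     # (Metropolis-Hastings) step reproducible across runs.
     set.seed(2020)  # any fixed integer works

     # Illustrative refit of the vignette's model; lee_phylum and
     # the "char" covariate are the objects used in the tutorial.
     divnet_phylum_char <- divnet(lee_phylum, X = "char")
     ```

     Re-running with the same seed should then give identical estimates and p-values each time.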

  2. When I just ran lines 153-156 of the vignette I get

                      Estimates Standard Errors p-values
    (Intercept)          1.34384455      0.02162775    0.000
    predictorsbiofilm   -0.90682312      0.10446944    0.000
    predictorscarbonate -0.05982077      0.20578705    0.771
    predictorsglassy    -0.21667079      0.04653692    0.000
    predictorswater      0.13325182      0.12691325    0.294

    which is a little different from yours, but again inconsistent with the p-value of 0.53 stated in the text. I suspect that I accidentally read the carbonate line rather than the glassy line, so there is an error in the text. I think you are getting the correct p-value (which will vary slightly due to Monte Carlo error), but I misread the results when writing the vignette. I'm going to fix this error in the text and then close this issue.

Many thanks for bringing this to my attention!

Amy

laracp commented 4 years ago

Thanks Amy! This is great.

Based on your above comment, the results from the hypothesis testing make sense with the diversity estimates we generated earlier in the tutorial and also demonstrate how important it is to include variance estimates in our models.

This is one of many reasons why I am really loving the DivNet, breakaway, and corncob packages for analyzing my data. They are wonderful to use and the tutorials are very well done, so many thanks for these.

gregpoore commented 4 years ago

@adw96 Is there any way to get the actual p-values rather than the rounded ones? For example, in the results you show, the p-value for predictorsbiofilm is "0.000", which would not be reportable in a paper other than to say it was significant.

Thanks in advance for the help!

adw96 commented 4 years ago

Hi Greg!

I would report that as "p < 0.001" in a paper, but if you want more precision you can compute the two-sided p-value from the Wald statistic directly:

2 * (1 - pnorm(abs(estimate / standard_error)))

e.g. 2 * (1 - pnorm(abs(-0.21667079/0.04653692))) to get the exact p-value for the glassy predictor.
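The same calculation can be applied to every row of the table at once. A small sketch, assuming `bt` is the object returned by `betta()` and that its table uses the column names shown in the output above:

```r
# Wald-type two-sided p-values from a betta fit.
# Assumes bt <- betta(estimates, ses, X) as in the vignette.
z <- bt$table[, "Estimates"] / bt$table[, "Standard Errors"]
exact_p <- 2 * (1 - pnorm(abs(z)))

# Round to a few significant digits for journals
# that request exact p-values rather than ranges.
signif(exact_p, 3)
```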

I'll respond to your other question ASAP (probably tomorrow)!

Amy

gregpoore commented 4 years ago

Hi Amy,

Great! Thank you so much for the "p<0.001" idea and the code that gives the exact value. Some journals have requested exact p-values rather than ranges (even if low), so this is helpful. Thank you in advance for answering that other question too!

Sincerely, Greg