Investigate the usefulness of GLM standard errors

See this StackOverflow question, in particular the answer by atiretoo, which computes the SE in R. A more detailed description of these standard errors can be found here. Both assume a binomial GLM with no constraints, which is not the case for our model.
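
For reference, here is a sketch of the standard result those links presumably describe. It assumes an unconstrained GLM fitted by maximum likelihood with the dispersion fixed at 1, as for the binomial. The symbols $\mu_i$, $\eta_i$, $n_i$ and $p_i$ are notation introduced here, not taken from the links.

$$
\widehat{\mathrm{Var}}(\hat\beta) = (X^TWX)^{-1},
\qquad
\mathrm{SE}(\hat\beta_j) = \sqrt{\left[(X^TWX)^{-1}\right]_{jj}}
$$

where $W = \mathrm{diag}(w_i)$ and, for a response $Y_i$ with mean $\mu_i$ and linear predictor $\eta_i$,

$$
w_i = \frac{\left(\partial\mu_i / \partial\eta_i\right)^2}{\mathrm{Var}(Y_i)}
$$

evaluated at the fitted values. For observed proportions $y_i / n_i$ with $y_i \sim \mathrm{Binomial}(n_i, p_i)$ and a logit link, this reduces to $w_i = n_i \hat{p}_i (1 - \hat{p}_i)$. The logit link is an assumption for illustration. With a different link the weights change, and with constraints on the parameters the result need not hold at all.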

Are the standard errors similar to the estimates we get from bootstrapping?
- If so, this reduces the need for bootstrapping.
Is the sampling distribution actually normal on the link scale?
- I suspect not, since the parameters are constrained to sum to less than one on the response scale.

Steps to investigate:

1. Simulate many datasets, refit the model to each, and estimate the sampling distribution empirically from the fitted parameters.
   - Do this for a couple of different variants: some with many shared mutations (highly multicollinear) and some with only a few.
2. Calculate the SE from $(X^TWX)^{-1}$ (square roots of the diagonal) for a single model and compare it to the empirical SE from the simulation.
3. Calculate the SE from bootstrapping for a single model and compare it to the empirical SE from the simulation.
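
The steps above could be prototyped roughly as follows. This is a minimal sketch, not the project's method. It assumes an unconstrained logit-link binomial GLM fitted by IRLS and a case (row-resampling) bootstrap. The helpers `make_design`, `simulate`, `fit_binomial_glm` and `compare_se` are hypothetical. The correlation parameter `rho` is only a stand-in for how many mutations the variants share.

```python
import numpy as np


def make_design(n_obs, rho, rng):
    """Intercept plus two covariates with correlation rho.

    Hypothetical stand-in for variants sharing many (rho near 1) or few mutations.
    """
    z1, z2 = rng.normal(size=(2, n_obs))
    x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2
    return np.column_stack([np.ones(n_obs), z1, x2])


def simulate(X, beta, n, rng):
    """Draw binomial counts from a logit-link model (assumed, unconstrained)."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return rng.binomial(n, p)


def fit_binomial_glm(X, y, n, n_iter=50, tol=1e-10):
    """Fit by IRLS; return coefficients and sqrt(diag((X^T W X)^{-1}))."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        w = n * p * (1.0 - p)              # diagonal of W
        z = eta + (y - n * p) / w          # working response
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Recompute the weights at the final estimate for the covariance matrix.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = n * p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (w[:, None] * X))
    return beta, np.sqrt(np.diag(cov))


def compare_se(rho, rng, n_obs=200, size=50, n_sims=300, n_boot=300):
    """Return empirical, model-based and bootstrap SEs for one design."""
    X = make_design(n_obs, rho, rng)
    n = np.full(n_obs, size)
    beta_true = np.array([-1.0, 0.5, 0.5])   # arbitrary illustrative values

    # Step 1: empirical sampling distribution from repeated simulation.
    sims = np.array([
        fit_binomial_glm(X, simulate(X, beta_true, n, rng), n)[0]
        for _ in range(n_sims)
    ])
    se_empirical = sims.std(axis=0, ddof=1)

    # Step 2: model-based SE from a single fitted model.
    y = simulate(X, beta_true, n, rng)
    _, se_model = fit_binomial_glm(X, y, n)

    # Step 3: bootstrap SE for the same single dataset (rows resampled).
    boot = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n_obs, size=n_obs)
        boot[b] = fit_binomial_glm(X[idx], y[idx], n[idx])[0]
    se_boot = boot.std(axis=0, ddof=1)

    return {"empirical": se_empirical, "model": se_model, "bootstrap": se_boot}


rng = np.random.default_rng(16)
results = {rho: compare_se(rho, rng) for rho in (0.0, 0.9)}
for rho, res in results.items():
    print(f"rho = {rho}")
    for name, se in res.items():
        print(f"  {name:>9}: {np.round(se, 4)}")
```

Because this sketch is unconstrained, it says nothing about behaviour near the sum-to-less-than-one boundary. Answering the questions above would need the project's own constrained fitting routine in place of `fit_binomial_glm`, plus a look at the shape of `sims` (for example, a histogram or Q-Q plot) rather than only its standard deviation.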

This analysis could be a vignette to demonstrate just how important it is to properly specify the variants.
