Open paroussisc opened 5 years ago
Cumulative distribution functions
Continuous random variables plugged into their own CDF, F(X), have a uniform distribution on [0, 1], which is useful to remember for generating rvs - generate a uniform rv and apply the inverse of the CDF to it.
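A minimal sketch of this inverse-CDF trick, using the Exponential distribution (my choice, since its inverse CDF has a closed form):

```python
import numpy as np

rng = np.random.default_rng(42)

# If U ~ Uniform(0, 1) and F is a continuous CDF, then F^{-1}(U) ~ F.
# For Exponential(rate), F(x) = 1 - exp(-rate * x),
# so F^{-1}(u) = -log(1 - u) / rate.
rate = 2.0
u = rng.uniform(size=100_000)
x = -np.log(1 - u) / rate

# The sample mean should be close to the Exponential mean, 1 / rate = 0.5.
print(x.mean())
```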
Linear transformations of normal random vectors
This seems obvious given the identities in #3, but the point here is that multivariate normality is retained after transformation: if X ~ N(mu, Sigma) and A is a matrix of finite real constants, then AX ~ N(A mu, A Sigma A'). A special case is that if a is a vector of finite real constants, then a'X ~ N(a'mu, a'Sigma a), and when a is a vector of all zeros except one element equal to 1, we are back to the univariate case, so:
If X has a multivariate normal distribution, then the marginal distribution of any X_j is univariate normal (not the case for a multivariate t-distribution, for example). In fact, the marginal distribution of any subvector of X is multivariate normal.
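A quick simulation check of the a'X result, with mu, Sigma and a chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# If X ~ N(mu, Sigma), then a'X ~ N(a'mu, a'Sigma a).
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
a = np.array([1.0, 2.0, -1.0])

X = rng.multivariate_normal(mu, Sigma, size=200_000)
y = X @ a

print(y.mean(), a @ mu)        # theoretical mean a'mu = -3.5
print(y.var(), a @ Sigma @ a)  # theoretical variance a'Sigma a = 8.1
```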
Transformation of random variables
If Y = g(X) for a monotone function g, then:
CDF: F_Y(y) = F_X(g^{-1}(y)) (for increasing g; 1 - F_X(g^{-1}(y)) for decreasing g)
PDF: f_Y(y) = f_X(g^{-1}(y)) |d/dy g^{-1}(y)|
The book, as an example, uses the definition of a multivariate normal to obtain the pdf using the above formula; we've created an example of this in #4.
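A minimal univariate sketch of the change-of-variables formula (assuming X ~ N(0, 1) and Y = exp(X), which is my example rather than the book's multivariate one):

```python
import numpy as np

rng = np.random.default_rng(1)

def phi(x):
    # Standard normal pdf.
    return np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)

def f_Y(y):
    # pdf of Y = exp(X), X ~ N(0, 1), via the change-of-variables formula:
    # g^{-1}(y) = log(y), so f_Y(y) = phi(log y) * (1 / y).
    return phi(np.log(y)) / y

x = rng.standard_normal(500_000)
y = np.exp(x)

# Empirical density of Y in a small window around y0 vs the formula.
y0, h = 1.0, 0.05
empirical = np.mean((y > y0 - h) & (y < y0 + h)) / (2 * h)
print(empirical, f_Y(y0))  # should be close
```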
The useful elements of the "Moment generating functions" section are the three properties that are listed - uniqueness (an MGF, where it exists, determines the distribution), M_{aX+b}(t) = e^{bt} M_X(at), and M_{X+Y}(t) = M_X(t) M_Y(t) for independent X and Y - and these identities are useful when proving the Central Limit Theorem.
See #5 for an example of the CLT in action.
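A small simulation in that spirit (my own Exponential(1) example, separate from #5):

```python
import numpy as np

rng = np.random.default_rng(2)

# CLT: standardised means of iid Exponential(1) draws (mean 1, sd 1)
# should be approximately N(0, 1) for large n.
n, reps = 100, 50_000
samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - 1.0) * np.sqrt(n)

# Compare P(Z <= 1) with the standard normal value Phi(1) ~ 0.8413.
print(np.mean(z <= 1.0))
```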
Chebyshev's inequality
While this is used in the proof of the (weak) law of large numbers, it has some other uses too, mainly when we cannot make any distributional assumptions about the data. It states that at least 75% of values must lie within two standard deviations of the mean, and at least 89% within three standard deviations.
Generally speaking, substituting X - mu into the general form gives P(|X - mu| >= k*sigma) <= 1/k^2, so you can get bounds on probabilities given the variance, or on the variance given probabilities.
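Checking the bound empirically, using Exponential(1) (mean and standard deviation both 1) as an arbitrary test distribution:

```python
import numpy as np

rng = np.random.default_rng(3)

# Chebyshev: P(|X - mu| >= k * sigma) <= 1 / k^2 for any distribution
# with finite variance. Exponential(1) has mu = sigma = 1.
x = rng.exponential(size=1_000_000)
for k in (2, 3):
    tail = np.mean(np.abs(x - 1.0) >= k)
    print(k, tail, 1 / k ** 2)  # empirical tail probability vs the bound
```

The bound is loose here (the true k = 2 tail is exp(-3), about 0.05, versus the bound of 0.25), which is the price of making no distributional assumptions.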
Jensen's inequality
The main use I've seen for Jensen's inequality is in deriving certain identities later in the book for the log-likelihood: for a concave function f such as log, E[f(X)] <= f(E[X]). It is used in many fields of mathematics, but another statistical application is proving that KL divergence is always non-negative (https://math.stackexchange.com/questions/2031062/proof-of-nonnegativity-of-kl-divergence-using-jensens-inequality). For convex functions, the inequality flips.
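Both facts can be checked numerically; the distributions below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Jensen for the concave log: E[log X] <= log E[X].
x = rng.exponential(size=100_000)
print(np.mean(np.log(x)), np.log(np.mean(x)))  # first <= second

# Consequence: KL(p || q) >= 0, since by Jensen
#   -KL = sum_i p_i * log(q_i / p_i) <= log(sum_i p_i * (q_i / p_i)) = log 1 = 0.
p = rng.dirichlet(np.ones(5))
q = rng.dirichlet(np.ones(5))
kl = np.sum(p * np.log(p / q))
print(kl)  # non-negative
```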
A sufficient statistic for a parameter contains all the information in the sample about that parameter, e.g. for a normal sample with known variance, the sample mean is sufficient for the true mean - no need to keep track of the individual elements of the sample.
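A small sketch of this, assuming a normal model with known variance (my choice of example): two samples with the same size and sample mean produce log-likelihood curves for the mean that differ only by a constant free of the parameter.

```python
import numpy as np

# Assuming x_1..x_n ~ N(theta, 1), the log-likelihood
#   l(theta) = const(x) - 0.5 * sum((x_i - theta)^2)
# depends on theta only through n and xbar, so xbar is sufficient for theta.

def loglik(theta, x):
    # Log-likelihood up to an additive constant that is free of theta.
    return -0.5 * np.sum((x - theta) ** 2)

x1 = np.array([0.0, 1.0, 2.0, 3.0])  # n = 4, mean 1.5
x2 = np.array([1.5, 1.5, 0.5, 2.5])  # same n, same mean, different values

thetas = np.linspace(-2.0, 4.0, 13)
diffs = [loglik(t, x1) - loglik(t, x2) for t in thetas]
print(diffs)  # the same constant at every theta
```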
Just an issue to keep track of important results and definitions, and my thoughts on these.