fbreitwieser / isobar

isobar - R library for the Analysis and quantitation of isobarically tagged MSMS proteomics data
https://bioconductor.org/packages/release/bioc/html/isobar.html

Multi-sample P(X>=Y) calculation #11

Closed by alyst 9 years ago

alyst commented 9 years ago

Let X_1, X_2, ..., X_n be random variables with arbitrary distributions (distr objects) whose modes are close to zero. Let Y_1, Y_2, ..., Y_m be normally distributed with known means and variances. This PR implements the calculation of P(X' >= Y'), where the random variable X' has the PDF pdf[X'](t) ~ pdf[X_1](t)*pdf[X_2](t)*...*pdf[X_n](t) (i.e. the joint distribution of the X_i restricted to the diagonal) and, similarly, the PDF of Y' is proportional to pdf[Y_1](t)*pdf[Y_2](t)*...*pdf[Y_m](t).

It's implemented in calcCumulativeProbXGreaterThanY() and uses an approach similar to calcProbXGreaterThanY(). In particular, we use the fact that restricting the joint distribution of normal distributions to the diagonal again yields a normal distribution, with explicit formulas for its mean and variance, so the CDF of Y' is known explicitly.
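
For reference, the standard result behind this can be written out as follows. This is a sketch, not taken from the PR code: the symbols \mu_j, \sigma_j (mean and standard deviation of Y_j), \mu', \sigma' (those of Y') and \Phi (the standard normal CDF) are notation introduced here, and the integral additionally assumes that X' and Y' are independent.

```latex
\prod_{j=1}^{m} \mathcal{N}(t;\, \mu_j, \sigma_j^2) \;\propto\; \mathcal{N}(t;\, \mu', \sigma'^2),
\qquad
\frac{1}{\sigma'^2} = \sum_{j=1}^{m} \frac{1}{\sigma_j^2},
\qquad
\mu' = \sigma'^2 \sum_{j=1}^{m} \frac{\mu_j}{\sigma_j^2}

P(X' \ge Y') = \int_{-\infty}^{\infty} \mathrm{pdf}[X'](t)\;
               \Phi\!\left(\frac{t - \mu'}{\sigma'}\right) dt
```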

This routine could be used to test the significance of ratios when several replicate experiments are available. Let the Xs be the background distributions of ratios and the Ys be the estimates of the ratio in each replicate experiment. In this case Y' is the distribution of the "true" ratio (assumed to be the same in all biological replicates), and X' is the cumulative background distribution of ratios (how the "true" background noise ratio is distributed).
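
To make the replicate use case concrete, here is a minimal Python sketch of the calculation. It is an illustration only, not the R code in this PR: the function names (`combine_normals`, `prob_x_ge_y`), the integration grid, and the example numbers are all hypothetical, the background densities are passed as plain callables instead of distr objects, and X' and Y' are assumed independent.

```python
import math


def normal_pdf(t, mean, sd):
    """Density of N(mean, sd^2) at t."""
    z = (t - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(t, mean, sd):
    """CDF of N(mean, sd^2) at t."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))


def combine_normals(means, sds):
    """Mean and sd of the normal proportional to the product of normal PDFs."""
    precisions = [1.0 / (s * s) for s in sds]
    total = sum(precisions)
    mean = sum(p * m for p, m in zip(precisions, means)) / total
    return mean, math.sqrt(1.0 / total)


def prob_x_ge_y(x_pdfs, y_means, y_sds, lo, hi, n=20001):
    """Approximate P(X' >= Y') with the trapezoid rule on [lo, hi].

    x_pdfs: callables t -> density, one per background distribution X_i.
    The product of the densities is normalised numerically on the grid,
    so [lo, hi] should cover the region where that product is non-negligible.
    """
    y_mean, y_sd = combine_normals(y_means, y_sds)
    step = (hi - lo) / (n - 1)
    norm = 0.0  # running integral of the unnormalised pdf of X'
    acc = 0.0   # running integral of pdf[X'](t) * P(Y' <= t)
    for k in range(n):
        t = lo + k * step
        w = 1.0
        for pdf in x_pdfs:
            w *= pdf(t)
        if k == 0 or k == n - 1:
            w *= 0.5  # trapezoid end-point weights
        norm += w
        acc += w * normal_cdf(t, y_mean, y_sd)
    return acc / norm  # the grid step cancels in the ratio


if __name__ == "__main__":
    # Hypothetical example: two background distributions centred at zero
    # (normal here only for simplicity) and two replicate ratio estimates.
    background = [
        lambda t: normal_pdf(t, 0.0, 0.3),
        lambda t: normal_pdf(t, 0.0, 0.4),
    ]
    p = prob_x_ge_y(background, [1.0, 1.2], [0.2, 0.3], lo=-5.0, hi=5.0)
    print("P(X' >= Y') =", p)
```

A small value of P(X' >= Y') would indicate that the combined replicate estimate lies well above the combined background. The grid-based integration is chosen here for readability; it is not meant to mirror how calcCumulativeProbXGreaterThanY() integrates.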