rwdavies / STITCH

STITCH - Sequencing To Imputation Through Constructing Haplotypes
http://www.nature.com/ng/journal/v48/n8/abs/ng.3594.html
GNU General Public License v3.0

A question about info score #75

Open Antennaria opened 1 year ago

Antennaria commented 1 year ago

Hi! I'm now running STITCH on plants, and I wasn't able to get a good distribution of info scores -- there are a lot of SNPs with info scores between 0.2 and 1. I see that you set 0.4 as the info score threshold for the allele frequency plots, so I used 0.4 as a threshold as well, but I'm not sure how to interpret this score. Can you please share some ideas about what the info score reflects and how to choose a reasonable threshold?

rwdavies commented 1 year ago

Hi,

Sorry for my slow reply, I've been involved with undergrad interviews here at Oxford the last few days, which has been all-encompassing.

Feel free to let me know a bit more about your project, and the parameters you used to do the imputation, so that I can comment if some changes might be beneficial and potentially increase the average INFO score.

The INFO score used here is a standard one used in imputation, which you can read about, for instance, here: https://www.well.ox.ac.uk/~gav/snptest/#info_measures Informally, it is closer to 1 if the imputation process is confident, and closer to 0 if it is less confident. In slightly more detail, confidence comes from the distribution of genotype posteriors. If the genotype posteriors are fully confident, i.e. always 0 or 1, then the INFO score should be close to 1. If the genotype posteriors are not confident, i.e. close to 1/3, the INFO score should be close to 0.
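To make the "confidence from genotype posteriors" idea concrete, here is a small sketch of the IMPUTE-style info measure described at the linked page, for a single diploid variant. The function name and array layout are my own choices for illustration, not part of STITCH's API:

```python
import numpy as np

def info_score(gp):
    """IMPUTE-style info measure for one variant.

    gp: (N, 3) array of genotype posteriors P(g=0), P(g=1), P(g=2)
        for N diploid samples; each row should sum to 1.
    """
    gp = np.asarray(gp, dtype=float)
    n = gp.shape[0]
    e = gp[:, 1] + 2.0 * gp[:, 2]   # expected allele dosage per sample
    f = gp[:, 1] + 4.0 * gp[:, 2]   # expected squared dosage per sample
    theta = e.sum() / (2.0 * n)     # estimated alternate allele frequency
    if theta <= 0.0 or theta >= 1.0:
        return 1.0                  # monomorphic: defined as 1 by convention
    # Ratio of posterior dosage variance to the variance expected under
    # Hardy-Weinberg at frequency theta; confident posteriors give ~1.
    return 1.0 - (f - e ** 2).sum() / (2.0 * n * theta * (1.0 - theta))
```

With fully confident posteriors (every row is 0/1) the per-sample variance term `f - e**2` vanishes and the score is 1; with maximally uncertain posteriors (all rows near 1/3) the score drops toward or below 0.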

Now, generally, STITCH is very well calibrated, so the INFO score at a variant should be monotonically related to the expected imputation accuracy. Ideally you'd have some truth data set that would allow you to compare how an INFO score threshold correlates with accuracy. In the past, I and others have found 0.4 to be a reasonable threshold, which is why I suggest it. If you have some other way to measure accuracy, or your own truth data, you might find a different threshold more reasonable.
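If you do have truth genotypes for some samples, the comparison above can be sketched as: compute per-variant concordance against truth, then summarize it within INFO score bins to see where accuracy falls off. The function name, array shapes, and bin edges below are illustrative assumptions, not STITCH functionality:

```python
import numpy as np

def concordance_by_info_bin(info, truth, called,
                            bins=(0.0, 0.2, 0.4, 0.6, 0.8, 1.01)):
    """Summarize imputation accuracy as a function of INFO score.

    info:   (M,) per-variant INFO scores
    truth:  (M, N) true genotypes (0/1/2) from a truth data set
    called: (M, N) imputed best-guess genotypes (0/1/2)
    Returns a list of (bin_low, bin_high, n_variants, mean_concordance).
    """
    info = np.asarray(info, dtype=float)
    # Fraction of samples whose called genotype matches truth, per variant.
    match = (np.asarray(truth) == np.asarray(called)).mean(axis=1)
    out = []
    for lo, hi in zip(bins[:-1], bins[1:]):
        sel = (info >= lo) & (info < hi)
        if sel.any():
            out.append((lo, hi, int(sel.sum()), float(match[sel].mean())))
    return out
```

A threshold is then just the lowest bin edge above which concordance stays acceptable for your application.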

Best, Robbie