mathiaskalxdorf / IceR

Quantitative proteomics workflow
https://mathiaskalxdorf.github.io/IceR/

Question about imputation #13

Closed quark-spec closed 2 years ago

quark-spec commented 3 years ago

Dear developers,

Not sure if this is the right place to ask, but I would like to learn more about the imputation algorithm IceR uses. I saw that some proteins are untouched after imputation (same as before), so I wonder whether there are criteria for selecting which proteins get imputed. In the past I have tried Perseus with MaxQuant data, and I think Perseus downshifts the intensity distribution to obtain a lower mean and a smaller SD, and then imputes the missing intensities from it. I think IceR's imputation method is more complex, and it would be great if you could explain it a bit more. Sorry if my question does not belong here. Thank you.
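For reference, the Perseus-style approach mentioned above can be sketched roughly as follows. This is only an illustration and not Perseus code: the function name `impute_downshifted` is made up, the input is assumed to be the log-transformed intensities of one sample with `None` for missing values, and the defaults (downshift of 1.8 SD, width of 0.3 SD) are assumed to match the commonly used Perseus settings.

```python
import random
import statistics


def impute_downshifted(log_intensities, shift=1.8, width=0.3, seed=0):
    """Fill None entries with draws from a narrowed, downshifted normal.

    Hypothetical helper: the distribution is centred at mean - shift * SD
    of the observed values and has a standard deviation of width * SD.
    """
    observed = [x for x in log_intensities if x is not None]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)  # needs at least two observed values
    rng = random.Random(seed)        # fixed seed keeps the sketch reproducible
    return [
        x if x is not None else rng.gauss(mu - shift * sd, width * sd)
        for x in log_intensities
    ]


sample = [20.0, 21.0, None, 22.0, 23.0, None]
imputed = impute_downshifted(sample)
```

Observed values pass through unchanged, and every missing value is filled, regardless of whether a real signal might exist in the raw data.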

mathiaskalxdorf commented 2 years ago

Hey,

The details about missing value imputation can be found in the publication (https://doi.org/10.1038/s41467-021-25077-6).

In brief, IceR estimates the background noise of every individual sample across the LC gradient using decoy features (features arbitrarily shifted in RT and m/z space). Ions that fall by chance into the quantification windows of these decoy features are modeled with generalized additive models (GAMs), which lets IceR predict the expected random noise signal for features whose intensity is completely missing. Some requirements have to be fulfilled before a missing value is imputed, e.g. no real peak may be detected in the expected quantification window. For the majority of values that are missing in the MaxQuant results, IceR does detect real peaks in the raw spectra and reports quantifications based on them. However, if certain criteria are not met (e.g. it is uncertain whether the right peak was selected), IceR excludes the quantification and does not impute in its place either, which explains why some values that are missing in MaxQuant are not imputed by IceR.
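The logic described above could be sketched like this. It is a simplified illustration and not IceR's actual code: IceR is an R package and fits GAMs, whereas this sketch swaps in a plain Gaussian kernel smoother over retention time only. The names `fit_noise_model`, `quantify`, `peak_intensity` and `peak_uncertain` are hypothetical, and the bandwidth is an arbitrary choice.

```python
import math


def fit_noise_model(decoy_rt, decoy_log_intensity, bandwidth=2.0):
    """Return a function predicting background noise at a given RT.

    Stand-in for a GAM: a kernel-weighted mean of the decoy intensities
    observed in one sample. Hypothetical helper, not IceR's API.
    """
    points = list(zip(decoy_rt, decoy_log_intensity))

    def predict(rt):
        weights = [math.exp(-0.5 * ((rt - r) / bandwidth) ** 2) for r, _ in points]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("no decoy observations near this retention time")
        return sum(w * y for w, (_, y) in zip(weights, points)) / total

    return predict


def quantify(feature, noise_at):
    """Decide what to report for one feature in one sample.

    `feature` is an assumed dict with keys 'rt', 'peak_intensity'
    (None if no real peak was found) and 'peak_uncertain'.
    """
    if feature["peak_intensity"] is not None:
        if feature["peak_uncertain"]:
            return None  # excluded and deliberately not imputed
        return feature["peak_intensity"]  # real peak found in the raw data
    return noise_at(feature["rt"])  # no peak at all: impute predicted noise


noise_at = fit_noise_model([10.0, 20.0, 30.0], [5.0, 5.5, 6.0])
reported = quantify({"rt": 18.0, "peak_intensity": None, "peak_uncertain": False}, noise_at)
```

The `None` branch is what produces values that stay missing after IceR has run: a peak was found, but it was not trusted enough to report, so no noise value is substituted either.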

I hope this explains what is going on.

Best,

Mathias