Open JoWhi opened 1 year ago
This method for imputing missing Horvath 40k methylation data has only been done for mouse, where the lab has a "gold standard" matrix with filler values for each CpG site, based on mean methylation across all of our mouse data. However, this situation only arises when the user's data was generated on a different array, such as the standard 320k array or the EPIC array. As in my previous comment, I would ask: did you use the Horvath 40k array to generate your data?
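For readers trying to picture the reference-based filling described above, here is a minimal sketch in plain Python. It is not the package's own code. The function name `fill_from_reference`, the data layout (a dict mapping CpG ID to a list of per-sample betas, with `None` for missing), and the example values are all hypothetical, and it does not reflect how the lab's matrix is actually stored or applied.

```python
def fill_from_reference(betas, reference_means):
    """Fill missing betas from a per-site reference value.

    betas: dict of CpG ID -> list of per-sample betas (None = missing).
    reference_means: dict of CpG ID -> filler value; a hypothetical
    stand-in for a table of mean methylation per site.

    The result contains exactly the sites listed in reference_means.
    Sites absent from `betas` are added with the filler value repeated
    for every sample; sites not in reference_means are left out.
    """
    n_samples = len(next(iter(betas.values()))) if betas else 0
    filled = {}
    for site, ref in reference_means.items():
        values = betas.get(site, [None] * n_samples)
        filled[site] = [ref if v is None else v for v in values]
    return filled


# Made-up example: two samples, one site partly missing, one site absent.
betas = {"cg_a": [0.25, None], "cg_b": [0.5, 0.75]}
reference = {"cg_a": 0.3, "cg_b": 0.6, "cg_c": 0.9}
filled = fill_from_reference(betas, reference)
```

Observed values are never overwritten; only gaps take the reference value.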
My data is from the mammalian array; many sites have NAs in the raw data, and many more sites are dropped after cleaning & normalizing. Is it not possible to get estimates for clocks with missing data with the predictAge function, other than for mouse?
As of now, no: the clocks are built with complete data. You could try replacing NAs with 0.5 after normalization is done, because I believe that is what Horvath has done in some cases.
Is there a validated method for handling missing data points with the universal clocks? My beta matrix has NAs for some or all individuals at many sites needed for the clocks. Should these be replaced with a tiny beta value, or 0.5, or something else? It is possible to impute betas for sites at which some individuals have values, but not at sites with no data. Thanks for any suggestions!
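To make the distinction between the two kinds of missing site concrete, here is a rough sketch in plain Python. It is not a validated method for the universal clocks and is not part of any package. The function name `impute_site_means`, the dict-of-lists layout, and the 0.5 fallback (the value suggested elsewhere in this thread) are assumptions for illustration only.

```python
def impute_site_means(betas, fallback=0.5):
    """Impute missing betas per site, flagging sites with no data.

    betas: dict of CpG ID -> list of per-sample betas (None = missing).
    Sites with at least one observed value are filled with the mean of
    the observed values. Sites with no observed values cannot be imputed
    from the data, so they receive `fallback` and are reported back.
    """
    imputed = {}
    all_missing = []
    for site, values in betas.items():
        observed = [v for v in values if v is not None]
        if observed:
            fill = sum(observed) / len(observed)
        else:
            fill = fallback
            all_missing.append(site)
        imputed[site] = [fill if v is None else v for v in values]
    return imputed, all_missing


# Made-up example: one partly missing site, one fully missing site.
betas = {"cg_a": [0.25, None, 0.75], "cg_b": [None, None, None]}
imputed, all_missing = impute_site_means(betas)
```

Returning the list of fully missing sites makes it easy to check how many clock CpGs are relying on a constant rather than on observed data.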