bio-learn / biolearn

Machine learning tools for biomarker analysis

impute_from_standard doesn't impute (add) CpGs that are missing from the dataset #86

Open moqri opened 1 month ago

moqri commented 1 month ago

If a CpG is not included in a dataset, impute_from_standard will not add it to the data. This causes a problem for the deconvolution method, since the missing CpGs make the results very biased. Example with a few missing CpGs:

[image] [image]

sarudak commented 1 month ago

I agree that impute_from_standard should do this. For now, Hybrid impute will do this. If you want to mimic the behavior of impute_from_average, you can set the threshold to 0 and it will always impute from the provided source.
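
For illustration, here is a minimal pandas sketch of the behavior requested in this issue: rows are added for CpGs that are absent from the dataset, and every gap is then filled from a reference. It does not use biolearn's API. The helper name `add_missing_cpgs`, its parameters, and the assumed layout (CpG IDs as the row index, samples as columns, reference values in a Series indexed by CpG ID) are all hypothetical.

```python
import pandas as pd


def add_missing_cpgs(dnam, cpg_source, required_cpgs):
    """Hypothetical helper, not part of biolearn.

    dnam: DataFrame with CpG IDs as the index and samples as columns.
    cpg_source: Series of reference values indexed by CpG ID.
    required_cpgs: iterable of CpG IDs that must be present in the output.
    """
    required = pd.Index(required_cpgs)

    # CpGs that are required but have no row in the dataset at all.
    absent = required.difference(dnam.index)

    # Fail loudly if the reference cannot supply one of them.
    unavailable = absent.difference(cpg_source.index)
    if len(unavailable) > 0:
        raise KeyError(f"No reference value for: {list(unavailable)}")

    # Add all-NaN rows for the absent CpGs.
    expanded = dnam.reindex(dnam.index.union(required))

    # Fill each sample column from the reference, aligned on CpG ID.
    # This covers both the new rows and NaNs in pre-existing rows.
    return expanded.apply(lambda col: col.fillna(cpg_source))


if __name__ == "__main__":
    dnam = pd.DataFrame(
        {"s1": [0.1, float("nan")], "s2": [0.2, 0.4]},
        index=["cg01", "cg02"],
    )
    cpg_source = pd.Series({"cg01": 0.5, "cg02": 0.6, "cg03": 0.7})
    print(add_missing_cpgs(dnam, cpg_source, ["cg01", "cg03"]))
```

In this toy example `cg03` has no row in `dnam`, so a fill that only touches existing rows would leave it out. The `reindex` step is what adds it before the reference values are applied.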