Open FischerJoBio opened 1 year ago
@FischerJoBio This makes sense to me, with two caveats:
What do you think about these?
@katehoffshutta Both are good points, and they apply to the original code as well as to the suggested fix.
My suggestion for 1) would be to catch this and replace NaN with NA, which keeps the same meaning and should work well for downstream processing. For 2), maybe we should introduce a mincoverage parameter (defaulting to 1) so that users can avoid these non-robust predictions if they do not want them, and additionally add a verbose option for raising the warnings. My gut feeling is that there would be too many warnings to be meaningful, but it gives the user the option to analyze them.
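A rough sketch of what such a helper could look like, in case it helps the discussion. The function name `mean_with_coverage` and the argument names `min_coverage` and `verbose` are placeholders I made up here, not existing parameters of the package, and I am assuming "coverage" means the number of non-NA probes per gene and sample.

```r
# Hypothetical helper: mean over the probes of one gene in one sample.
# `min_coverage` and `verbose` are illustrative names, not an existing API.
mean_with_coverage <- function(x, min_coverage = 1, verbose = FALSE) {
  # Number of probes that actually carry a beta value
  n_observed <- sum(!is.na(x))

  # Too few observed probes (or none at all): return NA instead of NaN
  if (n_observed == 0 || n_observed < min_coverage) {
    if (verbose) {
      warning(sprintf("only %d non-NA probe(s); returning NA", n_observed))
    }
    return(NA_real_)
  }

  # Otherwise average over the observed probes only
  mean(x, na.rm = TRUE)
}

mean_with_coverage(c(0.2, NA, 0.4))                   # mean of the two observed probes
mean_with_coverage(c(NA, NA))                         # NA rather than NaN
mean_with_coverage(c(0.2, NA, NA), min_coverage = 2)  # NA: only one probe observed
```

With the default `min_coverage = 1` this behaves like `mean(x, na.rm = TRUE)` except that an all-NA input yields NA instead of NaN, so it could be passed to `summarise_at` in place of `mean`.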
As soon as any individual probe mapped to a gene has an NA in the beta file, `probeToMeanPromoterMethylation` generates NA mean values for the entire gene. I would suggest replacing
`summarise_at(colnames(mappedBetasLong)[3:(ncol(mappedBetasLong))], mean)`
with
`summarise_at(colnames(mappedBetasLong)[3:(ncol(mappedBetasLong))], mean, na.rm=T)`
which solves this issue by ignoring individual probes that give NA during the computation.
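A small reproduction of the difference, using made-up data. The data frame `toy_betas` and its column names are assumptions standing in for `mappedBetasLong`; I am only assuming that the first two columns are identifiers and the remaining columns are samples, as the `3:ncol(...)` indexing suggests.

```r
library(dplyr)

# Toy stand-in for mappedBetasLong: two ID columns, then one column per sample.
# All names and values are invented for illustration.
toy_betas <- data.frame(
  probe   = c("cg01", "cg02", "cg03", "cg04"),
  gene    = c("GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  sample1 = c(0.2, NA, 0.5, 0.7),
  sample2 = c(0.4, 0.6, NA, NA)
)
sample_cols <- colnames(toy_betas)[3:ncol(toy_betas)]

# Current call: a single NA probe turns the whole gene mean into NA
means_default <- toy_betas %>%
  group_by(gene) %>%
  summarise_at(sample_cols, mean)

# Suggested call: NA probes are skipped when averaging
means_na_rm <- toy_betas %>%
  group_by(gene) %>%
  summarise_at(sample_cols, mean, na.rm = TRUE)

print(means_default)
print(means_na_rm)
```

In this toy case GENE_A in `sample1` goes from NA to the value of its one observed probe. GENE_B in `sample2` has no observed probes at all, so `mean(..., na.rm = TRUE)` returns NaN there rather than NA.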