Thanks a lot for making minfi available to the scientific community! I have used it extensively in my research over the last few years. However, I think I might have come across a mistake in preprocessNoob that occurs for EPIC (v1) samples: after reading such an EPIC sample from GEO and executing the first code chunk of preprocessNoob, the number of CpGs in the Meth/Unmeth matrices differs from the number in probe.type.
Hence, probe.type does not correspond to the actual probe types of the rows of Meth and Unmeth, and the subsequent indexing of them in .preprocessNoob with Green_probes and so on will produce incorrect results:
# NormExp estimates for Green and Red
dat <- list(
    Green = list(
        M = Meth[Green_probes, , drop = FALSE],
        U = Unmeth[Green_probes, , drop = FALSE],
        D2 = Meth[d2.probes, , drop = FALSE]),
    Red = list(
        M = Meth[Red_probes, , drop = FALSE],
        U = Unmeth[Red_probes, , drop = FALSE],
        D2 = Unmeth[d2.probes, , drop = FALSE]))
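To illustrate why such a mismatch matters, here is a minimal base-R sketch with made-up probe IDs and values (no minfi required). It assumes that the index vectors such as Red_probes are integer positions computed from probe.type; that detail, the toy IDs, and the direction of the size mismatch are assumptions for illustration, not taken from the minfi source.

```r
# Toy intensity matrix: 5 measured probes (IDs and values are made up).
Meth <- matrix(1:5 * 100, ncol = 1,
               dimnames = list(c("cg01", "cg02", "cg03", "cg04", "cg05"),
                               "sample1"))

# Toy probe types from a different manifest version: "cg03" is absent.
probe.type <- c(cg01 = "IGrn", cg02 = "II", cg04 = "IRed", cg05 = "II")

# Positional index computed on the shorter vector ...
Red_probes <- which(probe.type == "IRed")   # position 3 within probe.type

# ... selects a different probe when applied to the longer matrix.
picked   <- rownames(Meth[Red_probes, , drop = FALSE])
intended <- names(probe.type)[Red_probes]

cat("intended:", intended, "picked:", picked, "\n")
```

In this toy case the type I red probe is cg04, but the positional index pulls the row for cg03, and no error or warning is raised.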
The original discrepancy in CpG sites occurs because:
to create MSet, preprocessRaw uses IlluminaHumanMethylationEPICmanifest (through getProbeInfo and getManifest), which is based on the B2 manifest file from Illumina;
to create probe.type, getProbeType relies on IlluminaHumanMethylationEPICanno.ilm10b4.hg19 (through getAnnotation and due to .default.epic.annotation <- "ilm10b4.hg19" in utils.R), which is based on the B4 manifest file from Illumina;
and these two versions of the manifest file contain different numbers of probes.
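A quick way to confirm this kind of discrepancy is to compare the two sets of probe IDs directly. The sketch below is base R only; compare_probe_ids is a hypothetical helper, and the IDs are made-up stand-ins for what rownames(Meth) and names(probe.type) might contain.

```r
# Hypothetical helper: summarise how two probe ID vectors differ.
compare_probe_ids <- function(matrix_ids, annotation_ids) {
  list(
    n_matrix           = length(matrix_ids),
    n_annotation       = length(annotation_ids),
    only_in_matrix     = setdiff(matrix_ids, annotation_ids),
    only_in_annotation = setdiff(annotation_ids, matrix_ids),
    same_order         = identical(matrix_ids, annotation_ids)
  )
}

# Made-up IDs standing in for rownames(Meth) and names(probe.type).
res <- compare_probe_ids(
  matrix_ids     = c("cg01", "cg02", "cg03", "cg04"),
  annotation_ids = c("cg01", "cg02", "cg04")
)
str(res)
```

If same_order is FALSE, positional indices derived from one vector cannot safely be applied to the other; indexing by probe name would avoid the problem.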
For our EPIC sample from above, making a quick-and-dirty B2-based version of probe.info leads to the following results:
I'd be curious to hear your thoughts! Am I missing something, or did I set up something incorrectly?
Kind regards, Max