hansenlab / minfi

markers removed from EPIC DMAP files causing fewer CpGs in idat files and minfi errors #107

Closed. ekarlins closed this issue 7 years ago.

ekarlins commented 7 years ago

Illumina has decided to remove a small number of probes from the DMAP files for the EPIC array.

They also have a new manifest to correspond to their new DMAP files.

Currently, however, the idats contain fewer probes and minfi does not have a manifest to associate with these data.

library(minfi)
RGset <- read.metharray.exp("/ScanData/201220980007", extended = TRUE)
getManifest(RGset)
##Loading required package: Unknownmanifest
##Error in getManifest(RGset) : 
##  cannot load manifest package Unknownmanifest
##In addition: Warning message:
##In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
##  there is no package called ‘Unknownmanifest’
RGset
##RGChannelSetExtended (storageMode: lockedEnvironment)
##assayData: 1051943 features, 8 samples 
##  element names: Green, GreenSD, NBeads, Red, RedSD 
##An object of class 'AnnotatedDataFrame': none
##Annotation
##  array: Unknown
##  annotation: Unknown

Is there a way to add this new manifest and annotate the RGSet appropriately so it will work with other functions?

Thanks! Eric

kasperdanielhansen commented 7 years ago

Thanks, we are aware and an updated annotation package has already been submitted to Bioconductor (and may be online as I write).

This issue is technically not caused by the manifest or annotation package (though that is not easy to know). It should be fixed in minfi devel (1.21.6 or greater), which is available from Bioconductor devel. Alternatively, you can wait a few days, since the next release is tomorrow.

ekarlins commented 7 years ago

@kasperdanielhansen Awesome! Thanks! Do I need to do anything special to annotate or will minfi know which annotation package to use?

kasperdanielhansen commented 7 years ago

The default right now is to use the "previous" annotation package, b2. If you want to switch to the latest, you need to do it manually. I might release a fix for this in a few days.
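A minimal sketch of what switching manually might look like, assuming the annotation lives in the object's `annotation` slot as a named character vector with `array` and `annotation` entries (as the printed Annotation block suggests) and that the b3 annotation package is installed. The helper name `use_b3_annotation` is hypothetical and not part of minfi.

```r
library(methods)

# Hypothetical helper (not part of minfi): point an RGChannelSet-like object
# at the b3 annotation by overwriting its `annotation` slot.
# Assumes the slot is a named character vector with entries
# "array" and "annotation".
use_b3_annotation <- function(rgset) {
  stopifnot(is.character(rgset@annotation))
  rgset@annotation <- c(array = "IlluminaHumanMethylationEPIC",
                        annotation = "ilm10b3.hg19")
  rgset
}

# Usage, with RGset read as in the first comment:
# RGset <- use_b3_annotation(RGset)
# RGset@annotation
```

Overwriting the slot directly skips any validity checks, so it is worth printing the object afterwards to confirm that the Annotation block shows the expected values.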

ekarlins commented 7 years ago

Thanks @kasperdanielhansen ! Will it be possible to combine the "previous" version of EPIC and the "new" version into one RGSet? Since they have a different number of probes, how will this work?

Is there any difference between the b2 manifest and the b3 manifest, other than the ~900 probes that are removed from b3? So, for minfi, is there any reason to annotate with b3 instead of b2?

Thanks! Eric

ekarlins commented 7 years ago

We just got the new version of R (3.4.0), the new version of Bioconductor, and a new version of minfi (1.22.0), so I was able to test my question above about combining EPIC data from old and new arrays (with and without the ~900 probes). It looks like this fails with the default settings but works if you pass force=TRUE to read.metharray.exp. It also seems to keep only the intersecting CpGs, so the ~900 probes are excluded. This seems great to me! Thanks for getting this working so quickly!!

rgset <- read.metharray.exp("/path/to/ScanData/multipleChips", recursive = TRUE)
##Error in read.metharray(basenames = commonFiles, extended = extended,  : 
##  [read.metharray] Trying to parse IDAT files with different array size but seemingly all of the same type.
##  You can force this by 'force=TRUE', see the man page ?read.metharray

rgset <- read.metharray.exp("/path/to/ScanData/multipleChips", recursive = TRUE, force = TRUE)
rgset
##class: RGChannelSet 
##dim: 1051943 7 
##metadata(0):
##assays(2): Green Red
##rownames(1051943): 1600101 1600111 ... 99810990 99810992
##rowData names(0):
##colnames(7): 200930750040_R07C01 200930750040_R08C01 ...
##  201220980007_R04C01 201220980007_R05C01
##colData names(0):
##Annotation
##  array: IlluminaHumanMethylationEPIC
##  annotation: ilm10b2.hg19
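
The "keep only the intersecting CpGs" behaviour can be illustrated with a toy example in base R. This is a sketch of the idea only, not minfi's actual implementation; the intensities are made up, and address 1600115 is a made-up stand-in for a removed probe.

```r
# Toy stand-ins for two chips: named intensity vectors keyed by probe address.
old_chip <- c("1600101" = 512, "1600111" = 830, "1600115" = 97)  # has one extra probe
new_chip <- c("1600101" = 498, "1600111" = 811)                  # extra probe removed

# Keep only the addresses present on every chip, then bind the chips
# column-wise into a single probe-by-sample matrix.
chips    <- list(old = old_chip, new = new_chip)
common   <- Reduce(intersect, lapply(chips, names))
combined <- sapply(chips, function(x) x[common])

dim(combined)
rownames(combined)
```

The combined matrix has one row per shared address, which is why the probes missing from the newer chips drop out of the merged object.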

kasperdanielhansen commented 7 years ago

Thanks for testing. Two comments:

1) We have the newer Illumina annotation available as IlluminaHumanMethylationEPICanno.ilm10b3.hg19.
2) We have released minfi 1.22.1, which fixes a bug when using read.metharray(..., extended=TRUE).