zwdzwd / sesame

🍪 SEnsible Step-wise Analysis of DNA MEthylation BeadChips

bisConversionControl for EPICv2 #103

Open lissettegomez opened 1 year ago

lissettegomez commented 1 year ago

Hi,

I'm using sesame to process EPICv2 data, but I get the error below when I run the bisConversionControl() function. Any suggestions? Thanks!

bis <- bisConversionControl(sdf)
Error in inferPlatformFromProbeIDs(sdf$Probe_ID, silent = !verbose) :
  Ambiguous platform. Please provide platform explicitly.

SteffanChristiansen commented 1 year ago

Did you find a solution to your issue? I have the same problem when running bisConversionControl() (sesame version 1.19.5) with EPICv2 data, but not when I use the provided EPIC data:

sdf <- sesameDataGet('EPIC.1.SigDF')
bisConversionControl(sdf)
1.07

Best, Steffan

zwdzwd commented 1 year ago

The bisConversionControl function has now been updated to work with an array manifest of generic form.

mft = sesameAnno_buildManifestGRanges(
    sesameAnno_download("EPICv2.hg38.manifest.tsv.gz"),
    columns = "nextBase")
extR = names(mft)[!is.na(mft$nextBase) & mft$nextBase=="R"]
extA = names(mft)[!is.na(mft$nextBase) & mft$nextBase=="A"]
bisConversionControl(sdf, extR, extA)

More information can be found at https://zhou-lab.github.io/sesame/dev/supplemental.html#Bisulfite_conversion

Hope this works.
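The `!is.na(...)` guard in the snippet above matters because comparing `NA` with `==` yields `NA`, and a logical index containing `NA` returns `NA` entries. A minimal base-R sketch of the same selection logic, assuming a toy named vector in place of the real manifest (the probe IDs and the `next_base` vector are made up for illustration and are not part of sesame):

```r
# Toy stand-in for mft$nextBase: names are hypothetical probe IDs,
# values are next-base annotations, with one missing entry.
next_base <- c(
  cg_toy_0001 = "R",
  cg_toy_0002 = "A",
  cg_toy_0003 = NA,
  cg_toy_0004 = "R"
)

# Without the guard, the NA comparison leaks an NA into the result.
unguarded <- names(next_base)[next_base == "R"]

# With the guard, only probes that are annotated and match are kept.
extR <- names(next_base)[!is.na(next_base) & next_base == "R"]
extA <- names(next_base)[!is.na(next_base) & next_base == "A"]

print(unguarded)
print(extR)
print(extA)
```

With the real manifest, the resulting `extR` and `extA` character vectors are what gets passed as the second and third arguments to bisConversionControl().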

SteffanChristiansen commented 1 year ago

Thanks a lot for the update and the suggestion. With your solution, bisConversionControl() only succeeds when I explicitly set the platform to "EPIC":

sdf <- openSesame(searchIDATprefixes("data"), manifest = addr, func = NULL, prep = "", platform = "EPIC")
str(sdf[[1]])

Classes ‘SigDF’ and 'data.frame': 937690 obs. of 7 variables:
 $ Probe_ID: chr "cg00000029_TC21" "cg00000109_TC21" "cg00000155_BC21" "cg00000158_BC21" ...
 $ MG      : int NA NA NA NA NA NA NA NA NA NA ...
 $ MR      : int NA NA NA NA NA NA NA NA NA NA ...
 $ UG      : int 758 2520 5304 5222 617 2780 3228 1249 6473 2951 ...
 $ UR      : int 3625 737 678 492 2833 1522 971 311 3220 6828 ...
 $ col     : Factor w/ 3 levels "G","R","2": 3 3 3 3 3 3 3 3 3 3 ...
 $ mask    : logi FALSE FALSE FALSE FALSE FALSE FALSE ...

bisConversionControl(sdf[[1]], extR, extA)
[1] 1.04

However, when the platform is inferred to be EPICv2, bisConversionControl() fails:

sdf2 <- openSesame(searchIDATprefixes("data"), manifest = addr, func = NULL, prep = "", platform = "")
str(sdf2[[1]])

Classes ‘SigDF’ and 'data.frame': 937690 obs. of 7 variables:
 $ Probe_ID: chr "cg00000029_TC21" "cg00000109_TC21" "cg00000155_BC21" "cg00000158_BC21" ...
 $ MG      : int NA NA NA NA NA NA NA NA NA NA ...
 $ MR      : int NA NA NA NA NA NA NA NA NA NA ...
 $ UG      : int 758 2520 5304 5222 617 2780 3228 1249 6473 2951 ...
 $ UR      : int 3625 737 678 492 2833 1522 971 311 3220 6828 ...
 $ col     : Factor w/ 3 levels "G","R","2": 3 3 3 3 3 3 3 3 3 3 ...
 $ mask    : logi FALSE FALSE FALSE FALSE FALSE FALSE ...

bisConversionControl(sdf2[[1]], extR, extA)
Error in bisConversionControl(sdf2[[1]], extR, extA) :
  platform %in% c("EPICplus", "EPIC", "HM450") is not TRUE
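The wording of that error ("... is not TRUE") is what base R's stopifnot() produces, which suggests the function checks the platform against a fixed list that does not include EPICv2. The sketch below reproduces a message of that form in plain R. Note that `check_platform` is a hypothetical helper written for illustration and is not sesame's actual source:

```r
# Hypothetical reconstruction of a hard-coded platform check.
# This only illustrates how stopifnot() yields the message seen above.
check_platform <- function(platform) {
  stopifnot(platform %in% c("EPICplus", "EPIC", "HM450"))
  invisible(TRUE)
}

# A listed platform passes silently.
ok <- check_platform("EPIC")

# An unlisted platform fails; capture the message instead of stopping.
msg <- tryCatch(
  check_platform("EPICv2"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the check does work this way, it would explain why passing platform = "EPIC" to openSesame() lets the call through while an inferred EPICv2 platform does not.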