Open kasperdanielhansen opened 7 years ago
for posterity. it need a DataFrame not data.frame
What does that mean apastore? What do I need to change from data.frame to a DataFrame? And if you can, what's the difference between the two?
Edit: I figured it out. You have to do pD<-as(pD,"DataFrame")
A data.frame is a classic base R object. A DataFrame is a Bioconductor extension. The main difference is that columns of a DataFrame can be only S4 object with length. A main use case is IRanges and/or GRanges.
On Sat, Jul 8, 2017 at 12:53 AM, chrislun16 notifications@github.com wrote:
What does that mean apastore? What do I need to change from data.frame to a DataFrame? And if you can, what's the difference between the two?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kasperdanielhansen/minfi/issues/108#issuecomment-313811262, or mute the thread https://github.com/notifications/unsubscribe-auth/AEuhn1gGfHXWqsOeL9-NCPeAhHvD-OTbks5sLrbkgaJpZM4NLhjq .
Do you know if the guide is still correct, because I seem to be getting incorrect beta values.
I was following the guide up until the preprocessing part. Meaning that I was able to obtain the RGset. Then I used my own preprocessing code to obtain the beta values. However, I cross checked them with the beta values that were on GEO and they weren't consistent.
For example, the accession number I used was GSE68777. So I went to that study on GEO and clicked on the first sample: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1681154. Then I scrolled down and clicked "Download full table" to download the samples and beta values in a text file.
Then I typed head(beta) and chose the first sample. Then I did command F in the text file for that sample and it's value there wasn't the same as the value from the beta data table. Hopefully you can help find the error.
Here is the code I'm using:
library(GEOquery)
library(minfi)
library("IlluminaHumanMethylation450kanno.ilmn12.hg19")
library("IlluminaHumanMethylation450kmanifest")
######## Code copied from 450k Guide ########
getGEOSuppFiles("GSE68777")
untar("GSE68777/GSE68777_RAW.tar", exdir = "GSE68777/idat")
head(list.files("GSE68777/idat", pattern = "idat"))
idatFiles <- list.files("GSE68777/idat", pattern = "idat.gz$", full = TRUE)
sapply(idatFiles, gunzip, overwrite = TRUE)
rgSet <- read.metharray.exp("GSE68777/idat")
geoMat <- getGEO("GSE68777")
pD.all <- pData(geoMat[[1]])
pD <- pD.all[, c("title", "geo_accession", "characteristics_ch1.1", "characteristics_ch1.2")]
names(pD)[c(3,4)] <- c("group", "sex")
pD$group <- sub("^diagnosis: ", "", pD$group)
pD$sex <- sub("^Sex: ", "", pD$sex)
sampleNames(rgSet) <- sub(".*_5", "5", sampleNames(rgSet))
rownames(pD) <- pD$title
pD <- pD[sampleNames(rgSet),]
pData(rgSet) <- pD
It's not clear to me if those GEO sample submissions are processed or unprocessed beta values. Note that you did a normalization step:
MSet <- preprocessIllumina(rgSet, bg.correct = TRUE, normalize = "controls")
So your data is normalized. Why don't you compare the raw beta values to the GEO sample submissions? If those are the same, then good, those submissions are raw beta values. If they are not the same, then maybe they are processed betas and you'll need to find out their normalization step (details are missing in the article) in order to exactly reproduce their data.
You are right. When I compared the raw beta values, they were very similar. Thanks for the help.
From Tajuddin, Salman salman.tajuddin@nih.gov
I was going through your 450K data analysis with minif, located here http://kasperdanielhansen.github.io/genbioconductor/html/minfi.html and encountered an error when merging the GEO phenotype data with the methylation data. Not sure what went wrong here.
The error message reads like this:
sessionInfo() is below for your reference
Thanks for your help.
Salman