NCBI-Hackathons / Scan2CNV

MIT License

plan to read in idat files and normalize #42

Open ekarlins opened 7 years ago

ekarlins commented 7 years ago

I'm going to create a new branch named after this issue to start implementing this plan, wrapped with snakemake:

Read in the IDAT files using the R package "illuminaio". I'll generalize the code below:

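library(illuminaio)

# Build a probe x sample matrix of Red-channel intensities from every
# *_Red.idat file in `path`. Row names are taken from the IDAT Quants table.
RedIdatIntensityMatrix <- function(path){
    redIdats <- list.files(path, pattern = "_Red\\.idat$", full.names = TRUE)
    if (length(redIdats) == 0) stop("no *_Red.idat files found in ", path)
    # Column 1 of Quants is used as the intensity, one column per IDAT file.
    m <- sapply(redIdats, function(f) readIDAT(f)$Quants[, 1])
    colnames(m) <- sub("_Red\\.idat$", "", basename(redIdats))
    m
}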

Run quantile normalization, one channel at a time (Red and Grn) and split into sub-bead pools, as described in this document: http://dnatech.genomecenter.ucdavis.edu/wp-content/uploads/2013/06/illumina_gt_normalization.pdf

The sub-bead pool ID is in the manifest as "BeadSetID".
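
To make the per-pool step concrete, here is a minimal sketch in base R. It is not the code from this repo: `quantileNormalize` is a simplified stand-in for `normalize.quantiles` from preprocessCore (ties are broken by order, so results can differ slightly), and `normalizeByBeadPool`, `toy` and `manifest` are hypothetical names made up for illustration. It assumes no missing values and that the manifest has the `AddressA_ID` and `BeadSetID` columns used elsewhere in this issue.

```r
# Simplified base-R stand-in for preprocessCore::normalize.quantiles().
# Assumes `m` is a numeric probe x sample matrix without NAs.
quantileNormalize <- function(m) {
    if (nrow(m) < 2) return(m)              # nothing to normalize
    ord <- apply(m, 2, order)               # per-sample ordering of probes
    target <- rowMeans(apply(m, 2, sort))   # shared reference distribution
    out <- m
    for (j in seq_len(ncol(m))) out[ord[, j], j] <- target
    out
}

# Hypothetical helper: normalize each sub-bead pool separately.
# `intensity`: probe x sample matrix with address IDs as row names.
# `manifest`:  data frame with columns AddressA_ID and BeadSetID.
normalizeByBeadPool <- function(intensity, manifest) {
    ids <- as.character(manifest$AddressA_ID)
    keep <- ids %in% rownames(intensity)
    pools <- split(ids[keep], manifest$BeadSetID[keep])
    out <- intensity
    out[] <- NA_real_                       # probes not in the manifest stay NA
    for (p in pools) {
        p <- unique(p)
        out[p, ] <- quantileNormalize(intensity[p, , drop = FALSE])
    }
    out
}

# Toy data: 4 probes, 3 samples, 2 bead pools (all values invented).
toy <- matrix(c(5, 2, 3, 4,
                4, 1, 4, 2,
                3, 4, 6, 8), nrow = 4,
              dimnames = list(c("101", "102", "103", "104"),
                              c("s1", "s2", "s3")))
manifest <- data.frame(AddressA_ID = c(101, 102, 103, 104),
                       BeadSetID = c("A", "A", "B", "B"))
normed <- normalizeByBeadPool(toy, manifest)
```

After this runs, every sample has the same set of intensity values within a given bead pool, which is the property quantile normalization is meant to enforce. The same function would be called once for the Red matrix and once for the Grn matrix.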

I need to generalize the code below, extend it through the LRR and BAF calculations, and write the results to file.

quantile normalization:

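library(preprocessCore)

# `path`, `annotation` (the manifest) and `myBeadID` must already be defined.
r <- RedIdatIntensityMatrix(path)
# Keep only the probes that belong to one sub-bead pool.
probesToKeep <- annotation$AddressA_ID[annotation$BeadSetID == myBeadID]
newR <- r[as.character(probesToKeep), , drop = FALSE]
newNormR <- normalize.quantiles(newR)
# normalize.quantiles() does not keep row/column names by default; restore them.
dimnames(newNormR) <- dimnames(newR)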
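
For the LRR and BAF step, a rough sketch of the usual definitions might look like the following. This is an assumption about where the plan is headed, not code from this repo: `lrrBaf` is a hypothetical helper, `x` and `y` stand for the normalized A-allele and B-allele intensities, and `thetaC` / `rC` are made-up canonical cluster centres (theta and R for the AA, AB and BB genotypes) that would have to come from a cluster definition in practice.

```r
# Hypothetical helper: BAF and LRR from normalized allele intensities.
# x, y:   A-allele and B-allele intensities (vectors of equal length)
# thetaC: theta of the AA, AB, BB cluster centres (increasing, length 3)
# rC:     R (total intensity) of the AA, AB, BB cluster centres (length 3)
lrrBaf <- function(x, y, thetaC, rC) {
    theta <- (2 / pi) * atan2(y, x)   # 0 = pure A signal, 1 = pure B signal
    r <- x + y                        # total intensity
    # Piecewise-linear interpolation between cluster centres; values outside
    # the AA..BB range are clamped to the nearest end cluster (rule = 2).
    baf <- approx(thetaC, c(0, 0.5, 1), xout = theta, rule = 2)$y
    rExpected <- approx(thetaC, rC, xout = theta, rule = 2)$y
    list(baf = baf, lrr = log2(r / rExpected))
}

# Invented example values for three SNPs sharing the same cluster centres.
res <- lrrBaf(x = c(1, 1, 0), y = c(1, 0, 2),
              thetaC = c(0.05, 0.5, 0.95), rC = c(1, 2, 1))
```

In a real run the cluster centres differ per SNP, so this would be applied per probe (or vectorized over a probe x 3 matrix of centres) before writing LRR and BAF to file.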