PeeperLab / CopywriteR

DNA copy number detection from off-target sequence data
GNU General Public License v3.0

Optional parameter to not perform GC/Mappability correction #7

Closed martijn-cordes closed 8 years ago

martijn-cordes commented 8 years ago

Hi Thomas,

I would like to apply my own GC/mappability corrections and blacklists to the read count data that come out of CopywriteR. It would be nice to have a parameter that switches off the built-in correction. For now I did this by hand in the following function (setting correctmappa = FALSE) after downloading the tarball, and then compiled the package myself. Not the best way to go if I want to update CopywriteR in the future ;)

# Perform GC-content and mappability corrections (in .tng helper function)

tryCatch({
    i <- c(seq_len(ncol(data$cov)))
    NormalizeDOC <- function(i, data, .tng, usepoints, destination.folder) {
        .tng(data.frame(count = data$cov[, i], gc = data$anno$gc,
                        mappa = data$anno$mappa),
             use = usepoints & data$cov[, i] != 0, correctmappa = TRUE,
             plot = file.path(destination.folder, "qc",
                              paste0(colnames(data$cov)[i], ".png")))
    }
    ratios <- bplapply(i, NormalizeDOC, data, .tng, usepoints,
                       destination.folder, BPPARAM = bp.param)
    log2.read.counts <- matrix(unlist(ratios), ncol = length(sample.indices))
}, error = function(e) {
    stop(.wrap("The GC-content and mappability normalization did not work",
               "due to a failure to calculate loesses. This can generally",
               "be solved by using larger bin sizes. Stopping execution of",
               "the remaining part of the script..."))
})
colnames(log2.read.counts) <- paste0("log2.", sample.files[sample.indices])

thomasKuilman commented 8 years ago

Hi Martijn,

Although the uncorrected data are not provided as a separate output, they can easily be retrieved from the read_counts.txt file. The data in the log2_read_counts.igv file differ from these in that they are corrected for GC-content and mappability, have had a CNV filter applied, and are log2-transformed and median-normalized. In other words, for what you want to do, you can simply log2-transform and median-normalize the read count data and then perform your custom analyses.
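The two steps mentioned above (log2-transform, then median-normalize) can be sketched in base R as follows. This is only an illustration under stated assumptions: the small data frame stands in for the contents of read_counts.txt, whose actual column names and layout may differ, and turning zero-count bins into NA is a choice made here, not something the thread prescribes.

```r
# Hypothetical stand-in for the table in read_counts.txt; the real column
# names and layout may differ, so adjust the column selection accordingly.
read.counts <- data.frame(
    Chromosome = c("1", "1", "1", "1"),
    Start = c(1, 20001, 40001, 60001),
    End = c(20000, 40000, 60000, 80000),
    sampleA = c(100, 200, 0, 400),
    sampleB = c(50, 50, 100, 200)
)

# Assumption: every column after the three bin coordinates holds raw counts.
counts <- as.matrix(read.counts[, -(1:3)])

# Bins without reads have no finite log2 value; mark them as missing
# (an assumption of this sketch).
counts[counts == 0] <- NA

# log2-transform, then subtract each sample's median so that every
# sample ends up centred on 0.
log2.counts <- log2(counts)
sample.medians <- apply(log2.counts, 2, median, na.rm = TRUE)
normalized <- sweep(log2.counts, 2, sample.medians, FUN = "-")
```

The resulting matrix has one column per sample and can serve as the starting point for a custom GC-content, mappability, or blacklist correction.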

I hope that helps for now; I will think about whether we should implement this as a feature in CopywriteR.

Best,

Thomas


Thomas Kuilman, PhD
Department of Molecular Oncology
Netherlands Cancer Institute
1066 CX Amsterdam
The Netherlands

Phone: +31-20-5121841

In reply to Martijn Cordes's message of 24 Jul 2015, 12:57.