lima1 / PureCN

Copy number calling and variant classification using targeted short read sequencing
https://bioconductor.org/packages/devel/bioc/html/PureCN.html
Artistic License 2.0
127 stars 32 forks source link

Return the exon.lrs variable to the returned list object of runAbsoluteCN() #335

Closed tinyheero closed 10 months ago

tinyheero commented 10 months ago

The current implementation of .optimizeGridPurity():

.optimizeGridPurity <- function(p) {
    b <- 2 * (1 - p)
    log.ratio.offset <- 0
    lapply(ploidy.grid, function(D) {
        dt <- p/D
        llik.all <- lapply(seq_along(exon.lrs), function(i) .calcLlikSegmentExonLrs(exon.lrs[[i]],
            log.ratio.offset, max.exon.ratio, sd.seg, dt, b, D, test.num.copy))
        subclonal <- vapply(llik.all, which.max, double(1)) == 1
        subclonal.f <- length(unlist(exon.lrs[subclonal]))/length(unlist(exon.lrs))
        if (subclonal.f > max.non.clonal) return(-Inf)
        sum(vapply(llik.all, max, double(1)))
    })
}

Returns the total copy number likelihood of a given purity (p) and ploidy (D) value. The individual segment maximum likelihood values are not available.

This PR adds the exon.lrs to the input element of the list that is returned from the runAbsoluteCN function call.

Having the values of this variable available means one can run the .calcLlikSegmentExonLrs() function easily. This is helpful for deep diving to understand what segments are driving a higher copy number likelihood for particular purity/ploidy solutions.

lima1 commented 10 months ago

Thanks @tinyheero ! That makes sense. I think I'll rename exon.lrs though, as it is technically not correct anymore. Maybe interval.lrs?

tinyheero commented 10 months ago

Thanks @lima1 for the feedback.

Yeah I agree, interval.lrs is a much more sensible name given the context. I only renamed the output element name rather than the variable name across the codebase to keep the changes minimally. Unless you think the variable name should be changed across the entire function too?

lima1 commented 10 months ago

For now we can keep it minimal. I'll start the merging now. Thanks a lot! Let me know if you find some patterns and have ideas of reducing the impact of artifactual segments.