ShixiangWang / sigminer

🌲 An easy-to-use and scalable toolkit for genomic alteration signature (a.k.a. mutational signature) analysis and visualization in R https://shixiangwang.github.io/sigminer/reference/index.html
https://shixiangwang.github.io/sigminer/
Other
141 stars 18 forks source link

Fragile and silent dependency on 'copynumber' in SV signature tally #457

Closed selkamand closed 3 months ago

selkamand commented 3 months ago

Thanks for your work on this package!

Just noticed that SV getDists function has a couple of issues

  1. It includes a dependency on the copynumber package that is not declared in package documentation / DESCRIPTION file
  2. The dependency: ‘copynumber’, is not available on modern versions of Bioconductor (e.g. version '3.19')
  3. Sigminer uses a non-exported function from copynumber - which makes the dependency more fragile

Would it be possible to

  1. Ideally - eliminate this dependency by bringing this pcf function in-house, or finding an alternate, maintained package that exports similar functionality?
  2. If thats absolutely not possible, would you consider adding documentation on how to install the copynumber package now that its not in Bio-conductor?
getDists <- function(chrom1, pos1, chrom2, pos2, doPCF = FALSE) {
  pos1 <- as.numeric(pos1)
  pos2 <- as.numeric(pos2)
  # get segments per chromosome
  dists <- sapply(unique(c(chrom1, chrom2)), FUN = function(x) {
    bp <- matrix(NA, ncol = 2, nrow = 0)
    # positions separate for each member of breakpoint
    pos <- c()
    index1 <- which(chrom1 == x)
    if (length(index1) > 0) {
      pos <- c(pos, pos1[index1])
      bp <- rbind(bp, cbind(index1, 1))
    }
    index2 <- which(chrom2 == x)
    if (length(index2) > 0) {
      pos <- c(pos, pos2[index2])
      bp <- rbind(bp, cbind(index2, 2))
    }
    # distances between breakpoint
    dists <- diff(sort(pos))
    # segment
    forCN <- data.frame(chrom = x, pos = sort(pos), dist = dists[order(pos)])
    if (!doPCF) {
      return(forCN[, 3])
    }
    pcf = "copynumber"%:::%"pcf"  # The problematic dependency
    return(list(info = forCN, seg = pcf(forCN, gamma = 25, kmin = 10)))
  }, simplify = FALSE)
  return(dists)
}
ShixiangWang commented 3 months ago

@selkamand Hi, thank you for bringing this to my attention. Due to the removal of the copynumber package from Bioc, I had to remove it from the dependencies in v2.2.0 (News). You can install the package from https://github.com/shixiangwang/copynumber. It is generally recommended as I have added some features, although other forks of this package exist on GitHub. I will update the readme to include instructions on how to install the copynumber package.

I had considered directly copying the code of pcf to sigminer; however, the deep dependencies were hard to handle, so I gave up on that approach. This is why I used pcf = "copynumber"%:::%"pcf" to import the function and avoid CRAN check at the same time.

selkamand commented 3 months ago

Thanks @ShixiangWang