databio / bedboss

Python pipeline for processing BED files for BEDbase
https://docs.bedbase.org
BSD 2-Clause "Simplified" License

regionstat.R should be more verbose if it cannot find the BSgenome package used for calculating and plotting GC content #37

Open donaldcampbelljr opened 7 months ago

donaldcampbelljr commented 7 months ago

Currently, the pipeline may skip calculating GC content if it cannot find the associated BSgenome package in the namespace. It does not log this, so the GC content calculation is skipped silently:

https://github.com/databio/bedboss/blob/b23c0a7dff112b99c7e5b0c3f08ebeb74a3fd74f/bedboss/bedstat/tools/regionstat.R#L95-L98

https://github.com/databio/bedboss/blob/b23c0a7dff112b99c7e5b0c3f08ebeb74a3fd74f/bedboss/bedstat/tools/regionstat.R#L188-L215

A simple solution would be to remove quietly=TRUE from line 97, so that we capture this output:

Loading required namespace: BSgenome.Hsapiens.UCSC.hg38.masked
Failed with error:  'there is no package called 'BSgenome.Hsapiens.UCSC.hg38.masked''
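
As an alternative to dropping quietly=TRUE, the check could keep the quiet load and emit its own message when the package is missing. The following is only a minimal sketch: load_optional_namespace is a hypothetical helper name and the message wording is an assumption, neither is taken from regionstat.R.

```r
# Hypothetical helper (not part of regionstat.R): try to load a namespace
# and report clearly when it is unavailable, instead of failing silently.
load_optional_namespace <- function(pkg) {
  # requireNamespace() returns TRUE on success and FALSE on failure
  ok <- requireNamespace(pkg, quietly = TRUE)
  if (!ok) {
    # message() writes to stderr, so the note shows up in the pipeline log
    message("Package '", pkg, "' is not installed; ",
            "skipping GC content calculation.")
  }
  ok
}
```

A caller could then guard the GC content step with something like if (load_optional_namespace(BSgm)) { ... }, so the skip is always visible in the log.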

donaldcampbelljr commented 7 months ago

Should we also add the common BSgenome packages to our R dependency installation script?

# build BSgenome package ID to check whether it's installed
if (startsWith(genome, "T2T")) {
  # the T2T assembly has a single, fixed package name
  BSg = "BSgenome.Hsapiens.NCBI.T2T.CHM13v2.0"
} else {
  # map the genome prefix to the organism part of the package name
  if (startsWith(genome, "hg") || startsWith(genome, "grch")) {
    orgName = "Hsapiens"
  } else if (startsWith(genome, "mm") || startsWith(genome, "grcm")) {
    orgName = "Mmusculus"
  } else if (startsWith(genome, "dm")) {
    orgName = "Dmelanogaster"
  } else if (startsWith(genome, "ce")) {
    orgName = "Celegans"
  } else if (startsWith(genome, "danRer")) {
    orgName = "Drerio"
  } else if (startsWith(genome, "TAIR")) {
    orgName = "Athaliana"
  } else {
    orgName = "Undefined"
  }
  BSg = paste0("BSgenome.", orgName, ".UCSC.", genome)
}

# the masked variant of the same package
BSgm = paste0(BSg, ".masked")

In installRdeps.R?
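
If we go that route, the addition might look roughly like the sketch below. The package list is an assumption about which genomes count as "common", and missing_packages and install_missing_bsgenomes are hypothetical names that do not exist in installRdeps.R today.

```r
# Assumed list of commonly needed masked BSgenome packages; adjust to the
# genomes the pipeline actually processes.
common_bsgenomes <- c(
  "BSgenome.Hsapiens.UCSC.hg38.masked",
  "BSgenome.Hsapiens.UCSC.hg19.masked",
  "BSgenome.Mmusculus.UCSC.mm10.masked"
)

# Return the subset of `pkgs` that is not installed.
# system.file() returns "" for a package that cannot be found, which avoids
# loading the (large) BSgenome namespaces just to test for their presence.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  pkgs[!installed]
}

# Install whichever of `pkgs` are missing, via Bioconductor.
install_missing_bsgenomes <- function(pkgs = common_bsgenomes) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) == 0) {
    message("All requested BSgenome packages are already installed.")
    return(invisible(character(0)))
  }
  # BiocManager itself comes from CRAN
  if (!nzchar(system.file(package = "BiocManager"))) {
    install.packages("BiocManager")
  }
  message("Installing: ", paste(to_install, collapse = ", "))
  BiocManager::install(to_install, update = FALSE, ask = FALSE)
  invisible(to_install)
}
```

Calling install_missing_bsgenomes() at the end of the script would then only download what is absent. These packages are large, so it may be worth keeping the default list short.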