rnabioco / valr

Genome Interval Arithmetic in R
http://rnabioco.github.io/valr/

implement read_bigwig #379

Closed jayhesselberth closed 2 years ago

jayhesselberth commented 3 years ago

Should we add this to valr? It would require importing rtracklayer.

#' Read a bigWig file into a valr-compatible BED tibble
#' @description Returns a 5-column tibble with chrom, start, end, score,
#' and strand columns. Start coordinates are converted to zero-based.
#'
#' @param path path to bigWig file
#' @param set_strand strand to add to output (defaults to "+")
#' @export
read_bigwig <- function(path, set_strand = "+") {
  # note that rtracklayer will produce a one-based GRanges object
  rtracklayer::import(path) %>% 
    dplyr::as_tibble(.) %>% 
    dplyr::mutate(chrom = as.character(seqnames),
                  start = start - 1L, 
                  strand = set_strand) %>% 
    dplyr::select(chrom, start, end, score, strand)
}
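
The coordinate shift in read_bigwig() can be checked without a bigWig file. The following is a minimal base R sketch, not the valr implementation: the data frame stands in for what as.data.frame() returns for a GRanges object, the values are invented, and to_zero_based_bed() is a hypothetical helper named only for this illustration.

```r
# Toy stand-in for as.data.frame() applied to a GRanges object:
# one-based, closed intervals. All values are invented for illustration.
one_based <- data.frame(
  seqnames = factor(c("chr1", "chr1", "chr2")),
  start = c(1L, 101L, 51L),
  end = c(100L, 200L, 75L),
  width = c(100L, 100L, 25L),
  score = c(0.5, 1.2, 3.0)
)

# Hypothetical helper (not part of valr) applying the same shift as
# read_bigwig(): subtract 1 from start and leave end unchanged.
to_zero_based_bed <- function(df, set_strand = "+") {
  data.frame(
    chrom = as.character(df$seqnames),
    start = df$start - 1L,
    end = df$end,
    score = df$score,
    strand = set_strand,
    stringsAsFactors = FALSE
  )
}

bed <- to_zero_based_bed(one_based)

# In zero-based, half-open coordinates the interval length is end - start,
# which should equal the one-based, closed width (end - start + 1).
stopifnot(all(bed$end - bed$start == one_based$width))
```

Only start moves because a one-based closed end and a zero-based half-open end refer to the same number.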

kriemo commented 3 years ago

I think that's a good idea. I've also used rtracklayer::import to read GTF or GFF files into data.frames, so it might be useful to have a read_gtf() variant as well.

e.g.

#' Import a GTF/GFF file with rtracklayer and convert it to tidy BED format
#' @param path path to GTF or GFF file
#' @param zero_based_coords if TRUE, convert start coordinates to zero-based
read_gtf <- function(path, zero_based_coords = TRUE) {
  gtf <- rtracklayer::import(path)
  gtf <- as.data.frame(gtf)
  gtf <- dplyr::mutate_if(gtf, is.factor, as.character)
  res <- dplyr::rename(gtf, chrom = seqnames)

  if(zero_based_coords) {
    res <- dplyr::mutate(res, start = start - 1L)
  }

  tibble::as_tibble(res)
} 
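
The tidying steps in read_gtf() (factor columns to character, seqnames renamed to chrom, optional start shift) can be tried on a toy data frame without rtracklayer or dplyr. This is a rough base R sketch under those assumptions: the input imitates an imported GTF after as.data.frame(), its values are invented, and tidy_gtf() is a hypothetical name, not a function in valr or rtracklayer.

```r
# Toy stand-in for as.data.frame(rtracklayer::import(path)) on a GTF file.
# Column names follow GTF conventions; the values are invented.
gtf <- data.frame(
  seqnames = factor(c("chr1", "chr1")),
  start = c(11L, 31L),
  end = c(20L, 45L),
  strand = factor(c("+", "-")),
  type = factor(c("exon", "exon")),
  gene_id = c("geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Hypothetical base R helper mirroring the steps in read_gtf().
tidy_gtf <- function(df, zero_based_coords = TRUE) {
  # Convert every factor column to character
  is_fct <- vapply(df, is.factor, logical(1))
  df[is_fct] <- lapply(df[is_fct], as.character)

  # Rename seqnames to chrom, the column name valr expects
  names(df)[names(df) == "seqnames"] <- "chrom"

  # Optionally shift one-based starts to zero-based
  if (zero_based_coords) {
    df$start <- df$start - 1L
  }
  df
}

zero <- tidy_gtf(gtf)
one <- tidy_gtf(gtf, zero_based_coords = FALSE)
```

Keeping the shift behind a flag lets callers keep the file's original one-based coordinates when they need to compare against the source GTF.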

kriemo commented 2 years ago

closed by #382