Open kasperdanielhansen opened 6 years ago
FWIW, and to do more than simply +1 this issue, perhaps provoke a reaction, or help a fellow traveler, here is my workaround for same:
grlGaps<-function(grl) {
## ALAS: [gaps does not work on a
## GRangesList](https://github.com/Bioconductor/GenomicRanges/issues/17)
##
## Here is quick-to-code but slow-to-run implementation:
## return(endoapply(grl,gaps,NA,NA))
## NB: the NA arguments are required to on include "internal"
## gaps.
##
## By my timing, the below is about 175x faster than endoapply.
## It requires grl to be 'named' uniquely. There may be some
## other ways its assumptions are not what you would expect by
## gaps on a GRangesList.
stopifnot(0<length(names(grl)))
stopifnot(0==anyDuplicated(names(grl)))
r<-unlist(range(grl))
girl<-gaps(ranges(grl),NA,NA) # gaps as an iranges list
l<-lengths(girl)
rdf<-as.data.frame(r)
lkup<-rdf[names(girl),]
res<-GRanges(rep(lkup$seqnames,l),unlist(girl),rep(lkup$strand,l))
res<-split(unname(res),names(res))
missing<-setdiff(names(grl),names(res))
missing<-`names<-`(makeGRangesListFromFeatureFragments(missing
,rep(list(NULL),length(missing))
,rep(list(NULL),length(missing))),
missing)
res<-c(res,missing)[names(grl)] # put them in the same order as grl.
res
}
In most cases, it's often preferable to do something like getting the transcript ranges and then doing a psetdiff(tx, introns_or_exons)
.
Thanks @lawremi - even faster and cleaner your way, which method I had forgotten about. So, my function (if we still want a function at all) can be re-written
grlGaps<-function(grl) {
## for discussion, c.f. [gaps does not work on a GRangesList](https://github.com/Bioconductor/GenomicRanges/issues/17) .
psetdiff(unlist(range(grl),use.names=FALSE),grl))
}
gaps()
does not work on a GRangesList:I would expect it to work :) Running
endoapply(grIntrons.24, gaps)
takes forever.