Closed m-swirski closed 4 years ago
I thought about this point a lot, and I did it because this made me code better, less error when I put in the wrong integers, but will think about it a bit more. I see the point in having the option for full length of integers.
I think that these functions (startRegion etc too, but I didn't check how easy would it be to modify them) should work analogously to intra-range-methods from genomic ranges, etc, to make it consistent with bioconductor. And I really find it useful to have an option of providing whole vector of values rather than just one number.
Ok, good argument, will add it.
Updated extendLeaders and extendTrailers to work for length(grl).
It is possible to update startRegion or more precisely windowPerGroup, but can check out that later if it is needed. Will close issue.
Hi, I was wondering, why restrict extension in these function to a single integer (or a grangeslist)? These seem to work fine with an integer vector with length(extension) == length(grl) And in some cases it's desirable to extend each grl object with a different value;
library(ORFik) attach(loadNamespace("ORFik"))
the only difference is in the first if statement
extendLeaders_multi <- function (grl, extension = 1000L, cds = NULL) { if (is(extension, "numeric") && length(extension) %in% c(1L, length(grl))) { posIndices <- strandBool(grl) promo <- promoters(unlist(firstExonPerGroup(grl), use.names = FALSE), upstream = extension) newStarts <- rep(NA, length(grl)) newStarts[posIndices] <- as.integer(start(promo[posIndices])) newStarts[!posIndices] <- as.integer(end(promo[!posIndices])) } else if (is.grl(class(grl))) { starts <- startSites(extension) changedGRL <- downstreamFromPerGroup(grl[names(extension)], starts) return(changedGRL) } else { stop("extension must either be an integer, or a GRangesList") } extendedLeaders <- assignFirstExonsStartSite(grl, newStarts) if (is.null(cds)) return(extendedLeaders) return(addCdsOnLeaderEnds(extendedLeaders, cds)) }
extendTrailers_multi <- function (grl, extension = 1000L) { if (is(extension, "numeric") && length(extension) %in% c(1L, length(grl))) { posIndices <- strandBool(grl) promo <- flank(unlist(lastExonPerGroup(grl), use.names = FALSE), width = extension, start = FALSE) newEnds <- rep(NA, length(grl)) newEnds[posIndices] <- as.integer(end(promo[posIndices])) newEnds[!posIndices] <- as.integer(start(promo[!posIndices])) } else if (is.grl(class(grl))) { starts <- startSites(extension) changedGRL <- upstreamOfPerGroup(grl[names(extension)], starts, allowOutside = TRUE) return(changedGRL) } else { stop("extension must either be an integer, or a GRangesList") } return(assignLastExonsStopSite(grl, newEnds)) }
gtf_file <- system.file("extdata", "annotations.gtf", package = "ORFik") txdb <- loadTxdb(gtf_file) tx <- exonsBy(txdb, by = "tx", use.names = TRUE) tx[1:3] extendLeaders_multi(tx[1:3], c (10,20,30)) extendTrailers_multi(tx[1:3], c (10,20,30))
So it seems to work without any problems...