Open balwierz opened 3 years ago
Dear Piotr,
Rownames of the scorematrix should correspond to the windows which were kept during the overlap.
You can also get around this by setting up the seqlevels (from one of your previous examples): seqlengths(gr3) = seqlengths(gr4) = 10
Cheers,
v
Hi,
Thanks, I know about the rownames and I am using them. I also always fix seqInfo of any GRanges object I create in R. I started this not for myself, but rather for other users. As I know some people never assign seqInfo.
I am currently using a slightly slower, high-level variation on ScoreMatrix to keep all the windows and keep window:target orientation pairs only (https://github.com/BIMSBbioinfo/genomation/issues/108)
ScoreMatrixPiotr <- function(target, windows, ignore.strand=FALSE)
{
stopifnot(max(width(windows)) == min(width(windows)))
summariseOneRow <- function(target, window)
{
as.numeric(append(coverage(ranges(target)), Rle(0L, 1000000000))[ranges(window)])
}
ret <- matrix(0, ncol = length(windows), nrow=width(windows[1])) # this will be transposed
o <- findOverlaps(target, windows, ignore.strand=ignore.strand) %>%
as_tibble() %>%
group_by(subjectHits) %>%
summarise(data=list(summariseOneRow(target[queryHits], windows[subjectHits[1]])))
rowI <- pull(o, subjectHits)
data <- pull(o, data)
strands <- strand(windows) %>% as.vector()
for(i in seq_along(rowI))
{
if(strands[rowI[i]] == "-")
ret[ , rowI[i]] <- rev(data[i][[1]])
else
ret[ , rowI[i]] <- data[i][[1]]
}
t(ret)
}
Currently ScoreMatrix does not keep all windows in the output. It removes all
windows
which overlap exactly 0 features intarget
. This causes two research problems:Personally, I cannot imagine a situation where removing windows would be beneficial. No overlap is a valid research result. Zero signal lies on the continuum next to very low signal (which is included).