Closed JC-therea closed 1 year ago
Hi, some things here. First, you need to decide how to handle the genes without UTRs.
There are 3 ways:
1. Filter out the transcripts that lack valid UTRs:
tx_to_use <- filterTranscripts(txdb)
And then do
cds <- cds[tx_to_use] # etc
2. Create pseudo UTRs, so that all UTRs are always valid (but you will lose the 0-length information for later):
makeTxdbFromGenome()
new_txdb <- loadTxdb()
3. Extend the windows that are too short to the wanted length:
windows <- startRegion(grl, tx, TRUE, upstream, downstream)
lengths <- widthPerGroup(windows, FALSE)
extend_needed <- function(windows, length, wanted_length, direction = "up") {
  dif <- length - wanted_length
  big_enough <- dif >= 0
  if (!all(big_enough)) {
    if (direction == "up") {
      windows[!big_enough] <- extendLeaders(windows[!big_enough],
                                            extension = -dif[!big_enough])
    } else {
      windows[!big_enough] <- extendTrailers(windows[!big_enough],
                                             extension = -dif[!big_enough])
    }
  }
  return(windows)
}
new_windows <- extend_needed(...)
new_lengths <- widthPerGroup(new_windows, FALSE)
stopifnot(length(unique(new_lengths)) == 1) # All windows must be same size!
Then do:
`windowPerReadLength(<other_args>, windows = new_windows)`
This will now work.
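To make the `extension = -dif` arithmetic in `extend_needed()` easier to follow, here is a minimal base-R sketch that uses plain integers instead of ORFik objects. The window lengths are made up for illustration; in the real pipeline they would come from `widthPerGroup()`.

```r
# Hypothetical window lengths (made up for illustration only)
lengths <- c(30L, 12L, 25L, 30L)
wanted_length <- 30L

# Negative where a window is shorter than wanted
dif <- lengths - wanted_length
big_enough <- dif >= 0

# Number of bases each short window must be extended by
extension <- ifelse(big_enough, 0L, -dif)

# After extension, every window has the wanted length
new_lengths <- lengths + extension
stopifnot(length(unique(new_lengths)) == 1)
print(new_lengths)
```

Note that this logic only extends short windows. A window that is already longer than `wanted_length` is left unchanged, so the final `stopifnot()` check would fail if any such window were present.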
Secondly, 'fraction' has several meanings in ORFik; here it means the read length, i.e. 30, 31, etc.
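Since 'fraction' holds the read length here, the table can be split on that column to inspect each read length separately. The following base-R sketch uses a toy data frame with made-up scores; the column names `position`, `fraction` and `score` are an assumption about the layout of the returned table, so check them against your own output.

```r
# Toy stand-in for the coverage table (all values made up for illustration).
# Assumed columns: position (relative to CDS start), fraction (read length), score.
hitMap_toy <- data.frame(
  position = rep(-15:-10, times = 2),
  fraction = rep(c(30L, 31L), each = 6),
  score    = c(1, 2, 3, 9, 4, 1,   # read length 30
               2, 3, 8, 4, 2, 1)   # read length 31
)

# For each read length, find the position with the highest score
peaks <- sapply(split(hitMap_toy, hitMap_toy$fraction),
                function(d) d$position[which.max(d$score)])
print(peaks)
```

The result is a named vector with one peak position per read length, which is a quick numeric check alongside the heatmap.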
Hi, thank you very much for the options.
Firstly, I tried the first and the third pipelines and both worked great!
In case anyone needs it, the code for the third pipeline is the following:
# Define the windows to expand
windows <- startRegion(cds, tx, TRUE, upstream = 0, downstream = 10)
lengths <- widthPerGroup(windows, FALSE)
# Your new function
extend_needed <- function(windows, length, wanted_length, direction = "up") {
  dif <- length - wanted_length
  big_enough <- dif >= 0
  if (!all(big_enough)) {
    if (direction == "up") {
      windows[!big_enough] <- extendLeaders(windows[!big_enough],
                                            extension = -dif[!big_enough])
    } else {
      windows[!big_enough] <- extendTrailers(windows[!big_enough],
                                             extension = -dif[!big_enough])
    }
  }
  return(windows)
}
# Create the new windows
new_windows <- extend_needed(windows, lengths, 30, direction = "up")
# Check windows size
new_lengths <- widthPerGroup(new_windows, FALSE)
stopifnot(length(unique(new_lengths)) == 1) # All windows must be same size!
# If just a few not done, just remove them (you might hit the chromosome boundary).
hitMap <- windowPerReadLength(grl = cds, tx = tx, reads = footprintsGR, pShifted = FALSE, windows = new_windows)
coverageHeatMap(hitMap, scoring = "transcriptNormalized")
The code produces the following figure, which is not very precise, but the underlying data frame looks fine (in the data frame, the peak of the 30-mers is at -12 and the peak of the 31-mers is at -13).
Secondly, I apologize because I got the name wrong. Since the legend of the figure produced by coverageHeatMap() says "transcript normalized", I thought that the score (not the fraction) was normalized by transcript length.
Great, I will close this issue then, if you have no more questions? :)
Also check out our new page: RiboCrypt.org, where we host thousands of precomputed Ribo-seq datasets with easy interactive visualizations. Let me know if you have any questions about that too :)
Sure!
Is there a way to do the same with ORFik experiments? I am planning to run this on more samples.
Hello,
Very nice tool! However, I am facing some problems with custom annotations...
The function where I have problems is windowPerReadLength(). After following the pipeline I got an unexpected result:
As you can see it doesn't start in a position before the CDS and also the fraction is not a real fraction...
This is what the inputs look like:
The annotation is a custom GTF that has 5' UTRs annotated for some transcripts, but not for all of them (around 3800 transcripts).
Any suggestions on how to resolve this issue would be fantastic. Thank you!