Closed TamaraO closed 6 years ago
So these ORFs do appear in the profiles, but they are not predicted as being translated?
A number of options are handled via the configuration file; see "Rp-Bp specific options" and also "Selecting predicted ORFs". The smoothing options, for instance, may affect the Bayes factor estimates, while other options such as min_bf_mean will only affect the final selection. Have you looked at the actual Bayes factor mean and variance for these ORFs (in *predicted-orfs.bed.gz)? This would be informative.
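One way to check those values is to pull the rows for the ORFs of interest out of the predictions file. The snippet below is only a minimal sketch: it assumes the file is tab-separated with a header line, and the column names `id`, `bayes_factor_mean` and `bayes_factor_var` are assumptions that should be checked against the header of your own *predicted-orfs.bed.gz. The helper `load_orf_stats` is hypothetical and not part of Rp-Bp.

```python
import csv
import gzip


def load_orf_stats(path, orf_ids, id_col="id",
                   mean_col="bayes_factor_mean", var_col="bayes_factor_var"):
    """Return {orf_id: (bf_mean, bf_var)} for the requested ORF ids.

    The column names are assumptions; adjust them to match the file header.
    """
    wanted = set(orf_ids)
    stats = {}
    with gzip.open(path, "rt", newline="") as handle:
        # Read the header ourselves so that a leading '#' (if any) is ignored.
        header = handle.readline().rstrip("\r\n").lstrip("#").split("\t")
        reader = csv.DictReader(handle, fieldnames=header, delimiter="\t")
        for row in reader:
            if row[id_col] in wanted:
                stats[row[id_col]] = (float(row[mean_col]), float(row[var_col]))
    return stats
```

Calling `load_orf_stats("sample.predicted-orfs.bed.gz", ["my_orf_id"])` (both names are placeholders) returns the mean and variance for each ORF that is present in the file. An ORF missing from the result was not reported in that file at all, which is a different situation from an ORF that is reported with a low Bayes factor mean.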
Other filters are implemented by default to ensure, e.g., that the first frame gets enough reads, and more reads than the other two frames. Finally, in the filtered predictions, only the longest translated ORF at each stop codon is selected, and among overlapping ORFs only the one with the highest estimated Bayes factor is kept. It is therefore possible that some of these smaller ORFs are filtered out. In this case, you may want to use the unfiltered predictions.
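To make the two rules above concrete, here is a small illustrative sketch. It is not Rp-Bp's actual implementation: the `Orf` record, the `min_reads` threshold and both function names are hypothetical, and ORF length is approximated by the genomic span, which ignores splicing.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    orf_id: str
    seqname: str
    strand: str
    start: int  # genomic start (0-based)
    end: int    # genomic end (exclusive)


def first_frame_dominates(x1, x2, x3, min_reads):
    """True if frame 1 has at least min_reads and more reads than frames 2 and 3."""
    return x1 >= min_reads and x1 > x2 and x1 > x3


def longest_per_stop(orfs):
    """Keep only the longest ORF among those sharing the same stop codon."""
    best = {}
    for orf in orfs:
        # The stop codon sits at the genomic end on '+' and at the start on '-'.
        stop = orf.end if orf.strand == "+" else orf.start
        key = (orf.seqname, orf.strand, stop)
        current = best.get(key)
        if current is None or (orf.end - orf.start) > (current.end - current.start):
            best[key] = orf
    return list(best.values())
```

Under these assumptions, a short ORF that shares its stop codon with a longer in-frame candidate never survives `longest_per_stop`, even when its own periodicity is clear, which is one reason to look at the unfiltered predictions.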
I am running Rp-Bp on a bunch of Ribo-seq data for a cell line with mass spec-validated short ORFs (so I know they are definitely there). Rp-Bp is able to predict some short ORFs, but seems to miss a lot of them, although I can see clear trinucleotide periodicity and Ribo-seq reads supporting those ORFs. Some of them are in the unfiltered ORF list but get filtered out, while others are not there at all. Are there some parameters I can change to make Rp-Bp more "forgiving", to allow identification of more short ORFs both before and after filtering?