dieterich-lab / rp-bp

Rp-Bp is a Bayesian approach to predict, at base-pair resolution, ribosome occupancy and translation.
MIT License

sORFs not found or filtered out #92

Closed (TamaraO closed this issue 6 years ago)

TamaraO commented 6 years ago

I am running Rp-Bp on a set of Ribo-seq data for a cell line with mass spec-validated short ORFs (so I know they are definitely there). Rp-Bp predicts some short ORFs but seems to miss many of them, even though I can see clear trinucleotide periodicity and Ribo-seq reads supporting those ORFs. Some of them are in the unfiltered ORF list but get filtered out, while others are not there at all. Are there parameters I can change to make Rp-Bp more "forgiving", so that more short ORFs are identified both before and after filtering?

eboileau commented 6 years ago

So these ORFs do appear in the profiles, but they are not predicted as being translated?

A number of options are handled via the configuration file; see "Rp-Bp specific options" and "selecting predicted ORFs" in the documentation. The smoothing options, for instance, may affect the Bayes factor estimates, while other options such as min_bf_mean only affect the final selection. Have you looked at the actual Bayes factor mean and variance for these ORFs (in *predicted-orfs.bed.gz)? This would be informative.
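To check the Bayes factor estimates for a handful of ORFs of interest, something along these lines may help. It is only a minimal sketch using the Python standard library: it assumes the *predicted-orfs.bed.gz file is tab-separated with a header line, and the column names `id`, `bayes_factor_mean` and `bayes_factor_var` are assumptions that should be checked against the header of your own file. The helper name `lookup_orfs` is made up for this example.

```python
import csv
import gzip


def lookup_orfs(bed_gz_path, orf_ids, id_col="id",
                mean_col="bayes_factor_mean", var_col="bayes_factor_var"):
    """Return {orf_id: (bf_mean, bf_var)} for the requested ORF ids.

    The column names are assumptions; adjust them to match the header
    of the actual *predicted-orfs.bed.gz file.
    """
    wanted = set(orf_ids)
    found = {}
    with gzip.open(bed_gz_path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        # Some BED-like files prefix the header line with '#'; strip it.
        reader.fieldnames = [name.lstrip("#") for name in reader.fieldnames]
        for row in reader:
            if row[id_col] in wanted:
                found[row[id_col]] = (float(row[mean_col]),
                                      float(row[var_col]))
    return found
```

ORFs that are missing from the returned dictionary are not in that file at all. ORFs that are present but have a low mean (or a high variance) are more likely to have been dropped at the final selection step.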

Other filters are applied by default, e.g. to ensure that the first frame gets enough reads, and more reads than the other two frames. Finally, in the filtered predictions only the longest translated ORF at each stop codon is selected, and among overlapping ORFs only the one with the highest estimated Bayes factor is kept, so it is possible that some of these smaller ORFs are filtered out. In this case, you may want to use the unfiltered predictions.
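To illustrate why a short ORF with good read support can still disappear from the filtered list, here is a toy sketch of the selection rules described above. It is not the Rp-Bp implementation: the `Orf` record, the function names and the `min_reads` threshold are hypothetical, overlap is assumed to mean same chromosome and strand, and ORFs are treated as unspliced intervals.

```python
from collections import namedtuple

# Hypothetical record: 0-based, half-open genomic interval plus a Bayes factor.
Orf = namedtuple("Orf", "orf_id seqname strand start end bf_mean")


def in_frame_dominates(x1, x2, x3, min_reads):
    """First frame has at least min_reads and more reads than frames 2 and 3."""
    return x1 >= min_reads and x1 > x2 and x1 > x3


def longest_per_stop(orfs):
    """Keep only the longest ORF among those sharing a stop codon."""
    best = {}
    for orf in orfs:
        # The stop is at the interval end on '+', at the start on '-'.
        stop = orf.end if orf.strand == "+" else orf.start
        key = (orf.seqname, orf.strand, stop)
        current = best.get(key)
        if current is None or (orf.end - orf.start) > (current.end - current.start):
            best[key] = orf
    return list(best.values())


def overlaps(a, b):
    """Same chromosome and strand, and the intervals intersect."""
    return (a.seqname == b.seqname and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def best_among_overlapping(orfs):
    """Greedy: visit ORFs by decreasing Bayes factor, drop any that overlap a kept one."""
    kept = []
    for orf in sorted(orfs, key=lambda o: o.bf_mean, reverse=True):
        if not any(overlaps(orf, k) for k in kept):
            kept.append(orf)
    return kept


candidates = [
    Orf("long", "1", "+", 100, 400, 5.0),
    Orf("short_same_stop", "1", "+", 310, 400, 9.0),  # shares the stop at 400
    Orf("small_upstream", "1", "+", 90, 150, 3.0),    # overlaps "long"
    Orf("other", "1", "+", 500, 560, 4.0),
]
selected = best_among_overlapping(longest_per_stop(candidates))
print([orf.orf_id for orf in selected])  # ['long', 'other']
```

In this toy case both small ORFs are removed even though one of them has the highest Bayes factor, which is why the unfiltered predictions can be the better place to look for short ORFs.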