vagarwal87 / TargetScanTools

A suite of tools for evolutionary and functional analysis of microRNA binding sites http://www.targetscan.org
MIT License
TargetScan supplementary tools

This repository accompanies our publication. For more information, please refer to:

Agarwal V, Subtelny AO, Thiru P, Ulitsky I, Bartel DP. Predicting microRNA targeting efficacy in Drosophila. Genome Biology, 19:152 (2018).

The code is released to enhance reproducibility and to serve as a suite of tools complementary to TargetScan, in the hope that it helps others working on new datasets.

These tools can be used in a variety of organisms to:

To compute context scores for flies or other insects while incorporating 3' UTR isoform information, we recommend using the code provided in TargetScan instead of the code in this repository.

If you find our code or our precomputed fly miRNA target predictions to be helpful for your work, please cite the paper above.

To better understand the methodological details for the evolutionary analyses, the following resources describe the original implementation [1], extension of parameters to worm and fly [2], and the re-implemented pipeline [3].

  1. Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved targets of microRNAs. Genome Research, 19:92-105 (2009).
  2. Jan CH, Friedman RC, Ruby JG, Bartel DP. Formation, regulation and evolution of Caenorhabditis elegans 3'UTRs. Nature, 469:97-102 (2011).
  3. Agarwal V, Bell GW, Nam J, Bartel DP. Predicting effective microRNA target sites in mammalian mRNAs. eLife, 4:e05005 (2015).

Dependencies for running entire pipeline (mostly optional)

Instructions for use

Not all code may work out of the box: some pieces depend on the computing environment, and some intermediate files are not provided because they are too large. For the R code to work properly, please copy the contents of the .Rprofile in this folder into your local .Rprofile. You might also need to add the directory containing the allfxns.pm module to PERL5LIB.
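The setup described above could look roughly like the following sketch. It assumes that .Rprofile and allfxns.pm both sit at the top level of your clone, which may not match the actual layout; REPO_DIR is a hypothetical variable introduced here for illustration, so point it at wherever these files actually live.

```shell
# Hypothetical location of your clone (assumption); defaults to the current directory.
REPO_DIR="${REPO_DIR:-$PWD}"

# Append the repository's .Rprofile to your own, but only if the file is present.
# Note: running this more than once appends the contents again.
if [ -f "$REPO_DIR/.Rprofile" ]; then
    cat "$REPO_DIR/.Rprofile" >> "$HOME/.Rprofile"
fi

# Put the directory assumed to contain allfxns.pm at the front of PERL5LIB,
# keeping any entries that were already set.
export PERL5LIB="$REPO_DIR${PERL5LIB:+:$PERL5LIB}"
```

To make the PERL5LIB setting persist across sessions, the export line can be added to your shell startup file.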

Please read the code closely and modify the commented pieces as appropriate to obtain the desired output in your environment. For example, you will need to install the additional R libraries and Perl modules that the code depends on. That said, if you find that crucial files are missing and the code is unusable as a result, or if you identify a major problem in the code, please raise a GitHub issue.

For each figure, change into its folder and run "bash runme.sh". Please read runme.sh first, as it gives a general overview of the commands that were run sequentially to pre-process the data and generate the figures. The script should be able to run on the precomputed data provided in the folder to generate the figures.
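As a convenience, the per-folder step can be wrapped in a small loop. This is only a sketch: run_figures is a hypothetical helper name, and it assumes that the figure folders sit directly under the repository root and that each one contains a runme.sh. It stops at the first script that fails, so you can inspect that folder before continuing.

```shell
# Hypothetical helper (not part of the repository): run every runme.sh found
# one level below the given directory, each from inside its own folder.
run_figures() {
    repo_dir="${1:-$PWD}"
    for script in "$repo_dir"/*/runme.sh; do
        # If the glob matched nothing, the literal pattern is returned; skip it.
        [ -f "$script" ] || continue
        fig_dir=$(dirname "$script")
        echo "Running runme.sh in $fig_dir"
        # Use a subshell so the caller's working directory is left unchanged.
        (cd "$fig_dir" && bash runme.sh) || return 1
    done
}
```

Usage would be something like `run_figures /path/to/TargetScanTools`, after reading each runme.sh and adjusting it for your environment.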

Additional notes

Our naming convention is slightly different in the code than in the paper. In particular, the "HYBRIDSCORE" and "PLFOLD" features in the code are equivalent to "3p_energy" and "SA" features in the paper, respectively.