davideyre / runListCompare

runListCompare: maximum likelihood sequence comparison, corrected for recombination

runListCompare.py and its associated scripts provide a Python wrapper for generating maximum likelihood phylogenies from a list of fasta consensus sequence files, all obtained by mapping to the same reference. Large numbers of samples are first handled in parallel and grouped into clusters of similar sequences using a SNP threshold; a maximum likelihood tree is then calculated for each cluster using either PhyML or IQTree. Correction for recombination is done with ClonalFrameML.
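To make the clustering step concrete, here is a minimal sketch of grouping samples by a SNP threshold. It is not the code used by runListCompare.py: the function names are hypothetical, and it assumes single-linkage clustering (two samples share a cluster if a chain of pairwise distances at or below the threshold connects them) and that positions called N or - are ignored when counting differences.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count differing positions between two equal-length sequences.

    Assumption: positions where either sequence has N or a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    ignore = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in ignore and y not in ignore
    )


def cluster_by_snp_threshold(seqs, threshold):
    """Group sample names into single-linkage clusters.

    seqs maps sample name -> consensus sequence (all the same length).
    """
    names = list(seqs)
    parent = {n: n for n in names}  # union-find forest, one root per cluster

    def find(n):
        # Walk up to the root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Merge the clusters of every pair within the threshold.
    for a, b in combinations(names, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())
```

Calling cluster_by_snp_threshold with a dict of sample names to sequences and an integer threshold returns a sorted list of clusters, each a sorted list of sample names. The all-pairs loop is quadratic in the number of samples, which is why a real pipeline would spread the comparisons across processes.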

Requirements

Installation

  1. Install dependencies using Conda (or Mamba to save time):
    • conda create -n runlistcompare -c bioconda python=3 biopython=1.73 clonalframeml iqtree pytest treeswift networkx phyml
  2. Download and decompress the latest release, and cd into it
  3. Activate and test installation
    • conda activate runlistcompare
    • pytest (takes about 2 minutes)
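If pytest fails, a quick first check is whether the external programs are visible inside the activated environment. The sketch below is a hypothetical helper, not part of the release; the executable names in DEFAULT_TOOLS are assumptions about what the Conda packages install and may need adjusting.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
DEFAULT_TOOLS = ["ClonalFrameML", "iqtree", "phyml"]


def missing_tools(tools=DEFAULT_TOOLS):
    """Return the names in tools that cannot be found on PATH."""
    return [t for t in tools if shutil.which(t) is None]
```

Run missing_tools() from a Python session started after conda activate runlistcompare; an empty list means every listed name was found on PATH.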

Python 2.7 version

  1. Download and decompress the release, and cd into it
  2. conda env create -f conda_python2.yml
  3. conda activate runlistcompare2

Usage

python runListCompare.py tests/data/ec/ec.ini

Here tests/data/ec/ec.ini is an ini file containing the desired parameters. It is advisable to run the above command first to check that everything works with the included demo data. Input sequences are listed in a tab-separated file; an example is provided in tests/data/ec/ec.seqlist.txt. The first column provides the tip labels of the final trees and can be at most 8 characters long, a limit imposed by ClonalFrameML.
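Because over-long labels are easy to miss, it can help to check a sequence list before starting a run. The following is a minimal sketch with a hypothetical function name, not part of runListCompare; it assumes only what is stated above (tab-separated, label in the first column, 8-character limit) and treats every non-blank line as data, so a header line, if your file has one, would be checked too.

```python
import csv

MAX_LABEL_LEN = 8  # limit on the first column described above


def check_seqlist_labels(path, max_len=MAX_LABEL_LEN):
    """Return (line_number, label) pairs whose first column exceeds max_len."""
    too_long = []
    with open(path, newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        for line_no, row in enumerate(rows, start=1):
            if not row or not row[0].strip():
                continue  # skip blank lines
            label = row[0].strip()
            if len(label) > max_len:
                too_long.append((line_no, label))
    return too_long
```

For example, check_seqlist_labels("tests/data/ec/ec.seqlist.txt") returns an empty list when every label fits within the limit.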

Important configurable parameters to consider include:

Output files


David Eyre & Bede Constantinides
david.eyre@bdi.ox.ac.uk
17 April 2019