datasnakes / OrthoEvolution

An easy-to-use, comprehensive Python package that aids in the analysis and visualization of orthologous genes. 🐵
https://orthoevolution.readthedocs.io/en/master/

ETE3PAML - Clean and polish #35

Closed sdhutchins closed 6 years ago

sdhutchins commented 7 years ago

@sdhutchins needs to simplify this and integrate IQ-TREE.

grabear commented 7 years ago

The command-line wrapper for IQ-TREE is pretty basic, but it works. For now it only exposes the `-s` (input alignment) and `-st` (sequence type) parameters.
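
For anyone following along, a wrapper at that level could look roughly like the sketch below. This is not the code in the repository: the function names `build_iqtree_command` and `run_iqtree` are made up for illustration, and it assumes an `iqtree` executable is on the `PATH`. Only the `-s` and `-st` flags mentioned above are used.

```python
import shutil
import subprocess
from typing import List, Optional


def build_iqtree_command(alignment: str,
                         seq_type: Optional[str] = None,
                         executable: str = "iqtree") -> List[str]:
    """Build the argument list for a basic IQ-TREE run (hypothetical helper)."""
    command = [executable, "-s", alignment]
    if seq_type is not None:
        # When -st is omitted, IQ-TREE auto-detects the sequence type.
        command += ["-st", seq_type]
    return command


def run_iqtree(alignment: str,
               seq_type: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run IQ-TREE; raises if the binary is missing or exits with an error."""
    command = build_iqtree_command(alignment, seq_type)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    return subprocess.run(command, check=True, capture_output=True, text=True)
```

Keeping the command construction separate from the actual `subprocess` call makes the wrapper easy to test without IQ-TREE installed.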

grabear commented 7 years ago

` iqtree --help
IQ-TREE version 1.5.5 for Linux 64-bit built Jun 2 2017
Copyright (c) 2011-2017 by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt, and Arndt von Haeseler.

Usage: iqtree -s [OPTIONS]

GENERAL OPTIONS:
  -? or -h  Print this help dialog
  -s  Input alignment in PHYLIP/FASTA/NEXUS/CLUSTAL/MSF format
  -st  BIN, DNA, AA, NT2AA, CODON, MORPH (default: auto-detect)
  -q  Edge-linked partition model (file in NEXUS/RAxML format)
  -spp  Like -q option but allowing partition-specific rates
  -sp  Edge-unlinked partition model (like -M option of RAxML)
  -t or -t BIONJ or -t RANDOM  Starting tree (default: 99 parsimony tree and BIONJ)
  -te  Like -t but fixing user tree (no tree search performed)
  -o  Outgroup taxon name for writing .treefile
  -pre  Prefix for all output files (default: aln/partition)
  -seed  Random seed number, normally used for debugging purpose
  -v, -vv, -vvv  Verbose mode, printing more messages to screen
  -quiet  Quiet mode, suppress printing to screen (stdout)
  -keep-ident  Keep identical sequences (default: remove & finally add)
  -safe  Safe likelihood kernel to avoid numerical underflow
  -mem RAM  Maximal RAM usage for memory saving mode

CHECKPOINTING TO RESUME STOPPED RUN:
  -redo  Redo analysis even for successful runs (default: resume)
  -cptime  Minimum checkpoint time interval (default: 20)

LIKELIHOOD MAPPING ANALYSIS:
  -lmap <#quartets>  Number of quartets for likelihood mapping analysis
  -lmclust  NEXUS file containing clusters for likelihood mapping
  -wql  Print quartet log-likelihoods to .quartetlh file

NEW STOCHASTIC TREE SEARCH ALGORITHM:
  -ninit  Number of initial parsimony trees (default: 100)
  -ntop  Number of top initial trees (default: 20)
  -nbest  Number of best trees retained during search (defaut: 5)
  -n <#iterations>  Fix number of iterations to stop (default: auto)
  -nstop  Number of unsuccessful iterations to stop (default: 100)
  -pers  Perturbation strength for randomized NNI (default: 0.5)
  -sprrad  Radius for parsimony SPR search (default: 6)
  -allnni  Perform more thorough NNI search (default: off)
  -g  (Multifurcating) topological constraint tree file

ULTRAFAST BOOTSTRAP:
  -bb <#replicates>  Ultrafast bootstrap (>=1000)
  -bsam GENE|GENESITE  Resample GENE or GENE+SITE for partition (default: SITE)
  -wbt  Write bootstrap trees to .ufboot file (default: none)
  -wbtl  Like -wbt but also writing branch lengths
  -nm <#iterations>  Maximum number of iterations (default: 1000)
  -nstep <#iterations>  #Iterations for UFBoot stopping rule (default: 100)
  -bcor  Minimum correlation coefficient (default: 0.99)
  -beps  RELL epsilon to break tie (default: 0.5)

STANDARD NON-PARAMETRIC BOOTSTRAP:
  -b <#replicates>  Bootstrap + ML tree + consensus tree (>=100)
  -bc <#replicates>  Bootstrap + consensus tree
  -bo <#replicates>  Bootstrap only

SINGLE BRANCH TEST:
  -alrt <#replicates>  SH-like approximate likelihood ratio test (SH-aLRT)
  -alrt 0  Parametric aLRT test (Anisimova and Gascuel 2006)
  -abayes  approximate Bayes test (Anisimova et al. 2011)
  -lbp <#replicates>  Fast local bootstrap probabilities

MODEL-FINDER:
  -m TESTONLY  Standard model selection (like jModelTest, ProtTest)
  -m TEST  Standard model selection followed by tree inference
  -m MF  Extended model selection with FreeRate heterogeneity
  -m MFP  Extended model selection followed by tree inference
  -m TESTMERGEONLY  Find best partition scheme (like PartitionFinder)
  -m TESTMERGE  Find best partition scheme followed by tree inference
  -m MF+MERGE  Find best partition scheme incl. FreeRate heterogeneity
  -m MFP+MERGE  Like -m MF+MERGE followed by tree inference
  -rcluster  Percentage of partition pairs (relaxed clustering alg.)
  -mset program  Restrict search to models supported by other programs (raxml, phyml or mrbayes)
  -mset m1,...,mk  Restrict search to models in a comma-separated list (e.g. -mset WAG,LG,JTT)
  -msub source  Restrict search to AA models for specific sources (nuclear, mitochondrial, chloroplast or viral)
  -mfreq f1,...,fk  Restrict search to using a list of state frequencies (default AA: -mfreq FU,F; codon: -mfreq ,F1x4,F3x4,F)
  -mrate r1,...,rk  Restrict search to a list of rate-across-sites models (e.g. -mrate E,I,G,I+G,R is used for -m MF)
  -cmin  Min #categories for FreeRate model [+R] (default: 2)
  -cmax  Max #categories for FreeRate model [+R] (default: 10)
  -merit AIC|AICc|BIC  Optimality criterion to use (default: all)
  -mtree  Perform full tree search for each model considered
  -mredo  Ignore model results computed earlier (default: reuse)
  -madd mx1,...,mxk  List of mixture models to also consider
  -mdef  A model definition NEXUS file (see Manual)

SUBSTITUTION MODEL:
  -m
    DNA: HKY (default), JC, F81, K2P, K3P, K81uf, TN/TrN, TNef, TIM, TIMef, TVM, TVMef, SYM, GTR, or 6-digit model specification (e.g., 010010 = HKY)
    Protein: LG (default), Poisson, cpREV, mtREV, Dayhoff, mtMAM, JTT, WAG, mtART, mtZOA, VT, rtREV, DCMut, PMB, HIVb, HIVw, JTTDCMut, FLU, Blosum62, GTR20
    Protein mixture: C10,...,C60, EX2, EX3, EHO, UL2, UL3, EX_EHO, LG4M, LG4X
    Binary: JC2 (default), GTR2
    Empirical codon: KOSI07, SCHN05
    Mechanistic codon: GY (default), MG, MGK, GY0K, GY1KTS, GY1KTV, GY2K, MG1KTS, MG1KTV, MG2K
    Semi-empirical codon: XX_YY where XX is empirical and YY is mechanistic model
    Morphology/SNP: MK (default), ORDERED
    Otherwise: Name of file containing user-model parameters (rate parameters and state frequencies)
  -m +F or +FO or +FU or +FQ (default: auto)  counted, optimized, user-defined, equal state frequency
  -m +F1x4 or +F3x4  Codon frequencies
  -m +ASC  Ascertainment bias correction for morphological/SNP data
  -m "MIX{m1,...mK}"  Mixture model with K components
  -m "FMIX{f1,...fK}"  Frequency mixture model with K components
  -mwopt  Turn on optimizing mixture weights (default: auto)

RATE HETEROGENEITY AMONG SITES:
  -m +I or +G[n] or +I+G[n] or +R[n]  Invar, Gamma, Invar+Gamma, or FreeRate model where n is number of categories (default: n=4)
  -a  Gamma shape parameter for site rates (default: estimate)
  -amin  Min Gamma shape parameter for site rates (default: 0.02)
  -gmedian  Median approximation for +G site rates (default: mean)
  --opt-gamma-inv  More thorough estimation for +I+G model parameters
  -i  Proportion of invariable sites (default: estimate)
  -wsr  Write site rates to .rate file
  -mh  Computing site-specific rates to .mhrate file using Meyer & von Haeseler (2003) method

SITE-SPECIFIC FREQUENCY MODEL:
  -ft  Input tree to infer site frequency model
  -fs  Input site frequency model file
  -fmax  Posterior maximum instead of mean approximation

CONSENSUS RECONSTRUCTION:
  -t  Set of input trees for consensus reconstruction
  -minsup  Min split support in range [0,1]; 0.5 for majority-rule consensus (default: 0, i.e. extended consensus)
  -bi  Discarding trees at beginning of
  -con  Computing consensus tree to .contree file
  -net  Computing consensus network to .nex file
  -sup  Assigning support values for to .suptree
  -suptag  Node name (or ALL) to assign tree IDs where node occurs

ROBINSON-FOULDS DISTANCE:
  -rf_all  Computing all-to-all RF distances of trees in
  -rf  Computing all RF distances between two sets of trees stored in and
  -rf_adj  Computing RF distances of adjacent trees in

TREE TOPOLOGY TEST:
  -z  Evaluating a set of user trees
  -zb <#replicates>  Performing BP,KH,SH,ELW tests for trees passed via -z
  -zw  Also performing weighted-KH and weighted-SH tests
  -au  Also performing approximately unbiased (AU) test

GENERATING RANDOM TREES:
  -r  Create a random tree under Yule-Harding model
  -ru  Create a random tree under Uniform model
  -rcat  Create a random caterpillar tree
  -rbal  Create a random balanced tree
  -rcsg  Create a random circular split network
  -rlen  min, mean, and max branch lengths of random trees

MISCELLANEOUS:
  -wt  Write locally optimal trees into .treels file
  -blfix  Fix branch lengths of user tree passed via -te
  -blscale  Scale branch lengths of user tree passed via -t
  -blmin  Min branch length for optimization (default 0.000001)
  -blmax  Max branch length for optimization (default 100)
  -wsr  Write site rates and categories to .rate file
  -wsl  Write site log-likelihoods to .sitelh file
  -wslr  Write site log-likelihoods per rate category
  -wslm  Write site log-likelihoods per mixture class
  -wslmr  Write site log-likelihoods per mixture+rate class
  -wspr  Write site probabilities per rate category
  -wspm  Write site probabilities per mixture class
  -wspmr  Write site probabilities per mixture+rate class
  -wpl  Write partition log-likelihoods to .partlh file
  -fconst f1,...,fN  Add constant patterns into alignment (N=#nstates)
  -me  LogL epsilon for parameter estimation (default 0.01)
  --no-outfiles  Suppress printing output files `
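
If the wrapper grows beyond `-s` and `-st`, a few of the options in the help output above could be exposed with light validation. The sketch below is only one possible shape, not the project's implementation: `iqtree_args` and `SEQ_TYPES` are hypothetical names, and the checks simply mirror what the help text states (the allowed `-st` values and the `>=1000` note on `-bb`).

```python
from typing import List, Optional

# Sequence types listed for -st in the help output above.
SEQ_TYPES = {"BIN", "DNA", "AA", "NT2AA", "CODON", "MORPH"}


def iqtree_args(alignment: str,
                seq_type: Optional[str] = None,
                model: Optional[str] = None,
                ufboot: Optional[int] = None,
                prefix: Optional[str] = None) -> List[str]:
    """Assemble an IQ-TREE argument list (hypothetical helper, does not run anything)."""
    args = ["iqtree", "-s", alignment]
    if seq_type is not None:
        if seq_type not in SEQ_TYPES:
            raise ValueError(f"unknown sequence type: {seq_type!r}")
        args += ["-st", seq_type]
    if model is not None:
        # e.g. "MFP" for extended model selection followed by tree inference.
        args += ["-m", model]
    if ufboot is not None:
        # The help text marks ultrafast bootstrap as requiring >=1000 replicates.
        if ufboot < 1000:
            raise ValueError("ultrafast bootstrap needs at least 1000 replicates")
        args += ["-bb", str(ufboot)]
    if prefix is not None:
        args += ["-pre", prefix]
    return args
```

For example, `iqtree_args("aln.phy", seq_type="AA", model="MFP", ufboot=1000)` yields a list that can be handed straight to `subprocess.run`.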

sdhutchins commented 7 years ago

I can't stand giving the thumbs up here..soooo... :smiley:

sdhutchins commented 7 years ago

:stuck_out_tongue_winking_eye:

sdhutchins commented 7 years ago

@grabear https://www.webpagefx.com/tools/emoji-cheat-sheet/

grabear commented 7 years ago

:stuck_out_tongue_closed_eyes:

sdhutchins commented 6 years ago

@grabear I'm going to close this out tonight.

grabear commented 6 years ago

@sdhutchins Good deal!

sdhutchins commented 6 years ago

Commit aada0313460b758afa614bf7b9767a4ccd56e1ed has the updated ETE3PAML class.

Any thoughts or comments @grabear?

sdhutchins commented 6 years ago

Addressed by #99