amkozlov / raxml-ng

RAxML Next Generation: faster, easier-to-use and more flexible
GNU Affero General Public License v3.0

Quick and dirty phylogenetic tree of protein sequences #80

Closed mirix closed 4 years ago

mirix commented 4 years ago

Hello,

I have an alignment of some 5000 highly divergent protein sequences. I would like to derive a quick and dirty phylogenetic tree.

Online servers do not accept that many sequences, whereas parallel codes such as RAxML-NG and IQ-TREE take forever (over two weeks and counting) on a hexa-core Xeon at 3.6 GHz. Other tools require more than the 16 GB of RAM I have.

Is it possible to derive a quick and dirty tree with RAxML-NG? If so, what would the command line be? How long should I expect it to take?

Best,

Miro

amkozlov commented 4 years ago

Hi @mirix,

two weeks sounds like a bit too long even for 5000 seqs. Could you please post your log file?

If you need a "quick and dirty" tree, you can use parsimony (--start --tree pars{1}), or an ML tree from a single starting tree (e.g. --search1 or --search --tree pars{1}).

Best, Alexey
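
For readers following along, here is a rough sketch of how the three options above might be spelled out as full command lines. The alignment name, model, thread count, and seed are borrowed from the command posted later in this thread; the `--prefix` values are hypothetical names chosen only so the runs do not overwrite each other. The script only prints the commands for review rather than running them, and nothing here is a statement about expected runtimes.

```shell
#!/bin/sh
# Inputs taken from the command posted in this thread; adjust to your data.
MSA="PDBBind_ali.fasta"
MODEL="LG"
THREADS=6
SEED=2

# Options shared by all three variants.
COMMON="--msa $MSA --model $MODEL --threads $THREADS --seed $SEED"

# Option 1: build a single parsimony tree only, with no ML tree search.
# The prefix "quick_pars" is a hypothetical output name.
PARS_CMD="raxml-ng --start $COMMON --tree pars{1} --prefix quick_pars"

# Option 2: ML tree search from a single starting tree via --search1.
# The prefix "quick_search1" is a hypothetical output name.
SEARCH1_CMD="raxml-ng --search1 $COMMON --prefix quick_search1"

# Option 3: ML tree search from a single parsimony starting tree.
# The prefix "quick_ml" is a hypothetical output name.
SEARCH_PARS_CMD="raxml-ng --search $COMMON --tree pars{1} --prefix quick_ml"

# Dry run: print the commands so they can be inspected before use.
printf '%s\n' "$PARS_CMD" "$SEARCH1_CMD" "$SEARCH_PARS_CMD"
```

To actually launch one of the variants, paste the printed line into the terminal. Keeping a distinct prefix per run makes it easier to compare the resulting log files afterwards.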

mirix commented 4 years ago

T3.raxml.log

amkozlov commented 4 years ago

Well, you are using a wrong model name:

ERROR: ERROR model initialization |PROT| (LIBPLL-5002): Invalid rates symmetry definition: 

Please find a list of available protein models here: https://github.com/amkozlov/raxml-ng/wiki/Input-data#single-model

mirix commented 4 years ago

Sorry, I had overwritten the original log file...

That was the command line:

raxml-ng --msa PDBBind_ali.fasta --model LG --prefix T3 --threads 6 --seed 2

amkozlov commented 4 years ago

ok this looks better :)

but could you please post the complete log file with the correct model?

amkozlov commented 4 years ago

@mirix closing since I never got a correct log file from you. Please reopen if you are still interested in this one.