ddarriba / modeltest

Best-fit model selection
GNU General Public License v3.0

Full alignment or polymorphic sites only? #32

Open apredeus opened 4 years ago

apredeus commented 4 years ago

Hello Diego,

I have a multiple alignment of 3000 bacterial genomes. The full alignment is about 5 Mb long, but only 50k sites are polymorphic (about 20k are AGCT only). Should I use the full alignment or the polymorphic sites only to determine the best model? Thank you in advance.

ddarriba commented 4 years ago

Hi "apredeus",

I think it is better to use the whole alignment, unless you don't care about the branch lengths and model parameters. You should not observe a significant difference in computational time or memory requirements, because those 20k sites will actually be compressed into just 4 patterns, each with its corresponding weight.

Best, Diego
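
To illustrate the compression mentioned above: identical alignment columns can be collapsed into one pattern that carries a weight equal to the number of sites sharing it, so the likelihood only has to be computed once per unique pattern. The following is a minimal sketch of that idea and not ModelTest-NG's actual implementation; the function name `compress_patterns` and the toy alignment are assumptions made for illustration.

```python
from collections import Counter


def compress_patterns(sequences):
    """Collapse identical alignment columns into unique patterns with weights.

    Hypothetical helper for illustration only; not part of ModelTest-NG.
    """
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all sequences must be non-empty in number and equal in length")
    # zip(*sequences) yields one tuple per column (site) of the alignment;
    # Counter tallies how many sites share each column pattern.
    counts = Counter(zip(*sequences))
    patterns = ["".join(column) for column in counts]
    weights = list(counts.values())
    return patterns, weights


# Toy alignment (made up): 3 taxa, 10 sites, 9 of them invariant.
alignment = [
    "AAAACCGGTA",
    "AAAACCGGTC",
    "AAAACCGGTA",
]
patterns, weights = compress_patterns(alignment)
for pattern, weight in zip(patterns, weights):
    print(pattern, weight)
```

In this toy case the nine invariant sites collapse into four patterns (one per nucleotide), which is why keeping the invariant sites adds little to the cost of each likelihood evaluation.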

apredeus commented 4 years ago

Thank you very much for your reply. The analysis was taking days, so I wasn't sure if I was doing something wrong.

ddarriba commented 4 years ago

With such a large number of taxa, the analysis can definitely take several days, particularly if you are running it sequentially. Which run arguments are you using?