sjspielman / pyvolve

Python library to simulate evolutionary sequence data

simulate C10-C60 heterogeneous model #20

Closed yuanning-li closed 3 years ago

yuanning-li commented 3 years ago

Hi, is there a way to simulate amino acid sequences based on C10-C60 heterogeneous model?

sjspielman commented 3 years ago

Hi @yuanning-li,

Can you send me a reference for this model?

pyvolve is very flexible and lets you specify your own model matrices, which you may wish to do here. This is described in the user manual linked from the README.

Thanks, Stephanie

yuanning-li commented 3 years ago

Hey Stephanie:

Thanks for your quick response. Here is the reference for the model: https://academic.oup.com/bioinformatics/article/24/20/2317/260174. The model is also implemented in IQ-TREE and PhyloBayes.

Essentially, I want to simulate amino acid alignments with several substitution categories, so that equilibrium frequencies differ across sites.

I have been looking at the manual, but I am not sure how to implement this in pyvolve.

Best, Li

sjspielman commented 3 years ago

Hi @yuanning-li,

Ah yes, I do know this model. Thanks for the reminder! This model is not currently implemented, and it's not something I will be able to do in the near future.

What you can do is something like:

1) Define many partitions where each uses a different amino acid matrix (either a standard one or one you code up yourself as a custom matrix). This adds a level of ASRV comparable to what C10-C60 achieves in inference.

2) Use the mutation-selection model framework (aka the Halpern-Bruno model) to specify a model with specific amino acid preferences (or codon-level preferences, to get codon bias in there). Then create a separate partition for each of your mutation-selection models to achieve ASRV comparable to C10-C60.
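A minimal sketch of the first idea, with one partition per frequency profile, might look like the following. Several things here are assumptions rather than anything from pyvolve itself or the C10-C60 paper: the helper names (`toy_profile`, `partition_sizes`, `simulate`) are hypothetical, the profiles are toy placeholders and not the published C10-C60 values, and pairing LG exchangeabilities with each profile via the `state_freqs` parameter is just one reasonable choice. Frequencies are given in pyvolve's alphabetical single-letter amino acid order.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # alphabetical single-letter order


def toy_profile(favored, boost=5.0):
    """Toy frequency vector (NOT a published C10-C60 profile):
    residues in `favored` get `boost` times the weight of the others."""
    raw = [boost if aa in favored else 1.0 for aa in AMINO_ACIDS]
    total = sum(raw)
    return [x / total for x in raw]


def partition_sizes(n_sites, weights, seed=1):
    """Draw a profile class for every site according to the mixture
    weights and return the number of sites that landed in each class."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(weights)), weights=weights, k=n_sites)
    return [draws.count(k) for k in range(len(weights))]


def simulate(newick, profiles, sizes, seqfile="simulated.fasta"):
    """Simulate one pyvolve partition per profile class.
    Requires pyvolve (third-party); imported here so the helpers above
    can be used without it."""
    import pyvolve

    tree = pyvolve.read_tree(tree=newick)
    partitions = []
    for freqs, size in zip(profiles, sizes):
        if size == 0:
            continue  # skip classes that received no sites
        # LG exchangeabilities combined with this class's frequencies
        model = pyvolve.Model("LG", {"state_freqs": freqs})
        partitions.append(pyvolve.Partition(models=model, size=size))
    evolver = pyvolve.Evolver(tree=tree, partitions=partitions)
    evolver(seqfile=seqfile, ratefile=None, infofile=None)
    return evolver.get_sequences()


# Example usage (uncomment once pyvolve is installed):
# profiles = [toy_profile("ILVM"), toy_profile("DEKR"), toy_profile("STNQ")]
# sizes = partition_sizes(300, [0.5, 0.3, 0.2])
# seqs = simulate("((t1:0.2,t2:0.2):0.1,t3:0.3);", profiles, sizes)
```

Note that the simulated sites come out in contiguous blocks, one block per class, so you may want to shuffle the alignment columns afterwards if site order matters for your downstream analysis. To get closer to the actual C10-C60 models, the toy profiles and weights would need to be replaced with the published ones.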

I hope these ideas can help!

-Stephanie