kfuku52 / csubst

Molecular convergence detection
BSD 3-Clause "New" or "Revised" License

Questions for calculating convergent gene #43

Closed: YecLab closed 1 year ago

YecLab commented 1 year ago

Hi, thanks for your tool; it is very interesting! I have several questions:

  1. Is the "csubst analyze" command similar to the branch-site model in PAML?
  2. I filtered for convergent genes step by step, and I am not sure whether this is correct:
     (1) I took the "cutoff_stat" values from the "csubst_cb_stats.tsv" file (in your example, OCNany2spe=2.0 and omegaCany2spe=5.0).
     (2) If the "csubst_cb_2.tsv" file contains rows with OCNany2spe>2.0 & omegaCany2spe>5.0, does that indicate that the gene is a convergent gene? Do we need to check branch_id to make sure the branch is a foreground branch?

Many thanks for your help! I look forward to your reply!

kfuku52 commented 1 year ago

Is the "csubst analyze" command similar to the branch-site model in PAML?

It's closer to the branch model. omega_C is calculated for each branch combination over all sites.

If the "csubst_cb_2.tsv" file contains rows with OCNany2spe>2.0 & omegaCany2spe>5.0, does that indicate that the gene is a convergent gene? Do we need to check branch_id to make sure the branch is a foreground branch?

It depends on how you specified the foreground. is_fg should be Y if you correctly specified the branch in --foreground.

YecLab commented 1 year ago

Got it, many thanks! So a convergent gene in the foreground species should satisfy OCNany2spe>2.0 & omegaCany2spe>5.0 & is_fg=Y?

kfuku52 commented 1 year ago

The thresholds OCNany2spe>2.0 & omegaCany2spe>5.0 may be changed depending on your purpose, but yes, that's correct.
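
For anyone who wants to apply this filter programmatically, here is a minimal sketch using only the Python standard library. It assumes the table is tab-separated and has columns named exactly OCNany2spe, omegaCany2spe, and is_fg, as discussed in this thread. The function name, the `row_label` column, and the toy values are hypothetical and are not part of CSUBST. The default cutoffs are just the ones from the example above and should be adjusted to your purpose.

```python
import csv
import io


def filter_convergent_rows(tsv_handle, ocn_min=2.0, omega_min=5.0):
    """Return rows that exceed both cutoffs and are flagged is_fg == "Y".

    Hypothetical helper, not part of CSUBST. Column names follow this thread.
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    kept = []
    for row in reader:
        try:
            ocn = float(row["OCNany2spe"])
            omega = float(row["omegaCany2spe"])
        except (TypeError, ValueError):
            continue  # skip rows with empty or non-numeric values
        # Strict inequalities, matching OCNany2spe>2.0 & omegaCany2spe>5.0
        if ocn > ocn_min and omega > omega_min and row.get("is_fg") == "Y":
            kept.append(row)
    return kept


# Toy table with made-up values; "row_label" is a hypothetical column
# used only to tell the rows apart in this example.
toy_tsv = (
    "row_label\tOCNany2spe\tomegaCany2spe\tis_fg\n"
    "a\t3.0\t8.0\tY\n"   # passes both cutoffs, foreground
    "b\t3.0\t8.0\tN\n"   # passes cutoffs, not foreground
    "c\t1.0\t8.0\tY\n"   # OCNany2spe too low
    "d\t3.0\t4.0\tY\n"   # omegaCany2spe too low
    "e\t2.0\t5.0\tY\n"   # equal to the cutoffs, so not strictly greater
)

hits = filter_convergent_rows(io.StringIO(toy_tsv))
print([row["row_label"] for row in hits])
```

To run it on real output, pass an open file instead of the toy string, for example `open("csubst_cb_2.tsv", newline="")`, after checking that the column names in your file match.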

YecLab commented 1 year ago

Another question: your application seems to be able to handle gene duplicates. Can I use all orthogroups (rather than only single-copy ones) to run the model?

kfuku52 commented 1 year ago

Yes, there is no limitation regarding gene duplications and losses, at least in running CSUBST.