stschiff / msmc2

GNU General Public License v3.0

Question about plotting RCCR result and -s option in cross population analysis #58

Open hungweichen0327 opened 1 year ago

hungweichen0327 commented 1 year ago

Dear @stschiff,

  1. According to the general guide (https://github.com/stschiff/msmc-tools/blob/master/msmc-tutorial/guide.md), the partial script to plot the RCCR result is:

    xlim=c(1000,500000),ylim=c(0,1), type="n", xlab="Years ago", ylab="relative cross-coalescence rate"

I would like to know which is better: (1) setting "ylim = c(0,1)", or (2) rescaling the values to the range 0 to 1?

  2. Another question is about the -s option when running cross-population analysis.

    The -s flag tells MSMC to skip sites with ambiguous phasing.

Should I use this option for all 3 runs (within each of the two populations and between them), or only for the run between the two populations? The 3 runs are shown below:

(1) build/release/msmc2 -I 0,1,2,3 -o within1_msmc ...
(2) build/release/msmc2 -I 4,5,6,7 -o within2_msmc ...
(3) build/release/msmc2 -I 0-4,0-5,0-6,0-7,1-4,1-5,1-6,1-7,2-4,2-5,2-6,2-7,3-4,3-5,3-6,3-7 -o across_msmc ...

In other words, should I add the -s option to all of them, or only to (3)?

Thank you for the help.
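The index lists for the three runs above can be generated rather than typed by hand, which makes it easier to rerun everything with and without -s. The following is only a sketch: the variable names (`pop1`, `pop2`, `skip`) are hypothetical, the input files stay elided as `...` exactly as in the question, and the commands are printed rather than executed.

```sh
#!/bin/sh
# Haplotype indices per population, taken from the commands in the question.
pop1="0 1 2 3"
pop2="4 5 6 7"

# Within-population index lists: "0 1 2 3" becomes "0,1,2,3".
within1=$(echo "$pop1" | tr ' ' ',')
within2=$(echo "$pop2" | tr ' ' ',')

# Cross-population list: every pair i-j with i from pop1 and j from pop2.
across=""
for i in $pop1; do
  for j in $pop2; do
    across="${across:+$across,}$i-$j"
  done
done

# Assumption: set skip="" to print the same runs without -s for comparison.
skip="-s"

# Print the commands only; the input files are elided ("...") as in the question.
echo "build/release/msmc2 $skip -I $within1 -o within1_msmc ..."
echo "build/release/msmc2 $skip -I $within2 -o within2_msmc ..."
echo "build/release/msmc2 $skip -I $across -o across_msmc ..."
```

Printing both variants (with `skip="-s"` and `skip=""`) gives the two sets of runs to compare.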

stschiff commented 1 year ago

Hi @hungweichen0327.

1) Definitely do not scale. You can just plot it without ylim and see how it looks.

2) There is no clear answer. Try both. In my experience, -s doesn't make much of a difference for population size estimates, but it does for cross-population analyses. Try it!
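A small numeric sketch of why rescaling is a different operation from choosing axis limits. It assumes, as I read the guide, that the relative cross-coalescence rate is computed as 2 * lambda_01 / (lambda_00 + lambda_11). The three rows of lambda values are made up purely for illustration and are not MSMC2 output.

```sh
#!/bin/sh
# Hypothetical values, one time segment per row: lambda_00 lambda_01 lambda_11
data='2.0 0.1 2.0
2.0 1.0 2.0
2.0 2.2 2.0'

# RCCR per row, assuming the formula 2 * lambda_01 / (lambda_00 + lambda_11).
rccr=$(printf '%s\n' "$data" | awk '{ printf "%.2f\n", 2 * $2 / ($1 + $3) }')

# Min-max rescaling to [0, 1]: this rewrites every value, not just the view.
scaled=$(printf '%s\n' "$rccr" | awk '
  { v[NR] = $1
    if (NR == 1 || $1 < min) min = $1
    if (NR == 1 || $1 > max) max = $1 }
  END { for (i = 1; i <= NR; i++) printf "%.2f\n", (v[i] - min) / (max - min) }')

echo "RCCR:"
echo "$rccr"      # 0.05, 0.50, 1.10
echo "Rescaled:"
echo "$scaled"    # 0.00, 0.43, 1.00
```

In this toy case the middle value moves from 0.50 to 0.43 after rescaling, and the value above one is forced down to exactly 1.00. Axis limits, by contrast, leave the plotted numbers untouched and only change which part of the curve is visible.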

hungweichen0327 commented 1 year ago

Dear @stschiff,

Thank you for the kind reply. For question 1, is it correct to set "ylim = c(0,1)" when plotting the RCCR result? The RCCR value sometimes exceeds one at some time points, and setting ylim = c(0,1) gives a clean-looking RCCR curve, just like the figure in the MSMC2 guide below.

[Figure: ccrPlot]