popgenmethods / smcpp

SMC++ infers population history from whole-genome sequence data.
GNU General Public License v3.0

Questions about SMC++ plotting #247

Open jaurbanChicago opened 11 months ago

jaurbanChicago commented 11 months ago

Hi,

First of all, I hope this email finds you well! I am trying to implement smc++ and I am using 1000 Genomes Project data as a test. I am using mappability bed files from https://share.eva.mpg.de/index.php/s/ygfMbzwxneoTPZj, and I've been playing a bit with some parameters.

I am using all samples from the YRI and CEU populations. My estimation command is:

smc++ estimate --outdir analysis_test/ --timepoints 1 350000 --spline cubic --base YRI.cubic.timepoints.allsamples --cores 8 1.25e-8 out_test/YRI.WGS.allsamples.chr*.smc.gz

and my plotting command is:

smc++ plot YRI.CEU.cubic.timepoints.allsamples.final.png -c -g 30 --cores 5 analysis_test/YRI.cubic.timepoints.allsamples.final.json analysis_test/CEU.cubic.timepoints.allsamples.final.json

I've been getting Ne trajectories whose shape makes sense for these populations, but the scaling on the x-axis is quite off. The program places the out-of-Africa bottleneck at about 10^2 years ago, whereas it actually happened ~50-80k years ago. I've tried changing some parameters, but I still get plots similar to the one below. Do you have any suggestions as to why the timing could be off? I would really appreciate some help here. Thanks a lot in advance.

[image]
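To put a number on the mismatch, here is a rough sanity check of the time scales involved. It is only a sketch and does not call SMC++ itself. It assumes that the x-axis is in years when `-g 30` is passed to `smc++ plot`, and that the `--timepoints` values are in generations; both are my reading of the options, not something I have confirmed. The helper names are made up for illustration.

```python
# Back-of-the-envelope time-scale check (not part of SMC++).
# Assumptions: `-g 30` means 30 years per generation and the plot x-axis is in years;
# the `--timepoints 1 350000` bounds are in generations.

GENERATION_TIME_YEARS = 30  # value passed to `smc++ plot -g`


def years_to_generations(years, generation_time=GENERATION_TIME_YEARS):
    """Convert a time in years to a time in generations."""
    return years / generation_time


def generations_to_years(generations, generation_time=GENERATION_TIME_YEARS):
    """Convert a time in generations to a time in years."""
    return generations * generation_time


# Expected out-of-Africa bottleneck (years ago) and where the plot puts it.
expected_years = (50_000, 80_000)
observed_years = 100  # ~10^2 years ago on the plot

expected_generations = tuple(years_to_generations(y) for y in expected_years)
observed_generations = years_to_generations(observed_years)

# How far off the plotted timing is, as a multiplicative factor.
fold_error = tuple(y / observed_years for y in expected_years)

# The model's time window, converted from generations to years.
window_years = tuple(generations_to_years(t) for t in (1, 350_000))

print("expected bottleneck (generations):", expected_generations)
print("observed bottleneck (generations):", observed_generations)
print("fold error in timing:", fold_error)
print("--timepoints window (years):", window_years)
```

Under these assumptions the plotted bottleneck is several hundred times too recent, which looks more like a global rescaling of the time axis than a small shift caused by the spline or timepoint settings.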