Closed mufernando closed 1 year ago
This is working with:
snakemake -c1 --snakefile workflows/sweep_simulate.snake
But I'm not sure how to integrate this into the main Snakefile without screwing stuff up -- as this workflow uses a different configfile from the DFE stuff. Do we define separate configfiles for each module in Snakefile? @andrewkern do you have a suggestion?
Also, I changed things so we are simulating (possibly overlapping) windows instead of the entire chromosome. For now, this'll let us expand out the simulation grid without a huge computational burden. Ultimately the plan is to simulate entire chromosomes but "trim" to the windows -- I'll add a toggle to the config for this behavior. (though this might only be possible in practice with a smaller chromosome)
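For illustration, the (possibly overlapping) windowing described above could be sketched as follows. This is a minimal sketch, not the code in this PR: the function name `make_windows`, its parameters, and the example sizes are all hypothetical.

```python
def make_windows(chrom_length, window_size, step):
    """Return half-open (left, right) windows covering [0, chrom_length).

    Windows overlap whenever step < window_size. The last window is
    clipped to the chromosome end, so it may be shorter than the rest.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    windows = []
    left = 0
    while left < chrom_length:
        right = min(left + window_size, chrom_length)
        windows.append((left, right))
        if right == chrom_length:
            break  # reached the chromosome end
        left += step
    return windows


if __name__ == "__main__":
    # Hypothetical sizes: 50 Mb chromosome, 10 Mb windows, 5 Mb step.
    for left, right in make_windows(50_000_000, 10_000_000, 5_000_000):
        print(left, right)
```

Each (left, right) pair could then parameterise one simulation job, with the step controlling how much neighbouring windows overlap.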
yeah you could definitely define a separate config file here and in this snakefile point to it
frankly the config stuff could use a bit of an overhaul -- last I looked at it there was a lot of hard coded redundant paths
Cool, thanks -- I'll avoid adding more stuff to Snakefile for the time being, then.
I simulated chromosome 22 with the Gamma_K17 DFE on exons and no scaling. It took me 1h18min on sesame. And simulating 10Mb of chrom 22 takes 12min.
time stdpopsim -vv -e slim --slim-scaling-factor 1 HomSap -d OutOfAfrica_3G09 YRI:10 --dfe Gamma_K17 --dfe-annotation ensembl_havana_104_CDS -c chr22 -o foo.ts &>log.txt
stdpopsim -vv -e slim --slim-scaling-factor 1 HomSap -d OutOfAfrica_3G09 4689.52s user 220.01s system 100% cpu 1:21:42.95 total
time stdpopsim -vv -e slim --slim-scaling-factor 1 HomSap -d OutOfAfrica_3G09 YRI:10 --dfe Gamma_K17 --dfe-annotation ensembl_havana_104_CDS -c chr22 --right 10000000 -o foo.ts &>log2.txt
stdpopsim -vv -e slim --slim-scaling-factor 1 HomSap -d OutOfAfrica_3G09 775.53s user 62.33s system 100% cpu 13:50.29 total
wow, very nice. using the cluster we could definitely get 100s of reps for full chr22.
@andrewkern yeah, but if we're also varying sweep location across a fine grid that'll be prohibitively costly, I think. @mufernando just benchmarked further and it takes ~12 minutes to simulate 20% of chr22 with no scaling, and 1.5min with a scaling factor of 4. So I think windowing + scaling is the way to go, as long as we don't do it too aggressively.
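To make the cost argument concrete, here is a rough back-of-the-envelope sketch. The per-simulation timings are the ones quoted in this thread; the grid size (50 sweep locations, 100 replicates) is purely hypothetical, and `grid_cost_hours` is an illustrative helper, not part of the workflow.

```python
def grid_cost_hours(minutes_per_sim, n_locations, n_reps):
    """Total CPU hours for a grid of sweep locations times replicates."""
    return minutes_per_sim * n_locations * n_reps / 60


# Timings quoted in the thread, in minutes per simulation.
FULL_CHR22 = 78.0       # whole chromosome, no scaling
WINDOW_UNSCALED = 12.0  # ~20% of chr22, no scaling
WINDOW_SCALED = 1.5     # same window, scaling factor 4

# Hypothetical grid, for illustration only.
N_LOCATIONS, N_REPS = 50, 100

if __name__ == "__main__":
    for label, minutes in [
        ("full chr22, unscaled", FULL_CHR22),
        ("window, unscaled", WINDOW_UNSCALED),
        ("window, scaling factor 4", WINDOW_SCALED),
    ]:
        hours = grid_cost_hours(minutes, N_LOCATIONS, N_REPS)
        print(f"{label}: {hours:.0f} CPU hours")
    print(f"speedup from scaling: {WINDOW_UNSCALED / WINDOW_SCALED:.0f}x")
```

Under these assumed grid dimensions, the gap between the full-chromosome and windowed, scaled runs is what makes a fine grid of sweep locations impractical without windowing.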
i agree for a fine grid, but i think it's perhaps worth our time to take one or a small subset of locations and ask, "does full chrom simulation matter much?"
essentially we'd be asking about chrom-wide effects of BGS
yeah I agree! e.g., lay down a sweep in the middle of chr22, and simulate the entire chrom and progressively smaller windows centered around the sweep. If there are boundary effects they'd be visible in patterns of diversity at the window edges, I guess? And a global effect would be reflected by average diversity in the window?
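A minimal sketch of how the nested windows for such a comparison could be generated, assuming each window halves the previous width. The name `nested_windows`, the `shrink` parameter, and the example chromosome length are hypothetical.

```python
def nested_windows(chrom_length, sweep_pos, n_levels, shrink=0.5):
    """Progressively smaller (left, right) windows around sweep_pos.

    The first window is the whole chromosome. Each later window has
    `shrink` times the previous width and is centred on the sweep,
    except that it is clipped at the chromosome ends.
    """
    windows = [(0, chrom_length)]
    width = chrom_length
    for _ in range(n_levels - 1):
        width = int(width * shrink)
        left = max(0, sweep_pos - width // 2)
        right = min(chrom_length, left + width)
        windows.append((left, right))
    return windows


if __name__ == "__main__":
    # Approximate chromosome length, for illustration only.
    length = 50_000_000
    for left, right in nested_windows(length, length // 2, n_levels=4):
        print(left, right, right - left)
```

Comparing diversity near the edges of each window against the same positions in the full-chromosome run would then expose boundary effects, while the window-wide mean would reflect any global effect.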
btw I started drafting the diploshic module for this part of the analysis here
Looks great, thanks Murillo! I'm assuming we'll be saving sims as vcf to be fed to the sweep detection methods, so I think we should also add a routine to dump vcf (for the focal window only) alongside the tree sequences.
@mufernando do we need results/simulated_data/sweeps/boundary_effect_bgs.png in the PR? seems like just an image to post on an issue?
also this is probably ready to come out of draft mode?
@andrewkern I thought that figure would make it to the supp material at some point?
okay sounds good
@mufernando do you mind if I push some commits with helper functions for sweepfinder? Or should I hold off for now
I think we should merge this and start separate PRs
@mufernando do you want to make any more edits to this at this point?
I think this is ready to be merged. @andrewkern got it to work independently from me. Now we need to tune the sweep parameters (the time of the sweep).
WIP