popsim-consortium / analysis2

Analysis for the second consortium paper.

Outline of work needed for sweeps analysis #104

Open nspope opened 1 year ago

nspope commented 1 year ago

Meeting with @mufernando and @andrewkern to outline what needs doing for sweeps analysis

What we want to produce:

What has to be done:

  1. diploshic training PR is reviewed, needs some minor cleanup to be merged [Andy] -- done
  2. diploshic prediction workflow needs to be put together
     a. dump VCF per simulated window (the 5 Mb focal region, without simulated buffer) [Murillo] -- done
     b. apply diploshic, sliding across focal regions -- this'll output a score per window for soft-linked/hard-linked/neutral/soft/hard classification [Andy]
     c. pool soft+hard scores to get a binary "sweep vs not" score [Murillo/Nate]
     d. take max score across entire focal window to get test statistic for the window [Murillo/Nate]
     e. get critical value by calculating score for neutral/BGS simulations (as for CLR) [Murillo/Nate]
     f. keep training and prediction in separate workflows (e.g. the prediction step should go in the same workflow where CLR is calculated) [Murillo/Nate]
  3. write rule to generate figures based off Murillo's probgen draft [Murillo]