New features are actively under construction (Fall 2024). Contact sethtem@umich.edu or open a GitHub issue for troubleshooting.
- See misc/announcements.md for high-level updates on this repository.
- See misc/fixes.md for any major bug fixes.
- See misc/usage.md to evaluate whether this methodology fits your study.
- See misc/cluster-options.md for suggested cluster options to use in the pipelines.
- See the closed issues on GitHub for comments Seth left about the pipeline.
Please cite the following if you use this package.

- Temple, S.D., Waples, R.K., Browning, S.R. (2024). Modeling recent positive selection using identity-by-descent segments. The American Journal of Human Genetics. https://doi.org/10.1016/j.ajhg.2024.08.023
- Temple, S.D., Thompson, E.A. (2024). Identity-by-descent in large samples. Preprint at bioRxiv, 2024.06.05.597656. https://www.biorxiv.org/content/10.1101/2024.06.05.597656v1
- Temple, S.D. (2024). Statistical Inference using Identity-by-Descent Segments: Perspectives on Recent Positive Selection. PhD thesis, University of Washington. https://www.proquest.com/docview/3105584569?sourcetype=Dissertations%20&%20Theses
Acronym: iSWEEP (incomplete Selective sweep With Extended haplotypes Estimation Procedure).
This software presents methods to study recent, strong positive selection.
The methods relate lengths of IBD segments to a coalescent model under selection.
We assume a single selected allele at a locus.
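To build intuition, here is a toy illustration (not the package's estimator) of the link between coalescent times and IBD segment lengths: for a pair of haplotypes coalescing t generations ago, recombination cuts the shared segment on each side of a focal point at roughly rate 2t per Morgan, so recent coalescence yields long segments.

```python
import random

def sample_ibd_length_cm(t_generations, rng):
    # For haplotypes coalescing t generations ago, each side of a focal
    # point is cut by recombination after an Exponential(2t) distance in
    # Morgans (2t meioses separate the pair through their common ancestor).
    left = rng.expovariate(2.0 * t_generations)
    right = rng.expovariate(2.0 * t_generations)
    return 100.0 * (left + right)  # convert Morgans to centiMorgans

rng = random.Random(1)
recent = [sample_ibd_length_cm(10, rng) for _ in range(20000)]
older = [sample_ibd_length_cm(100, rng) for _ in range(20000)]
print(sum(recent) / len(recent))  # close to 100/10 = 10 cM
print(sum(older) / len(older))    # close to 100/100 = 1 cM
```

Under strong positive selection, carriers of the favored allele coalesce more recently than under neutrality, which inflates the count of long IBD segments at the selected locus.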
To run the snakemake pipelines, see misc/usage.md.
The chromosome numbers in genetic maps should match the chromosome numbers in VCFs.
The genetic maps should be tab-separated.
This repository contains a Python package (src/) and Snakemake bioinformatics pipelines (workflow/).
You should run each snakemake pipeline from its workflow/some-pipeline/ directory. You should be in the isweep environment (mamba activate isweep) for analyses.
You should run the analyses using cluster jobs.
We have made README.md files in most subfolders.
See misc/installing-mamba.md to get a Python package manager.
```shell
git clone https://github.com/sdtemple/isweep.git
mamba env create -f isweep-environment.yml
mamba activate isweep
python -c 'import site; print(site.getsitepackages())'
bash get-software.sh software
```
The get-software.sh script uses wget to download external software into the software/ folder. See the workflow/other-methods/ folder for how we run the methods we compare against.
This is the overall procedure. You will find more details for each step in workflow/some-pipeline/README.md files.
Phase data with Beagle or SHAPEIT beforehand. Subset data in light of global ancestry and close relatedness:

- https://github.com/sdtemple/flare-pipeline
- https://github.com/YingZhou001/IBDkin
Run the selection scan pipeline (workflow/scan):

```shell
nohup snakemake -s Snakefile-scan.smk -c1 --cluster "[options]" --jobs X --configfile *.yaml &
```
- See misc/cluster-options.md for support with cluster options.
- Check the *.log files from ibd-ends; if a log recommends an estimated err, change the error rate in the YAML file.
- See workflow/scan/scripts/run-ibdne.sh for running IBDNe.
- See workflow/scan/scripts/manhattan.py for plotting the scan.
- Candidate regions of interest are summarized in a roi.tsv file.
Run the sweep modeling pipeline (workflow/roi):

```shell
nohup snakemake -s Snakefile-roi.smk -c1 --cluster "[options]" --jobs X --configfile *.yaml &
```
The flow chart below shows the steps ("rules") in the selection scan pipeline.
Diverging paths labeled "mle" and "scan" refer to different detection thresholds (3.0 and 2.0 cM, respectively).
See dag-roi.png for the steps in the sweep modeling pipeline.
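As a toy sketch (not the pipeline's code, with made-up segment lengths), the two detection thresholds simply filter IBD segments by length in cM:

```python
# Toy sketch: the "scan" and "mle" paths keep IBD segments longer than
# their respective detection thresholds. Segment lengths are made up.
segments_cm = [0.5, 1.8, 2.4, 3.1, 4.0, 7.2]

scan_hits = [x for x in segments_cm if x >= 2.0]  # "scan" threshold (2.0 cM)
mle_hits = [x for x in segments_cm if x >= 3.0]   # "mle" threshold (3.0 cM)
print(len(scan_hits), len(mle_hits))  # prints: 4 3
```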