Algorithm for rapid, phase-free detection of long identical by descent segments
GNU General Public License v3.0

IBIS

IBIS is a fast IBD segment-calling algorithm aimed at large, unphased genetic datasets.

Compiling IBIS

First, ensure zlib, including its development headers, is installed (e.g., the zlib1g-dev package on Ubuntu).

Next, clone the repository by running

git clone --recurse-submodules https://github.com/williamslab/ibis.git

(Alternatively, running git clone [repo] and then git submodule update --init in the cloned directory has the same effect.)

Now, compile by running

make

in the repository directory (i.e., run cd ibis then make).

To pull IBIS updates, use

git pull
git submodule update --remote

Steps for running IBIS

  1. Convert input data into PLINK binary format (.bed, .bim, and .fam).

    • Running PLINK with --make-bed enables conversion from many other forms of genetic data into this file format.
  2. Insert a genetic map into the bim file using add-map-plink.pl.

    • A good human genetic map is the HapMap II map. As of this writing, the latest version for build 37 is available here.
    • Example add-map-plink.pl command using bash:
      ./add-map-plink.pl my.bim [map directory]/genetic_map_GRCh37_chr{1..22}.txt > new.bim
    • This uses my.bim to create a file, new.bim, with the genetic map inserted. (For non-bash environments, list the map file for each chromosome after the bim file: ./add-map-plink.pl my.bim genetic_map_GRCh37_chr1.txt genetic_map_GRCh37_chr2.txt genetic_map_GRCh37_chr3.txt ....)
    • NOT RECOMMENDED: If you do not add a genetic map to the bim file, and all genetic positions in the input are 0, IBIS falls back to a genetic map derived from the physical positions in the input, treating 1 Mb as 1 cM.
  3. Run IBIS using the specifications described below.
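Before running IBIS, it can help to confirm that the genetic map from step 2 actually made it into the bim file. The following is a minimal sketch, not part of IBIS: it assumes the standard PLINK .bim layout (genetic position in column 3, physical position in column 4), and the helper names `has_genetic_map` and `fallback_cm` as well as the sample lines are hypothetical. It also illustrates what the 1 Mb = 1 cM fallback amounts to.

```python
def has_genetic_map(bim_lines):
    """Return True if any marker has a non-zero genetic position."""
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if float(fields[2]) != 0.0:  # column 3: genetic position
            return True
    return False


def fallback_cm(phys_pos_bp):
    """Approximate genetic position in cM, treating 1 Mb as 1 cM."""
    return phys_pos_bp / 1_000_000


# Made-up .bim lines: chrom, variant ID, genetic pos, physical pos, alleles
sample_bim = [
    "1 rs_a 0 1500000 A G",
    "1 rs_b 0 2750000 C T",
]

if not has_genetic_map(sample_bim):
    # All genetic positions are 0, so the fallback approximation would apply.
    for line in sample_bim:
        fields = line.split()
        print(fields[1], fallback_cm(int(fields[3])))
```

If the check reports no genetic map, rerunning add-map-plink.pl is preferable to relying on the fallback.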

IBIS Usage

IBIS accepts its input .bed, .bim, and .fam files in one of two ways:

./ibis -bfile test1-chr1 -min_l 7 -mt 500 -er .004 -f test1Out

IBIS Options:

Execution options:

IBD2 threshold parameters: (use with -2 or -ibd2)

HBD threshold parameters: (use with -hbd)

Output controls:

Kinship and inbreeding coefficient file options:

IBIS Output

IBIS produces a .seg or .bseg file and, when using -printCoef, a .coef file. Additionally, when using -hbd, it prints a .hbd file with homozygous by descent segments, and with the addition of -printCoef, a .incoef file.

Segment file format:

sample1 sample2 chrom phys_start_pos phys_end_pos IBD_type genetic_start_pos genetic_end_pos genetic_seg_length marker_count error_count error_density
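For downstream analysis, the twelve columns above can be loaded into a typed record. This is a sketch under assumptions: fields are taken to be whitespace-delimited, the `Segment` class and `parse_seg_line` function are hypothetical names, and the example line (including the `IBD1` type label) is made up for illustration rather than taken from real IBIS output.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    sample1: str
    sample2: str
    chrom: str
    phys_start_pos: int
    phys_end_pos: int
    ibd_type: str
    genetic_start_pos: float
    genetic_end_pos: float
    genetic_seg_length: float
    marker_count: int
    error_count: int
    error_density: float


def parse_seg_line(line):
    """Parse one .seg line into a Segment (assumes whitespace-delimited fields)."""
    f = line.split()
    if len(f) != 12:
        raise ValueError(f"expected 12 fields, got {len(f)}")
    return Segment(f[0], f[1], f[2], int(f[3]), int(f[4]), f[5],
                   float(f[6]), float(f[7]), float(f[8]),
                   int(f[9]), int(f[10]), float(f[11]))


# Illustrative line only; the values are invented.
example_line = "idA idB 1 1000000 9000000 IBD1 1.5 10.0 8.5 1200 3 0.0025"
seg = parse_seg_line(example_line)
print(seg.sample1, seg.sample2, seg.genetic_seg_length)
```

Keeping the chromosome as a string avoids assumptions about how non-numeric chromosome labels are written.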

If -bin is used, the .bseg output is not human readable; bseg2seg can be used to convert .bseg files to .seg format.

Any number of .bseg files can be provided for conversion. Example:

./bseg2seg test1-chr1.fam test1Out.chrom1.bseg test1Out.chrom2.bseg test1Out.chrom3.bseg ...

Coef file format:

sample1 sample2 kinship_coefficient IBD2_fraction segment_count degree_of_relatedness
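A .coef file with these six columns can be filtered for closely related pairs with a few lines of code. The sketch below assumes whitespace-delimited fields and tolerates a possible header row by skipping any line whose numeric columns do not parse; whether the file carries a header is an assumption, as are the function name `read_coef`, the sample rows, and the 0.1 kinship cutoff. The degree column is kept as a raw string because its encoding is not described here.

```python
def read_coef(lines):
    """Parse .coef rows into dicts, skipping lines that are not data rows."""
    rows = []
    for line in lines:
        f = line.split()
        if len(f) != 6:
            continue
        try:
            row = {
                "sample1": f[0],
                "sample2": f[1],
                "kinship_coefficient": float(f[2]),
                "ibd2_fraction": float(f[3]),
                "segment_count": int(f[4]),
                "degree_of_relatedness": f[5],  # kept as text (encoding assumed unknown)
            }
        except ValueError:
            continue  # assumed to be a header line
        rows.append(row)
    return rows


# Invented rows for illustration, with a header line that gets skipped.
coef_lines = [
    "sample1 sample2 kinship_coefficient IBD2_fraction segment_count degree_of_relatedness",
    "idA idB 0.25 0.0 23 1",
    "idA idC 0.01 0.0 2 6",
]
rows = read_coef(coef_lines)
close_pairs = [(r["sample1"], r["sample2"])
               for r in rows if r["kinship_coefficient"] >= 0.1]
print(close_pairs)
```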

HBD file format:

sample_id chrom phys_start_pos phys_end_pos HBD_type genetic_start_pos genetic_end_pos genetic_seg_length marker_count error_count error_density
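One simple summary of a .hbd file is the total genetic length of HBD segments per sample. The following sketch assumes whitespace-delimited fields in the column order above (so genetic_seg_length is the eighth column); the function name `hbd_length_per_sample`, the `HBD` type label, and the sample lines are hypothetical.

```python
from collections import defaultdict


def hbd_length_per_sample(lines):
    """Sum genetic_seg_length (column 8) over all HBD segments of each sample."""
    totals = defaultdict(float)
    for line in lines:
        f = line.split()
        if len(f) != 11:
            continue  # skip blank or malformed lines
        totals[f[0]] += float(f[7])
    return dict(totals)


# Invented lines for illustration only.
hbd_lines = [
    "idA 1 1000000 9000000 HBD 1.5 10.0 8.5 1200 3 0.0025",
    "idA 2 5000000 9500000 HBD 20.0 24.5 4.5 600 1 0.0017",
    "idB 3 2000000 9000000 HBD 3.0 10.0 7.0 900 0 0.0",
]
totals = hbd_length_per_sample(hbd_lines)
print(totals)
```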

Incoef file format:

sample_id inbreeding_coefficient segment_count

seg2coef

The -printCoef option requires IBIS to analyze genome-wide data, but seg2coef can produce a coef file from a set of .seg or .bseg files. This is useful when IBIS is run on each chromosome independently.

seg2coef takes the following arguments:

./seg2coef [total map length cM] [fam file] [seg/bseg files ...]

The total map length in cM can be obtained from the genetic positions in the .bim file(s). The maplen.awk script calculates it as follows:

./maplen.awk [bim files ...]
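To make the quantity concrete, here is a sketch of one plausible way to compute a total map length from .bim data. It assumes the total is the sum, over chromosomes, of the span between the smallest and largest genetic position; the actual maplen.awk logic may differ, and the function name `total_map_length_cm` and the sample lines are hypothetical.

```python
def total_map_length_cm(bim_lines):
    """Sum per-chromosome genetic spans (max - min of column 3), in cM."""
    lo = {}
    hi = {}
    for line in bim_lines:
        f = line.split()
        if len(f) < 4:
            continue  # skip blank or malformed lines
        chrom, cm = f[0], float(f[2])
        lo[chrom] = min(cm, lo.get(chrom, cm))
        hi[chrom] = max(cm, hi.get(chrom, cm))
    return sum(hi[c] - lo[c] for c in lo)


# Invented markers on two chromosomes.
bim_lines = [
    "1 rs_a 0.5 100000 A G",
    "1 rs_b 10.5 900000 C T",
    "1 rs_c 100.5 5000000 A C",
    "2 rs_d 2.0 200000 G T",
    "2 rs_e 52.0 7000000 A G",
]
total_cm = total_map_length_cm(bim_lines)
print(total_cm)
```

For real runs, the value printed by maplen.awk itself is the one to pass to seg2coef.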

Example seg2coef execution using data subdivided into chromosomes as data[chr].{bed,bim,fam}, with IBIS segments in output[chr].seg:

./maplen.awk data{1..22}.bim
./seg2coef [Total_length] data1.fam output{1..22}.seg > output.coef

The [Total_length] argument should be the value output by the maplen.awk command. This produces output.coef.
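To see why a total map length is needed, consider the textbook definition of the kinship coefficient, k1/4 + k2/2, where k1 and k2 are the fractions of the genome shared IBD1 and IBD2. The sketch below applies that definition to summed segment lengths; it is not taken from the seg2coef source, whose exact computation is not described here. The function name `kinship_from_lengths` and the input values are hypothetical, and the IBD1 and IBD2 lengths are assumed not to overlap.

```python
def kinship_from_lengths(ibd1_cm, ibd2_cm, total_map_cm):
    """Textbook kinship: k1/4 + k2/2, with k1, k2 as genome fractions."""
    if total_map_cm <= 0:
        raise ValueError("total map length must be positive")
    k1 = ibd1_cm / total_map_cm  # fraction of genome shared IBD1
    k2 = ibd2_cm / total_map_cm  # fraction of genome shared IBD2
    return k1 / 4 + k2 / 2


# Invented totals: 1500 cM IBD1 and 750 cM IBD2 on a 3000 cM map.
kinship = kinship_from_lengths(1500.0, 750.0, 3000.0)
print(kinship)
```

Without the denominator, segment lengths alone cannot be turned into genome fractions, which is why per-chromosome runs need the map length supplied separately.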

License

This project is licensed under the GPL-3.0; see the LICENSE file for details.