IBIS is a fast IBD Segment calling algorithm aimed at large, unphased genetic datasets.
First, ensure zlib, including development headers, is installed (e.g., the zlib1g-dev
package on Ubuntu).
Next, clone the repository by running
git clone --recurse-submodules https://github.com/williamslab/ibis.git
(Alternatively, git clone [repo] followed by git submodule update --init in the cloned directory does the same as the above.)
Now, compile by running
make
in the repository directory (i.e., run cd ibis then make).
To pull IBIS updates, use
git pull
git submodule update --remote
Convert input data into PLINK binary format (.bed, .bim, and .fam).
Running PLINK with --make-bed enables conversion from many other forms of genetic data into this file format.
Insert a genetic map into the bim file using add-map-plink.pl.
Example add-map-plink.pl command using bash:
./add-map-plink.pl my.bim [map directory]/genetic_map_GRCh37_chr{1..22}.txt > new.bim
This reads my.bim to create a file new.bim with the genetic map inserted. (For non-bash environments, supply the file for each chromosome after the bim file: ./add-map-plink.pl my.bim genetic_map_GRCh37_chr1.txt genetic_map_GRCh37_chr2.txt genetic_map_GRCh37_chr3.txt ...)
Run IBIS using the specifications described below.
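For shells without brace expansion, the per-chromosome map arguments can be generated with a small loop instead of typing all 22 file names. The following is a minimal sketch: the function name map_files and the MAP_DIR variable are hypothetical, and the commented plink line assumes the input happens to be a VCF named my.vcf.gz.

```shell
#!/bin/sh
# Hypothetical conversion step, assuming VCF input (adjust to your data):
# plink --vcf my.vcf.gz --make-bed --out my

# Print the 22 genetic map file names for a given directory, space separated.
# Relies on word splitting at the call site, so the directory name must not
# contain spaces.
map_files() {
    dir=$1
    chr=1
    list=""
    while [ "$chr" -le 22 ]; do
        list="$list $dir/genetic_map_GRCh37_chr$chr.txt"
        chr=$((chr + 1))
    done
    # Drop the leading space before printing.
    echo "${list# }"
}

# Usage (requires add-map-plink.pl and the map files to be present):
# ./add-map-plink.pl my.bim $(map_files "$MAP_DIR") > new.bim
```

The unquoted $(map_files ...) is intentional here, since each file name has to reach add-map-plink.pl as a separate argument.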
IBIS accepts its input .bed, .bim, and .fam files in one of two ways:
[bed file] [bim file] [fam file]
./ibis test1-chr1.bed test1-chr1.bim test1-chr1.fam -min_l 7 -mt 500 -er .004 -f test1Out
or
-b [prefix] or -bfile [prefix]
./ibis -bfile test1-chr1 -min_l 7 -mt 500 -er .004 -f test1Out
-2 or -ibd2
-hbd
-chr <value>
-t <value> or -threads <value>
-noConvert
-noConvert disables.
-maxDist <value>
-setIndexStart <value>
-setIndexEnd <value>
-er <value> or -errorRate <value>
-mL <value> or -min_l <value>
-mt <value>
-er2 <value> or -errorRate2 <value>
-mL2 <value> or -min_l2 <value>
-mt2 <value>
-erH <value> or -errorRateH <value>
-mLH <value> or -min_lH <value>
-mtH <value>
-f <filename> or -o <filename> or -file <filename>
The default output prefix is ibis, resulting in ibis.seg, ibis.coef (if using -printCoef), ibis.hbd (with -hbd), and ibis.incoef (with -hbd and -printCoef).
-bin or -binary
-gzip
-noFamID
<fam ID>:<indiv ID>
-printCoef
-a <value>
-d <value> or -degree <value>
-c
-c <value>
-d
IBIS produces a .seg or .bseg file and, when using -printCoef, a .coef file. Additionally, when using -hbd, it prints a .hbd file with homozygous by descent segments, and with the addition of -printCoef, a .incoef file.
Segment file format:
sample1 sample2 chrom phys_start_pos phys_end_pos IBD_type genetic_start_pos genetic_end_pos genetic_seg_length marker_count error_count error_density
IBD_type can be either IBD1 or IBD2.
error_count and error_density are negative numbers for IBD1 segments that precede IBD2 segments. The error information in them is not specifically tracked by IBIS.
If -bin is employed, the .bseg output will not be human readable, and bseg2seg can be used to convert .bseg files to .seg format.
Any number of .bseg files can be provided for conversion. Example:
./bseg2seg test1-chr1.fam test1Out.chrom1.bseg test1Out.chrom2.bseg test1Out.chrom3.bseg ...
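Because the .seg format is plain whitespace-delimited text, it can be summarized with standard tools. The sketch below totals IBD1 and IBD2 genetic length per sample pair using the column order listed above (columns 1, 2, 6, and 9). The function name sum_ibd_by_pair is hypothetical, and the sketch assumes the file has no header line.

```shell
#!/bin/sh
# Sum IBD1 and IBD2 genetic length per sample pair from one or more .seg files.
# Columns used: 1 sample1, 2 sample2, 6 IBD_type, 9 genetic_seg_length.
# Output: sample1, sample2, total IBD1 length, total IBD2 length (tab separated).
sum_ibd_by_pair() {
    awk '
        { key = $1 "\t" $2; pairs[key] = 1 }
        $6 == "IBD1" { ibd1[key] += $9 }
        $6 == "IBD2" { ibd2[key] += $9 }
        END {
            for (k in pairs)
                printf "%s\t%.2f\t%.2f\n", k, ibd1[k] + 0, ibd2[k] + 0
        }
    ' "$@" | sort
}

# Usage:
# sum_ibd_by_pair test1Out.seg
```

Passing several per-chromosome .seg files at once gives genome-wide totals per pair, since awk reads all of them as a single stream.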
Coef file format:
sample1 sample2 kinship_coefficient IBD2_fraction segment_count degree_of_relatedness
The value supplied with -a is included in the given kinship coefficients and is used for determining the degrees.
HBD file format:
sample_id chrom phys_start_pos phys_end_pos HBD_type genetic_start_pos genetic_end_pos genetic_seg_length marker_count error_count error_density
HBD_type is simply HBD.
Incoef file format:
sample_id inbreeding_coefficient segment_count
The -printCoef option requires IBIS to analyze genome-wide data, but it is possible to produce a coef file from a set of .seg or .bseg files. This is useful when a user runs IBIS on each chromosome independently.
seg2coef takes the following arguments:
./seg2coef [total map length cM] [fam file] [seg/bseg files ...]
The total map length in cM can be computed from the genetic positions in the .bim files. The maplen.awk script calculates it as follows:
./maplen.awk [bim files ...]
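To illustrate where this number comes from, the sketch below sums, for each chromosome, the span between the smallest and largest genetic position in column 3 of the .bim files. This is an assumption about what "total map length" means and is not necessarily identical to what maplen.awk reports; the function name total_map_length is hypothetical.

```shell
#!/bin/sh
# Approximate total map length from one or more .bim files.
# .bim columns: 1 chromosome, 2 variant ID, 3 genetic position, 4 bp position,
# 5 and 6 alleles. The result is in whatever unit column 3 uses.
total_map_length() {
    awk '
        {
            chr = $1
            pos = $3 + 0    # force a numeric comparison
            if (!(chr in min) || pos < min[chr]) min[chr] = pos
            if (!(chr in max) || pos > max[chr]) max[chr] = pos
        }
        END {
            total = 0
            for (c in min) total += max[c] - min[c]
            printf "%.4f\n", total
        }
    ' "$@"
}

# Usage:
# total_map_length data1.bim data2.bim
```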
Example seg2coef execution using data subdivided into chromosomes as data[chr].{bed,bim,fam}, with IBIS segments in output[chr].seg:
./maplen.awk data{1..22}.bim
./seg2coef [Total_length] data1.fam output{1..22}.seg > output.coef
The [Total_length] argument should be the value output by the maplen.awk command. This produces output.coef.
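The whole per-chromosome workflow can be laid out as a list of commands before running anything. The sketch below only prints the commands, so they can be reviewed or edited first. It assumes files named data[chr].{bed,bim,fam}, default IBIS thresholds, and that maplen.awk prints nothing but the total length; the function name per_chrom_commands is hypothetical.

```shell
#!/bin/sh
# Print the commands for a per-chromosome IBIS run followed by seg2coef.
# Arguments: first and last chromosome number (e.g., 1 22).
per_chrom_commands() {
    first=$1
    last=$2
    chr=$first
    bims=""
    segs=""
    while [ "$chr" -le "$last" ]; do
        # One IBIS run per chromosome, writing output<chr>.seg.
        printf '%s\n' "./ibis -bfile data$chr -f output$chr"
        bims="$bims data$chr.bim"
        segs="$segs output$chr.seg"
        chr=$((chr + 1))
    done
    # Capture the total map length, then combine all segment files.
    printf '%s\n' "total=\$(./maplen.awk$bims)"
    printf '%s\n' "./seg2coef \"\$total\" data$first.fam$segs > output.coef"
}

# Usage: review the printed commands, then run them in the repository directory.
# per_chrom_commands 1 22
```

If maplen.awk prints more than a bare number, the total= line would need adjusting to extract the value before it is passed to seg2coef.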
This project is licensed under the GPL-3.0; see the LICENSE file for details.