
HAHap: A read-based haplotyping method using hierarchical assembly

About

HAHap is a method for inferring haplotypes from sequencing data. It attempts to eliminate the influence of noise through the assembly process, while retaining the spirit of minimum error correction under certain conditions. We developed an adjusted multinomial probabilistic metric for evaluating the reliability of a variant pair, and the derived scores guide the assembly process.

HAHap takes BAM files as input and was validated using short reads from the Illumina HiSeq platform.
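For readers unfamiliar with the minimum error correction (MEC) criterion mentioned above, the following is a minimal sketch of how an MEC score is commonly computed. It is not HAHap's implementation: the function names, the read encoding (0/1 alleles, `None` for sites a read does not cover), and the toy reads are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

# Illustrative encoding: 0 or 1 for the observed allele, None if the read
# does not cover that variant site.
Allele = Optional[int]


def mismatches(read: Sequence[Allele], hap: Sequence[int]) -> int:
    """Count covered sites where the read disagrees with the haplotype."""
    return sum(1 for r, h in zip(read, hap) if r is not None and r != h)


def mec_score(reads: List[Sequence[Allele]], hap: Sequence[int]) -> int:
    """MEC cost of a haplotype and its complement over a set of reads.

    Each read is assigned to whichever of the two haplotypes it matches
    better; the score is the total number of remaining mismatches.
    """
    comp = [1 - a for a in hap]
    return sum(min(mismatches(r, hap), mismatches(r, comp)) for r in reads)


# Toy reads over four heterozygous sites (made-up data).
reads = [
    [0, 0, None, None],
    [None, 0, 1, None],
    [1, 1, None, None],
    [None, 1, 0, 0],
    [None, None, 1, 0],  # disagrees with both haplotypes at one site
]
print(mec_score(reads, [0, 0, 1, 1]))
```

A lower score means fewer allele calls would have to be corrected for every read to be consistent with one of the two haplotypes.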

Required

HAHap is a pure-Python program. It requires the following packages.

Usage

Clone the repository and run bin/HAHap.

git clone https://github.com/ifishlin/HAHap
cd HAHap/bin
python HAHap phase vcf bam out
usage: python HAHap phase [--mms MMS] [--lct LCT] [--minj MINJ] [--pl PL] VCF BAM OUT

positional arguments:
VCF          VCF file with the heterozygous variants to be phased
BAM          BAM file of mapped reads
OUT          Output VCF file with the predicted haplotypes (HP tags)

optional arguments:
--mms            Minimum read mapping quality (default:0)
--lct            Threshold for low-coverage pairs (int, default:median)
--minj           Minimum number of junctions (default:4)
--pl             The likelihood of P1 and P2 (default:0.49)
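
When HAHap is called from a pipeline, the command line above can be assembled in Python. The sketch below uses only the arguments listed in the usage text. The helper names `build_phase_command` and `run_phase`, the file names, and the option values are hypothetical, and the working directory assumes the repository was cloned as shown above.

```python
import subprocess
import sys
from typing import List, Optional


def build_phase_command(
    vcf: str,
    bam: str,
    out: str,
    mms: Optional[int] = None,
    lct: Optional[int] = None,
    minj: Optional[int] = None,
    pl: Optional[float] = None,
    script: str = "HAHap",
) -> List[str]:
    """Build the argument list for `python HAHap phase ...`.

    Options left as None are omitted so that HAHap's own defaults apply.
    """
    cmd = [sys.executable, script, "phase"]
    for flag, value in (("--mms", mms), ("--lct", lct), ("--minj", minj), ("--pl", pl)):
        if value is not None:
            cmd += [flag, str(value)]
    cmd += [vcf, bam, out]
    return cmd


def run_phase(cmd: List[str], bin_dir: str = "HAHap/bin") -> None:
    """Run the command from the bin directory of the cloned repository."""
    subprocess.run(cmd, check=True, cwd=bin_dir)


# Hypothetical file names and option values, for illustration only.
cmd = build_phase_command("sample.vcf", "sample.bam", "phased.vcf", mms=20, minj=4)
print(" ".join(cmd[1:]))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file paths contain spaces.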

Data (Ashkenazim family)

The answer set used in the real-data experiment was created by taking the intersection between (1) and (2)

Authors

Yu-Yu Lin, Pei-Lung Chen, Yen-Jen Oyang and Chien-Yu Chen. National Taiwan University, Taiwan.