NCGG-MGC / IMSindel

IMSindel: An accurate intermediate-size indel detection tool incorporating de novo assembly and gapped global-local alignment with split read analysis
https://www.nature.com/articles/s41598-018-23978-z
MIT License

Insertions and deletions (indels) have been implicated in dozens of human diseases through the radical alteration of gene function by short frameshift indels as well as long indels. However, accurately detecting these indels from next-generation sequencing data remains challenging. This is particularly true for intermediate-size indels (≥50 bp), because DNA sequencing reads are short. Here, we developed a new method that predicts intermediate-size indels using BWA soft-clipped fragments (unmatched fragments in partially mapped reads) and unmapped reads (Fig. 1).

Figure01 Shigemizu

Reference

Daichi Shigemizu, Fuyuki Miya, Shintaro Akiyama, Shujiro Okuda, Keith A Boroevich, Akihiro Fujimoto, Hidewaki Nakagawa, Kouichi Ozaki, Shumpei Niida, Yonehiro Kanemura, Nobuhiko Okamoto, Shinji Saitoh, Mitsuhiro Kato, Mami Yamasaki, Tatsuo Matsunaga, Hideki Mutai, Kenjiro Kosaki & Tatsuhiko Tsunoda, IMSindel: An accurate intermediate-size indel detection tool incorporating de novo assembly and gapped global-local alignment with split read analysis, Scientific Reports volume 8, Article number: 5608 (2018)

dependencies

usage

$ bin/imsindel --bam foo.bam --chr 1 --outd out --indelsize 10000 --reffa ref.fa
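
Since `--chr` takes a single chromosome, a whole-genome run needs one invocation per chromosome. The sketch below only prints the command lines so they can be reviewed first. It uses just the flags shown in the usage line above. The function name `imsindel_cmds`, the chromosome list, and the per-chromosome output directories (`out/chr1`, ...) are assumptions for illustration, not part of IMSindel. Chromosome names must match those in the BAM header, and paths are assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch: print one imsindel command per chromosome (hypothetical helper).

# imsindel_cmds BAM REF OUTDIR CHR...
# Each chromosome gets its own output directory (an assumption made here
# so that results from different chromosomes cannot overwrite each other).
imsindel_cmds() {
  local bam=$1 ref=$2 outd=$3
  shift 3
  local chr
  for chr in "$@"; do
    printf 'mkdir -p %s && bin/imsindel --bam %s --chr %s --outd %s --indelsize 10000 --reffa %s\n' \
      "$outd/chr$chr" "$bam" "$chr" "$outd/chr$chr" "$ref"
  done
}

# Print the commands for chromosomes 1 to 3 (example list, adjust as needed).
imsindel_cmds foo.bam ref.fa out 1 2 3
```

Once the printed commands look right, they can be executed by piping the output into `sh`.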

run on docker

build image

$ git clone https://github.com/NCGG-MGC/IMSindel.git
$ cd IMSindel
$ docker build -t imsindel .

run imsindel

$ mkdir /path/to/data
$ mv /path/to/your.bam /path/to/data/
$ samtools index /path/to/data/your.bam
$ mv /path/to/ref.fa /path/to/data/
$ docker run --rm -v /path/to/data:/data imsindel --bam /data/your.bam --chr 1 --outd /data --indelsize 10000 --reffa /data/ref.fa
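
Before starting the container, it can help to confirm that the mounted directory really holds the files prepared in the steps above. The following is a minimal sketch. The helper name `check_inputs` is hypothetical, and it assumes `samtools index` wrote its default `your.bam.bai` next to the BAM file.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check of the data directory mounted into the container.

# check_inputs DIR BAM REF
# Returns 0 when the BAM, its .bai index, and the reference FASTA all exist
# and are non-empty; otherwise reports each missing file on stderr and returns 1.
check_inputs() {
  local dir=$1 bam=$2 ref=$3 missing=0
  local f
  for f in "$dir/$bam" "$dir/$bam.bai" "$dir/$ref"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

if check_inputs /path/to/data your.bam ref.fa; then
  echo "inputs look complete"
fi
```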

options

output

no.|column|description
-----|------|-----------
1|indel_type|DEL=deletion, INS=insertion
2|call_type|Hete=heterozygous indel (0.15 < #indel_depth/#ttl_depth <= 0.7), Homo=homozygous indel (#indel_depth/#ttl_depth > 0.7)
3|chr|chromosome number
4|sttpos|indel's start position
5|endpos|indel's end position
6|indel_length|indel size
7|indel_str|indel sequence
8|#indel_depth|read count including indels
9|#ttl_depth|total read count
10|details(indelcall_indeltype_depth)|composed of four components: (1) indel_type; (2) LI=long insertion, ULI=uncomplete long insertion, LD=long deletion, B=clipped fragments on the right side of read sequences, F=clipped fragments on the left side of read sequences, SI=short indel; (3) #indel_depth; (4) clip_sttpos
11|clip_sttpos|clipped fragments' start position
12|depth(>=10)|High if #ttl_depth >= 10
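
The `call_type` thresholds above can be checked by hand from `#indel_depth` and `#ttl_depth`. The sketch below reproduces only that arithmetic and is not taken from the IMSindel source. The function name `call_type` and the `NA` label for ratios at or below 0.15 (or a zero total depth) are assumptions, since the table does not say how such sites are reported.

```shell
#!/usr/bin/env bash
# Sketch: classify a call from its depths using the thresholds in the table.

# call_type INDEL_DEPTH TOTAL_DEPTH
# Ratios are compared with integer arithmetic to avoid floating-point
# edge cases at exactly 0.7 and 0.15:
#   i/t > 0.7   <=>  i*10 > t*7
#   i/t > 0.15  <=>  i*20 > t*3
call_type() {
  awk -v i="$1" -v t="$2" 'BEGIN {
    if (t <= 0)             print "NA"    # no coverage (hypothetical label)
    else if (i * 10 > t * 7) print "Homo"  # ratio > 0.7
    else if (i * 20 > t * 3) print "Hete"  # 0.15 < ratio <= 0.7
    else                     print "NA"    # ratio <= 0.15 (hypothetical label)
  }'
}

call_type 30 100
call_type 80 100
```

With these inputs, the first call falls in the heterozygous range (ratio 0.3) and the second in the homozygous range (ratio 0.8).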

update