karkinos is a tumor genotyper that detects single nucleotide variation (SNV) and copy number variation (CNV) and calculates tumor cellularity from tumor-normal paired sequencing data.
Accurate CNV calling is achieved using continuous wavelet analysis and a multi-state HMM, while SNV calls are adjusted for tumor cellularity and filtered by a heuristic filtering algorithm and Fisher's exact test. In addition, noise calls in low-depth regions are removed using the EM algorithm.
Copyright (C) 2014 Hiroki Ueda, RCAST, the University of Tokyo
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
$ git clone https://github.com/genome-rcast/karkinos.git
$ cd karkinos
$ ./gradlew uberjar
You do not need to install Gradle yourself; the bundled ./gradlew wrapper script takes care of it.
karkinos-standalone-X.Y.Z-SNAPSHOT.jar is created in the ./build/libs/ directory.
karkinos.property
The dbSNP file is expected in the following format, e.g.:
585 chr1 10468 10469 rs117577454 0 + C C C/G genomic single by-1000genomes 0 0 unknown exact 1 1 1000GENOMES, 2 G,C, 18.000000,102.000000, 0.150000,0.850000,
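The record above can be checked with standard command-line tools. The following is a minimal sketch, assuming the file is whitespace-separated and that the first columns are bin, chromosome, start, end, and rs ID, as the example suggests. Reading column 10 as the observed alleles is an assumption drawn from the example line, not something stated by karkinos.

```shell
# One record in the layout shown above (whitespace-separated).
line='585 chr1 10468 10469 rs117577454 0 + C C C/G genomic single by-1000genomes 0 0 unknown exact 1 1 1000GENOMES, 2 G,C, 18.000000,102.000000, 0.150000,0.850000,'

# Print chromosome, start, end, rs ID (columns 2-5) and column 10,
# which is assumed here to hold the observed alleles.
summary=$(printf '%s\n' "$line" | awk '{ print $2, $3, $4, $5, $10 }')
echo "$summary"
```

To apply the same check to a whole file, pass the file name to awk in place of the single line.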
The current version of karkinos supports only one subcommand, analysis.
This subcommand piles up reads and then analyzes SNVs, CNVs, and tumor purity.
Usage: java -jar karkinos.jar analysis -n <arg> -t <arg> -r <arg> -snp <arg>
       -ct <arg> -o <arg> -id <arg> [-prop <arg>] [-mp <arg>] [-g1000 <arg>]
       [-cosmic <arg>] [-g1000freq <arg>] [-chr <arg>] [-rs <arg>]
       [-rg <arg>] [-exonSNP <arg>] [-nopdf]
-n,--normalBam <arg>                normal BAM file
-t,--tumorBam <arg>                 tumor BAM file
-r,--reference <arg>                reference genome file of 2bit format
-snp,--dbSNP <arg>                  dbSNP list (e.g. bin, chr, start, end)
-ct,--captureTarget <arg>           BED file of capture target regions
-o,--outdir <arg>                   output directory
-id,--uniqueid <arg>                unique id for this sample
-prop,--property <arg>              karkinos.property file
-mp,--mappability <arg>             (optional) Big Wig format file of mappability from UCSC
-g1000,--1000genome <arg>           (optional) 1000 genome list (e.g. chr, pos, ref, alt, freq, id)
-cosmic,--cosmicSNV <arg>           VCF format file of COSMIC's SNV
-g1000freq,--1000genomefreq <arg>   (optional) threshold of 1000 genome frequency
-chr,--chrom <arg>                  chromosome name
-rs,--readsStats <arg>              (optional) reads stats files (normal and tumor)
-rg,--refFlatGenes <arg>            (optional) gene references
-exonSNP,--exonSNP <arg>            (optional) additional exon SNPs
-nopdf,--nopdf                      if you don't need a graphical summary PDF
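
As an illustration of the usage above, the sketch below assembles a call that passes only the arguments not shown in brackets. All file names, the sample id, and the variable names are hypothetical placeholders. The jar name follows the Usage line; a locally built jar would instead be at ./build/libs/karkinos-standalone-X.Y.Z-SNAPSHOT.jar.

```shell
# Hypothetical inputs; replace every value with your own files.
JAR=karkinos.jar
NORMAL_BAM=normal.bam
TUMOR_BAM=tumor.bam
REFERENCE=reference.2bit
DBSNP=dbsnp.txt
TARGETS=capture_targets.bed
OUTDIR=./karkinos_out
SAMPLE_ID=sample01

# Build the argument list from the non-bracketed options in the usage line.
set -- java -jar "$JAR" analysis \
  -n "$NORMAL_BAM" -t "$TUMOR_BAM" -r "$REFERENCE" \
  -snp "$DBSNP" -ct "$TARGETS" -o "$OUTDIR" -id "$SAMPLE_ID"

# Print the command for review. Run "$@" instead once the paths are real.
cmd_line="$*"
echo "$cmd_line"
```

Optional arguments such as -prop or -nopdf can be appended to the same list in the same way.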