virenar closed this issue 3 years ago
I don't see any error, these are compiler warnings. The binary should be there to run.
Thank you for your quick response. I do see the binary, CRISP.binary, in the bin folder,
but when I try to execute the tool it keeps displaying the default help message. I know I might be doing something really foolish.
(base) ubuntu@earth:~/git/crisp/bin$ ./CRISP.binary --bam H0-0-S001_sort.bam --ref human_g1k_v37_decoy.fasta --bed target_regions.bed -p 8 --VCF test.vcf > variantcalls.log
CRISP: statistical method to identify SNVs and indels from pooled DNA sequencing data (requires multiple samples (pools), ideally >=5 samples)
Please provide a bed-file for calling variants on human samples to avoid large output files
./CRISP [options] --bams file_bam_paths --ref reference.fasta --VCF variantcalls.VCF -p poolsize > variantcalls.log
Options:
--bams textfile with list of bam file paths (one for each pool)
--bam bam file for one pool, specify file for each pool using --bam pool1.bam --bam pool2.bam .... --bam pooln.bam
--ref Indexed Reference Sequence file (fasta)
--bed bed file for list of regions in which variants should be called (format is chrom start end on each line)
-p/--poolsize <int> poolsize (number of haploid genomes in each pool), for diploid genomes: 2 x # individuals
--VCF VCF file to which the variant calls will be output
--qvoffset <int> quality value offset, 33 is default
--mbq <int> minimum base quality to consider a base for variant calling, default 10
--mmq <int> minimum read mapping quality to consider a read for variant calling, default 20
--regions region(s) in which variants will be called, e.g chr1:654432-763332. BAM files should be indexed for using this option.
--minc <int> minimum number of reads with alternate allele required for calling a variant, default 4
--ctpval <float> threshold on the contingency table p-value for calling position as variant (specified as log10), default is -3.5
--qvpval <float> threshold on the quality values based p-value for calling position as variant (specified as log10), default is -5
--perms <int> maximum number of permutations for calculating contingency table p-value, default 20000
--filterreads <0/1> filter reads with excessive number of mismatches (and gaps) compared to the reference sequence, default is 1. Set to 0 to disable filtering
--verbose <0/1/2> amount of information to output to log file, 0: no output, 1: medium (default), 2: detailed
--OPE <0/1> identify overlapping paired-end reads and treat as single read in the overlapping region, default is 1, set to 0 to disable this (can be slow for high-coverage datasets)
--refbias <float> reference allele bias for targeted sequencing data, default is 0.5, use 0.52-0.54 for Agilent SureSelect targeted sequencing experiments
--EM <0/1> 0 = old CRISP method with allele frequencies, 1 = EM algorithm will be used for estimating pooled genotypes and calling variants, default is 1
--flankingbases <int> call variants in regions that flank target regions in bed file, use 50 or 100 for targeted sequencing, default value is 0
Notes:
1. CRISP requires poolsize and reference fasta file for making variant calls
2. CRISP requires at least two pools to make variant calls, but at least 5 pools are ideal
3. The reference sequence file should be indexed using 'samtools faidx' or a similar program and placed in same directory as fasta file with extension .fai
4. The ploidy of each pool is assumed to be the same, see the FAQ file for how to specify variable ploidy for each pool
5. For human re-sequencing, if a bedfile is not specified, the program will evaluate each base for variant calling and the output log file can be huge
6. Please make sure that the reference sequence file is the same as the one used to align the reads in the BAM files and that the BAM files are coordinate sorted
7. For indel analysis, CRISP assumes that indels are left justified, --leftalign 1 option can be used to left justify gaps in aligned reads
8. the program uses the samtools API for reading bam files
9. BAM files should be indexed in order to use the --regions option with the indexed bam file as pooln.bam.bai
10. The aligned reads for each pool should be in a single bam file that is sorted by chromosomal coordinates
CRISP requires multiple samples (BAM files) for variant calling, and your command passes only one --bam. Please see the Notes in the help output and the readme.
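For illustration, here is a minimal sketch of how a multi-pool command line could be assembled, using only the flags listed in the help output above. The helper `build_crisp_cmd` is hypothetical (it is not part of CRISP), and the file names `pool2_sort.bam` and `pool3_sort.bam` are placeholders for your other pools. The helper only prints the command so you can inspect it before running it.

```shell
#!/bin/sh
# Sketch only: build a CRISP command with one --bam flag per pool.
# build_crisp_cmd is a hypothetical helper, not something CRISP ships.
# Usage: build_crisp_cmd POOLSIZE REF BED VCF BAM1 BAM2 [BAM3 ...]
build_crisp_cmd() {
    # Four fixed arguments plus at least two BAM files (Note 2 in the help).
    if [ "$#" -lt 6 ]; then
        echo "error: CRISP needs at least two pools (BAM files)" >&2
        return 1
    fi
    crisp_poolsize=$1
    crisp_ref=$2
    crisp_bed=$3
    crisp_vcf=$4
    shift 4

    crisp_cmd="./CRISP.binary"
    # Every remaining argument is a BAM file for one pool.
    for crisp_bam in "$@"; do
        crisp_cmd="$crisp_cmd --bam $crisp_bam"
    done
    crisp_cmd="$crisp_cmd --ref $crisp_ref --bed $crisp_bed -p $crisp_poolsize --VCF $crisp_vcf"
    printf '%s\n' "$crisp_cmd"
}

# The first BAM comes from the command in this thread; the other two
# names are placeholders for additional pools.
build_crisp_cmd 8 human_g1k_v37_decoy.fasta target_regions.bed test.vcf \
    H0-0-S001_sort.bam pool2_sort.bam pool3_sort.bam
```

Per the help text, the same thing can also be done with `--bams`, pointing at a text file that lists one BAM path per pool. At least five pools are described as ideal.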
I have an issue installing CRISP on Ubuntu 16.04.5 LTS. I get the following message when I execute
make all
in the main directory