I'm trying to run CRISP for a subset of known polymorphic sites using the --bed option, and while the programme seems to accept all input files without issues, for some reason it's not being able to call any variant at all.
These are the first four lines of the tab-separated bed-file:
2L 73868 738692L 86449 864502L 462386 4623872L 466451 466452
reading fasta index file /share/rdata/ramon.pouso/reference/indexed_reference/dmel-all-chromosome-r6.14.fasta.fai ... fasta file /share/rdata/ramon.pouso/reference/indexed_reference/dmel-all-chromosome-r6.14.fasta has 1870 chromosomes/contigs
read 8040 intervals from bedfile /share/rdata/ramon.pouso/counts/sites_missing_from_gen140_fixed_in_gen140.autosomes.bed
poolsize for each sample is 160
finished reading bam file indexes
reading chromosome 2L offset 189 # targeted bases on chrom is 2448/23513712
.....processed 1000000 reads QSIZE:7774 2L:39863:39964 variants called 0
.....processed 2000000 reads QSIZE:7522 2L:77681:77782 variants called 0
.....processed 3000000 reads QSIZE:7067 2L:110616:110717 variants called 0
.....processed 4000000 reads QSIZE:6988 2L:144446:144547 variants called 0
.....processed 5000000 reads QSIZE:7783 2L:178149:178250 variants called 0
.....processed 6000000 reads QSIZE:6303 2L:214691:214792 variants called 0
.....processed 7000000 reads QSIZE:7322 2L:249779:249880 variants called 0
.....processed 8000000 reads QSIZE:6634 2L:285603:285704 variants called 0
.....processed 9000000 reads QSIZE:5725 2L:320753:320854 variants called 0
.....processed 10000000 reads QSIZE:7135 2L:368829:368930 variants called 0
Why is it not working? The target sites have been confirmed to be SNPs using CRISP in a previous whole-genome run with fewer samples.
I've tried several things (such as removing the optional columns from the .bed file, using larger intervals, I've even added .bai files for all .bam files just in case), but nothing seems to help.
The code included a filter to not output variants in low coverage regions when using bed files. This has been disabled in the latest commit. Please let me know if the problem is not resolved.
Hi,
I'm trying to run CRISP for a subset of known polymorphic sites using the --bed option, and while the programme seems to accept all input files without issues, for some reason it's not being able to call any variant at all.
These are the first four lines of the tab-separated bed-file:
2L 73868 73869
2L 86449 86450
2L 462386 462387
2L 466451 466452
And these are the first few lines of the .log:
CRISP options: QVoffset 33 min_base_quality 13 min_mapping_score 20 max_permutations 20000 poolsize 160 CT-pvalue-thresh -3.5 QV-pvalue-thresh -5.0
reading fasta index file /share/rdata/ramon.pouso/reference/indexed_reference/dmel-all-chromosome-r6.14.fasta.fai ... fasta file /share/rdata/ramon.pouso/reference/indexed_reference/dmel-all-chromosome-r6.14.fasta has 1870 chromosomes/contigs
read 8040 intervals from bedfile /share/rdata/ramon.pouso/counts/sites_missing_from_gen140_fixed_in_gen140.autosomes.bed
processing 44 bamfiles: /share/rdata/ramon.pouso/POOLS/gen0-40/gen0-40/4separate/multi_autosomic/BT20_new_gen20_bwa_iso_rmdup_mq30_rg_multirealigned_autosomic.bam ..... /share/rdata/ramon.pouso/POOLS/gen140/gen140/5processed/autosomic_isolated/sample19_gen140_bwa_iso_rmdup_mq30_rg_multirealigned_autosomic.bam
poolsize for each sample is 160 finished reading bam file indexes reading chromosome 2L offset 189 # targeted bases on chrom is 2448/23513712 .....processed 1000000 reads QSIZE:7774 2L:39863:39964 variants called 0 .....processed 2000000 reads QSIZE:7522 2L:77681:77782 variants called 0 .....processed 3000000 reads QSIZE:7067 2L:110616:110717 variants called 0 .....processed 4000000 reads QSIZE:6988 2L:144446:144547 variants called 0 .....processed 5000000 reads QSIZE:7783 2L:178149:178250 variants called 0 .....processed 6000000 reads QSIZE:6303 2L:214691:214792 variants called 0 .....processed 7000000 reads QSIZE:7322 2L:249779:249880 variants called 0 .....processed 8000000 reads QSIZE:6634 2L:285603:285704 variants called 0 .....processed 9000000 reads QSIZE:5725 2L:320753:320854 variants called 0 .....processed 10000000 reads QSIZE:7135 2L:368829:368930 variants called 0
Why is it not working? The target sites have been confirmed to be SNPs using CRISP in a previous whole-genome run with fewer samples.
I've tried several things (such as removing the optional columns from the .bed file, using larger intervals, I've even added .bai files for all .bam files just in case), but nothing seems to help.
Thank you very much in advance.
Best,
Daniel.