KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

IndexError: index out of range when annotating vcf file #230

Open MarvinDo opened 3 months ago

MarvinDo commented 3 months ago

Hello everyone,

I am using open-cravat to annotate VCF files with fathmm_xf, vest4 and chasmplus scores. Unfortunately, I get an "IndexError" for one specific variant. I would appreciate any help in solving this.

Here is my input vcf file:

##fileformat=VCFv4.2
##fileDate=2024-03-26
##reference=GRCh38
##contig=<ID=chr17>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
chr17   40628784    .   CTTATTCT    C   .   .   .
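
For anyone who wants to reproduce this, here is a minimal sketch that writes the same single-variant VCF with tab-separated columns (the VCF format requires tabs, which may not survive copy and paste) and prints the `oc run` command taken from the log in this report. The file name `repro.vcf` and the helper names `write_vcf` and `build_command` are placeholders chosen for illustration; they are not part of open-cravat.

```python
import shlex

# Header lines copied from the report above.
HEADER = [
    "##fileformat=VCFv4.2",
    "##fileDate=2024-03-26",
    "##reference=GRCh38",
    "##contig=<ID=chr17>",
]
COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]
# The one variant that triggers the IndexError in the report.
RECORD = ["chr17", "40628784", ".", "CTTATTCT", "C", ".", ".", "."]


def write_vcf(path):
    """Write the single-variant VCF with real tab separators."""
    with open(path, "w") as fh:
        for line in HEADER:
            fh.write(line + "\n")
        fh.write("\t".join(COLUMNS) + "\n")
        fh.write("\t".join(RECORD) + "\n")


def build_command(path):
    """Return the command line from the log, pointed at `path`."""
    return ["oc", "run", path, "--cleanrun", "-l", "hg38",
            "-a", "fathmm_xf", "vest", "chasmplus"]


if __name__ == "__main__":
    write_vcf("repro.vcf")  # placeholder file name
    # Print the command instead of running it, so the script has no side
    # effects beyond writing the VCF.
    print(shlex.join(build_command("repro.vcf")))
```

Running the printed command against the written file should exercise the same mapper code path, assuming the same open-cravat and hg38 mapper versions as listed in the log.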

Here is my log that also contains the error message:

cravat               /path/to/.venv/bin/oc run /tmp/798d503f-0886-4657-a006-db817ea70d42_18462.vcf --cleanrun -l hg38 -a fathmm_xf vest chasmplus
2024/03/26 13:09:09 cravat               started: Tue Mar 26 13:09:09 2024
2024/03/26 13:09:09 cravat               input files: /tmp/798d503f-0886-4657-a006-db817ea70d42_18462.vcf
2024/03/26 13:09:09 cravat               input assembly: hg38
2024/03/26 13:09:09 cravat               version: open-cravat 2.3.0 /path/to/.venv/lib/python3.8/site-packages/cravat
2024/03/26 13:09:09 cravat               version: fathmm_xf 1.0.2 /path/to/.venv/lib/python3.8/site-packages/cravat/modules/annotators/fathmm_xf
2024/03/26 13:09:09 cravat               version: vest 4.4.0 /path/to/.venv/lib/python3.8/site-packages/cravat/modules/annotators/vest
2024/03/26 13:09:09 cravat               version: chasmplus 1.3.0 /path/to/.venv/lib/python3.8/site-packages/cravat/modules/annotators/chasmplus
2024/03/26 13:09:09 cravat               version: hg38 1.10.4 /path/to/.venv/lib/python3.8/site-packages/cravat/modules/mappers/hg38
2024/03/26 13:09:09 cravat.converter     started: Tue Mar 26 13:09:09 2024
2024/03/26 13:09:09 cravat.converter     Input file(s): /tmp/798d503f-0886-4657-a006-db817ea70d42_18462.vcf
2024/03/26 13:09:10 cravat.converter     input format: vcf
2024/03/26 13:09:10 cravat.converter     error lines: 0
2024/03/26 13:09:10 cravat.converter     finished: Tue Mar 26 13:09:10 2024
2024/03/26 13:09:10 cravat.converter     num input lines: 1
2024/03/26 13:09:10 cravat.converter     runtime: 0.148
2024/03/26 13:09:10 cravat               num_workers: 39
2024/03/26 13:09:10 cravat               input line chunksize=1 total number of input lines=1 number of chunks=2
2024/03/26 13:09:10 cravat.mapper        input file: /tmp/798d503f-0886-4657-a006-db817ea70d42_18462.vcf.crv
2024/03/26 13:09:10 cravat.mapper        input file: /tmp/798d503f-0886-4657-a006-db817ea70d42_18462.vcf.crv
2024/03/26 13:09:11 cravat.mapper        mapper database: /path/to/.venv/lib/python3.8/site-packages/cravat/modules/mappers/hg38/data/gene_33_10000.sqlite
2024/03/26 13:09:11 cravat.mapper        mapper database: /path/to/.venv/lib/python3.8/site-packages/cravat/modules/mappers/hg38/data/gene_33_10000.sqlite
2024/03/26 13:09:12 cravat.mapper        started: Tue Mar 26 13:09:12 2024 | 0
2024/03/26 13:09:12 cravat.mapper        Traceback (most recent call last):
  File "/path/to/.venv/lib/python3.8/site-packages/cravat/base_mapper.py", line 307, in run_as_slave
    crx_data = self.map(crv_data)
  File "/path/to/.venv/lib/python3.8/site-packages/cravat/modules/mappers/hg38/hg38.py", line 1082, in map
    so, achange, cchange, coding = self._get_del_map_data(
  File "/path/to/.venv/lib/python3.8/site-packages/cravat/modules/mappers/hg38/hg38.py", line 1979, in _get_del_map_data
    so, achange = self._get_del_cds_cds_data(
  File "/path/to/.venv/lib/python3.8/site-packages/cravat/modules/mappers/hg38/hg38.py", line 4506, in _get_del_cds_cds_data
    new_ref_codon = ref_codon_start[:2] + _get_bases_tpos(tid, tpos_end + 1)
  File "/path/to/.venv/lib/python3.8/site-packages/cravat/modules/mappers/hg38/hg38.py", line 4787, in _get_bases_tpos
    basebits = (seq[seqbyteno] >> (6 - seqbitno)) & 0b00000011
IndexError: index out of range
2024/03/26 13:09:12 cravat.mapper        finished: Tue Mar 26 13:09:12 2024 | 0
2024/03/26 13:09:12 cravat.mapper        runtime:  0.014
2024/03/26 13:09:12 cravat.mapper        started: Tue Mar 26 13:09:12 2024 | 1505
2024/03/26 13:09:12 cravat.mapper        finished: Tue Mar 26 13:09:12 2024 | 1505
2024/03/26 13:09:12 cravat.mapper        runtime:  0.001
2024/03/26 13:09:12 cravat               num_workers: 39
2024/03/26 13:09:13 cravat.fathmm_xf     started: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.fathmm_xf     finished: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.fathmm_xf     runtime: 0.004s
2024/03/26 13:09:13 cravat.fathmm_xf     started: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.fathmm_xf     finished: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.fathmm_xf     runtime: 0.002s
2024/03/26 13:09:13 cravat.vest          started: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.chasmplus     started: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.vest          finished: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.vest          runtime: 0.097s
2024/03/26 13:09:13 cravat.vest          started: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.chasmplus     finished: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.chasmplus     runtime: 0.096s
2024/03/26 13:09:13 cravat.chasmplus     started: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.vest          finished: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.vest          runtime: 0.014s
2024/03/26 13:09:13 cravat.chasmplus     finished: Tue Mar 26 13:09:13 2024
2024/03/26 13:09:13 cravat.chasmplus     runtime: 0.032s
2024/03/26 13:09:14 cravat.aggregator    level: variant
2024/03/26 13:09:14 cravat.aggregator    input directory: /tmp
2024/03/26 13:09:14 cravat.aggregator    started: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    finished: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    runtime: 0.003
2024/03/26 13:09:14 cravat.aggregator    level: gene
2024/03/26 13:09:14 cravat.aggregator    input directory: /tmp
2024/03/26 13:09:14 cravat.aggregator    started: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    finished: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    runtime: 0.001
2024/03/26 13:09:14 cravat.aggregator    level: sample
2024/03/26 13:09:14 cravat.aggregator    input directory: /tmp
2024/03/26 13:09:14 cravat.aggregator    started: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    finished: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    runtime: 0.002
2024/03/26 13:09:14 cravat.aggregator    level: mapping
2024/03/26 13:09:14 cravat.aggregator    input directory: /tmp
2024/03/26 13:09:14 cravat.aggregator    started: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    finished: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.aggregator    runtime: 0.002
2024/03/26 13:09:14 cravat.tagsampler    started: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.tagsampler    finished: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.tagsampler    runtime: 0.003
2024/03/26 13:09:14 cravat.vcfinfo       started: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.vcfinfo       finished: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat.vcfinfo       runtime: 0.005
2024/03/26 13:09:14 cravat               finished: Tue Mar 26 13:09:14 2024
2024/03/26 13:09:14 cravat               runtime: 5.429s

kmoad commented 3 months ago

We see the same issue. We are working on it and will get back to you once we know more.
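
For readers following the traceback: the failing line in `_get_bases_tpos` shifts and masks a byte, `(seq[seqbyteno] >> (6 - seqbitno)) & 0b00000011`, which looks like a read from a sequence packed at 2 bits per base. The sketch below is not the mapper's code. It only illustrates how such a read can raise `IndexError` when the requested position (for example `tpos_end + 1` for a deletion that reaches the end of the stored sequence) lies past the last byte. The base-to-bits table, the 1-based position convention, and the helpers `pack`, `get_base` and `get_base_checked` are all assumptions made for illustration.

```python
# Hypothetical 2-bit encoding; the real mapper's table is not shown in this thread.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def pack(bases):
    """Pack a string of A/C/G/T into bytes, four bases per byte, first base in the high bits."""
    out = bytearray((len(bases) + 3) // 4)
    for i, base in enumerate(bases):
        out[i // 4] |= BASE_TO_BITS[base] << (6 - 2 * (i % 4))
    return bytes(out)


def get_base(seq, tpos):
    """Read the base at 1-based position tpos, with no bounds check."""
    seqbyteno = (tpos - 1) // 4
    seqbitno = ((tpos - 1) % 4) * 2
    # Same shape as the line in the traceback; raises IndexError past the last byte.
    basebits = (seq[seqbyteno] >> (6 - seqbitno)) & 0b00000011
    return BITS_TO_BASE[basebits]


def get_base_checked(seq, n_bases, tpos):
    """Like get_base, but return None when tpos is outside 1..n_bases."""
    if not 1 <= tpos <= n_bases:
        return None
    return get_base(seq, tpos)


packed = pack("ACGTAC")                      # 6 bases stored in 2 bytes
print(get_base(packed, 6))                   # last real base
print(get_base_checked(packed, 6, 7))        # one past the end: None, no exception
try:
    get_base(packed, 9)                      # byte index 2 does not exist
except IndexError as exc:
    print("IndexError:", exc)
```

Note that in this sketch an unchecked read at position 7 would not fail at all: it silently returns a base decoded from the padding bits. A guard based on the true sequence length, as in `get_base_checked`, covers both cases. Whether the actual fix takes this form is up to the maintainers.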