Illumina / strelka

Strelka2 germline and somatic small variant caller
GNU General Public License v3.0

Failing to index VCF file #229

Open acboulet opened 1 year ago

acboulet commented 1 year ago

I'm running Strelka to call variants in a plant genome and get an error when Strelka tries to index the final VCF file. This isn't surprising, since the BAM indexes already require the CSI format because of the plant genome's long chromosomes. However, I'm unsure how to configure Strelka to handle this. Is there a way to change the tabix indexing, or even skip this step?

Below I've included a section of my log, with some directory information replaced.

Any advice on how to handle the VCF indexing would be greatly appreciated.

[2023-05-10T00:19:07.181336Z] [<host>] [280703_1] [WorkflowRunner] [ERROR] Failed to complete sub-workflow task: 'CallGenome' launched from master workflow, failed sub-workflow classname: 'CallWorkflow'
[2023-05-10T00:19:07.183459Z] [<host>] [280703_1] [WorkflowRunner] [ERROR] Failed to complete command task: 'CallGenome+gVCF_S1_index_vcf' launched from sub-workflow 'CallGenome', error code: 1, command: '<installdir>/strelka/2.9.10/libexec/tabix -p vcf <rundir>/results/variants/genome.S1.vcf.gz'
[2023-05-10T00:19:07.185585Z] [<host>] [280703_1] [WorkflowRunner] [ERROR] [CallGenome+gVCF_S1_index_vcf] Error Message:
[2023-05-10T00:19:07.187654Z] [<host>] [280703_1] [WorkflowRunner] [ERROR] [CallGenome+gVCF_S1_index_vcf] Last 2 stderr lines from task (of 2 total lines):
[2023-05-10T00:19:07.187654Z] [<host>] [280703_1] [WorkflowRunner] [ERROR] [2023-05-10T00:19:01.812261Z] [<host>] [280703_1] [CallGenome+gVCF_S1_index_vcf] [E::hts_idx_push] Region 536870892..536870960 cannot be stored in a tbi index. Try using a csi index with min_shift = 14, n_lvls >= 6
[2023-05-10T00:19:07.187654Z] [<host>] [280703_1] [WorkflowRunner] [ERROR] [2023-05-10T00:19:01.813675Z] [<host>] [280703_1] [CallGenome+gVCF_S1_index_vcf] tbx_index_build failed: <rundir>/results/variants/genome.S1.vcf.gz
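
The htslib message in the log points at the cause: a TBI index can only address positions up to about 2^29 (536,870,912), and the failing region 536870892..536870960 crosses that boundary. As a minimal sketch of a possible manual workaround, and not a Strelka option, the snippet below checks whether any contig exceeds that limit and builds a tabix command that requests a CSI index through the `-C` flag. It assumes a reasonably recent htslib `tabix` on the PATH (the copy bundled under Strelka's `libexec` may not support `-C`); the contig names and lengths are hypothetical, and the VCF path is the placeholder from the log.

```python
import shlex

# Approximate largest coordinate a TBI index can address: 2^29 bases.
TBI_MAX_POSITION = 2 ** 29  # 536,870,912


def needs_csi(contig_lengths):
    """Return True if any contig is longer than a TBI index can address."""
    return any(length > TBI_MAX_POSITION for length in contig_lengths.values())


def build_index_command(vcf_path, use_csi, tabix="tabix"):
    """Build a tabix command line; -C requests a CSI index instead of TBI."""
    cmd = [tabix, "-p", "vcf"]
    if use_csi:
        cmd.append("-C")
    cmd.append(vcf_path)
    return cmd


# Hypothetical contig lengths, for illustration only.
contigs = {"chr1A": 594_102_056, "chr1B": 400_000_000}

cmd = build_index_command(
    "results/variants/genome.S1.vcf.gz",  # placeholder path taken from the log
    use_csi=needs_csi(contigs),
)
print(shlex.join(cmd))
```

The printed command could be run by hand on the bgzipped VCF that Strelka leaves behind. This only indexes the existing file; whether the Strelka workflow itself can be told to use CSI or to skip the indexing task is still the open question here.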