bioinformatics-centre / BayesTyper

A method for variant graph genotyping based on exact alignment of k-mers

Error during combine of Manta SV sites #26

Open jjfarrell opened 4 years ago

jjfarrell commented 4 years ago

When running combine, the following error occurs. Any suggestions?

[09/03/2020 23:18:18] You are using BayesTyperTools (v1.5)

[09/03/2020 23:18:18] Running BayesTyperTools (v1.5) combine on 2 files ...

[09/03/2020 23:19:28] Finished chromosome chr1
[09/03/2020 23:20:43] Finished chromosome chr2
[09/03/2020 23:21:46] Finished chromosome chr3
[09/03/2020 23:22:46] Finished chromosome chr4
[09/03/2020 23:23:43] Finished chromosome chr5
[09/03/2020 23:24:38] Finished chromosome chr6
[09/03/2020 23:25:32] Finished chromosome chr7
[09/03/2020 23:26:20] Finished chromosome chr8
[09/03/2020 23:27:03] Finished chromosome chr9
[09/03/2020 23:27:46] Finished chromosome chr10
[09/03/2020 23:28:30] Finished chromosome chr11
[09/03/2020 23:29:10] Finished chromosome chr12
[09/03/2020 23:29:40] Finished chromosome chr13
[09/03/2020 23:30:08] Finished chromosome chr14
[09/03/2020 23:30:36] Finished chromosome chr15
[09/03/2020 23:31:07] Finished chromosome chr16
[09/03/2020 23:31:36] Finished chromosome chr17
[09/03/2020 23:32:03] Finished chromosome chr18
[09/03/2020 23:32:25] Finished chromosome chr19
[09/03/2020 23:32:44] Finished chromosome chr20
[09/03/2020 23:32:56] Finished chromosome chr21
bayesTyperTools: /isdata/kroghgrp/jasi/bayesTyper/code/releases/v1.5_static/BayesTyper-1.5/src/vcf++/VcfFile.cpp:146: void VcfFileReader::updateVariantLine(): Assertion `getline(vcf_infile_fstream, cur_var_line.at(7), '\n')' failed.
./run_combine.sh: line 5: 25393 Aborted                 ../bin/bayesTyperTools combine -o SNP_dbSNP150common_SV_1000g_dbSNP150all_GDK_GoNL_GTEx_GRCh38.$1 -z -v $VCF_LIST,$PRIOR_VCF

jonassibbesen commented 4 years ago

This seems to be an issue with parsing the VCF file. I do not expect the prior to be the cause, since we have used it many times before without any problems. One possible cause of this error is that not all lines in your VCF have the same number of columns, or that the file does not end with a newline character. If neither is the case, would it be possible for you to share the file with me?
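
For anyone who wants to rule out these two causes quickly, here is a minimal Python sketch. It is not part of BayesTyper; the function name `check_vcf` is hypothetical, and it assumes a tab-delimited VCF whose `#CHROM` header line defines the expected number of columns. It reports lines with a different column count and a missing final newline.

```python
import gzip


def check_vcf(path):
    """Return a list of problems found in a plain or gzip-compressed VCF file."""
    # Assumption: compressed files are recognisable by a ".gz" suffix.
    opener = gzip.open if path.endswith(".gz") else open
    problems = []
    expected = None   # column count taken from the #CHROM header line
    last_line = None
    with opener(path, "rt") as handle:
        for number, line in enumerate(handle, start=1):
            last_line = line
            if line.startswith("##"):
                continue  # meta-information lines are not tab-delimited records
            n_columns = len(line.rstrip("\n").split("\t"))
            if line.startswith("#CHROM"):
                expected = n_columns
            elif expected is not None and n_columns != expected:
                problems.append(
                    f"line {number}: {n_columns} columns, expected {expected}"
                )
    if last_line is not None and not last_line.endswith("\n"):
        problems.append("file does not end with a newline character")
    return problems
```

An empty list means neither problem was found; the file may of course still fail for other reasons.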

jjfarrell commented 4 years ago

@jonassibbesen I have found the one variant that is triggering this error and created a VCF that I can share. Let me know how to get it to you.

iprada commented 4 years ago

Hi @jjfarrell. I have run into a similar issue. Could you share the source of the error so that I can handle it? Thanks!

jjfarrell commented 3 years ago

@iprada and @jonassibbesen I eventually found the source of the error. I typically use bgzip to compress VCF files, which allows them to be indexed. The candidate VCF input to BayesTyper, however, must be compressed with plain gzip. Once I made this change, I never got the error again. There must be a slight difference in the compression that BayesTyper cannot handle. It would be nice to have BayesTyper read bgzip files to allow indexing with tabix.
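
A rough Python sketch of that workaround, in case it helps others: it checks whether a file looks like bgzip output and, if so, rewrites it as plain gzip. The function names `is_bgzf` and `recompress_as_gzip` are hypothetical and nothing here comes from BayesTyper itself. The detection assumes the BGZF layout described in the SAM/BAM specification, where each block is a gzip member whose extra field carries a `BC` subfield.

```python
import gzip
import shutil
import struct


def is_bgzf(path):
    """Return True if the first gzip member carries the BGZF 'BC' extra subfield."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        # gzip magic bytes, deflate method, and the FEXTRA flag (0x04) must be set
        if len(header) < 12 or header[:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
            return False
        xlen = struct.unpack("<H", header[10:12])[0]
        extra = handle.read(xlen)
    offset = 0
    while offset + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, offset)
        if (si1, si2) == (ord("B"), ord("C")) and slen == 2:
            return True
        offset += 4 + slen
    return False


def recompress_as_gzip(src, dst):
    """Decompress src (bgzip or gzip) and write it to dst as a single gzip stream."""
    # Python's gzip module reads multi-member files, so bgzip input is accepted.
    with gzip.open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```

Usage would be along the lines of `if is_bgzf("candidates.vcf.gz"): recompress_as_gzip("candidates.vcf.gz", "candidates.gzip.vcf.gz")`, with both file names being placeholders. Keeping the bgzip copy alongside preserves the tabix index for other tools.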

jonassibbesen commented 3 years ago

Glad to hear you found the issue. Yeah, it is a known issue that bgzip does not work with the boost compression library that we use. It should be possible to support bgzip using htslib. However, I unfortunately do not have time to work on this in the near future.