Open nh13 opened 4 years ago
hmm. yeah. this needs to be addressed here: https://github.com/brentp/cyvcf2/blob/2dd88453ff4ec619311f860a2df4651f8b2ccf04/cyvcf2/cyvcf2.pyx#L572-L577
htslib returns -1 from bcf_read in this case, and prints a message. cyvcf2 currently just stops iteration.
@brentp how about a param to `cyvcf2` to either 'stop', 'keep going', or 'throw an exception'? Similar to picard's strict, lenient, or silent.
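To make the three options concrete, here is a rough pure-Python sketch. The name `iter_records`, the `on_error` parameter and its values are all hypothetical and not part of cyvcf2; the reader is modelled as a plain iterable of `(record, errcode)` pairs where a non-zero `errcode` means the record failed to parse.

```python
def iter_records(pairs, on_error="raise"):
    """Yield records from (record, errcode) pairs under a chosen error policy.

    Hypothetical sketch only: `on_error` is not an existing cyvcf2 option.
      "stop"  -> end iteration quietly at the first bad record
      "skip"  -> drop the bad record and keep going
      "raise" -> raise an exception at the first bad record
    """
    if on_error not in ("stop", "skip", "raise"):
        raise ValueError(f"unknown on_error policy: {on_error!r}")
    for record, errcode in pairs:
        if errcode == 0:
            yield record
        elif on_error == "stop":
            return
        elif on_error == "skip":
            continue
        else:
            raise ValueError(f"failed to parse record (errcode={errcode})")


# Made-up input: the second record is flagged as bad.
pairs = [("rec1", 0), ("rec2", 64), ("rec3", 0)]
stopped = list(iter_records(pairs, on_error="stop"))
skipped = list(iter_records(pairs, on_error="skip"))
```

With "stop" the caller cannot tell a parse failure from a normal end of file, which is the behaviour being questioned in this issue.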
So it looks like the way to handle this in cyvcf2 is to check the errcode which gets set when there is a problem.
It is set to one of these possible values AFAICT: https://github.com/samtools/htslib/blob/4162046b28a7d9d8a104ce28086d9467cc05c212/htslib/vcf.h#L193-L199
I think we can ignore invalid contig and raise exception on others.
i think it's simpler for the user if it fails on a < 0 error except those related to undefined contigs.
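A minimal sketch of that rule in plain Python, assuming the error bits have the values I recall from the linked `vcf.h` (please check them against the header). The helper names `fatal_errors` and `check_errcode` are made up for illustration, and treating only `BCF_ERR_CTG_UNDEF` as ignorable is my reading of "undefined contigs", not something confirmed in this thread.

```python
# Bit flags as I recall them from htslib's vcf.h; verify against the linked header.
BCF_ERR_CTG_UNDEF = 1
BCF_ERR_TAG_UNDEF = 2
BCF_ERR_NCOLS = 4
BCF_ERR_LIMITS = 8
BCF_ERR_CHAR = 16
BCF_ERR_CTG_INVALID = 32
BCF_ERR_TAG_INVALID = 64

BCF_ERR_NAMES = {
    BCF_ERR_CTG_UNDEF: "BCF_ERR_CTG_UNDEF",
    BCF_ERR_TAG_UNDEF: "BCF_ERR_TAG_UNDEF",
    BCF_ERR_NCOLS: "BCF_ERR_NCOLS",
    BCF_ERR_LIMITS: "BCF_ERR_LIMITS",
    BCF_ERR_CHAR: "BCF_ERR_CHAR",
    BCF_ERR_CTG_INVALID: "BCF_ERR_CTG_INVALID",
    BCF_ERR_TAG_INVALID: "BCF_ERR_TAG_INVALID",
}


def fatal_errors(errcode, ignorable=BCF_ERR_CTG_UNDEF):
    """Return the names of the error bits set in errcode, minus the ignorable ones."""
    remaining = errcode & ~ignorable
    return [name for bit, name in BCF_ERR_NAMES.items() if remaining & bit]


def check_errcode(errcode, ignorable=BCF_ERR_CTG_UNDEF):
    """Raise if any non-ignorable error bit is set (hypothetical helper)."""
    names = fatal_errors(errcode, ignorable)
    if names:
        raise ValueError("error parsing variant: " + ", ".join(names))
```

Because the flags are a bitmask, a record can carry an undefined-contig bit and a real error at the same time, so masking out the ignorable bits is safer than comparing `errcode` against a single value.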
closed in #162
Should this be closed here?
@nh13 can you add a very simple fail.vcf to the tests and check that it fails correctly?
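For anyone picking this up, a possible starting point for such a fixture is sketched below. The VCF content is a hypothetical stand-in, not the reporter's actual file: it declares `PS` as `Integer` and then gives it a string value. `write_fail_vcf` and `count_parsed_records` are made-up helper names; `VCF` is the usual cyvcf2 reader, and what it should do on this input (raise, or which exception) is exactly what the test would need to pin down.

```python
from pathlib import Path

# Hypothetical minimal fixture: PS is declared Integer but holds a string.
FAIL_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1>",
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    '##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set">',
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL",
               "FILTER", "INFO", "FORMAT", "SAMPLE"]),
    "\t".join(["chr1", "100", ".", "A", "G", "50",
               "PASS", ".", "GT:PS", "0|1:not_an_int"]),
]) + "\n"


def write_fail_vcf(path):
    """Write the fixture to `path` and return the path as a string."""
    Path(path).write_text(FAIL_VCF)
    return str(path)


def count_parsed_records(path):
    """Count the records cyvcf2 yields for `path`."""
    # Imported lazily so the fixture can be built without cyvcf2 installed.
    from cyvcf2 import VCF
    return sum(1 for _ in VCF(path))
```

A test could then call `count_parsed_records(write_fail_vcf(tmp_path / "fail.vcf"))` and assert that it raises rather than silently returning 0.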
@dbolser would you like to reopen this issue, and assign it to me?
Hi @nh13, thanks for getting back to me on this.
I'm not actually a project admin, I just like opensource software community development, so I don't feel shy bossing people about ;-)
@brentp, can you reopen this issue and assign it to @nh13 please?
@nh13, please let us know if you have any problem figuring out how to add the test into the existing framework and how to submit a PR.
Cheers both!
Thanks, I’m unlikely to get to this anytime soon, so if others want to add the test VCF before I do, I won’t be offended.
If the `PS` format field has type `Integer` but the values are strings, then `cyvcf2` will skip the variant but not throw an exception. I would have expected, by default, the underlying error to propagate up as an exception. I'd be ok with skipping the variant if I had set some option to be "lenient". If I change the type to `String` all works fine, but the silent error did cause some strange downstream results.

You can see the VCF for the test above here:
`fail.vcf`
```
##fileformat=VCFv4.2
##CL=vcffilter -i filtered-phase-transfer.vcf.gz -o - --javascript "ensureFormatHeader(\"##FORMAT=
```

You can see the original GRCh38 NA12878 VCF below. You'll see that the type for the `PS` format field is set as `String`, which is not valid according to the VCF spec (sigh). I'll have to report that somewhere. But even so, I'd expected the parsing error to be propagated.

Getting the original VCF
```bash
bcftools view https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/latest/GRCh38/HG001_GRCh38_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf_PGandRTGphasetransfer.vcf.gz chr1:4001310-4001310 | grep PS
##CL=vcffilter -i filtered-phase-transfer.vcf.gz -o - --javascript "ensureFormatHeader(\"##FORMAT=
```