sgkit-dev / vcztools

Partial reimplementation of bcftools for VCF Zarr
Apache License 2.0

Roundtrip testing with hypothesis-vcf #31

Open tomwhite opened 4 months ago

tomwhite commented 4 months ago

We can use hypothesis-vcf to generate VCF files, convert them to Zarr with vcf2zarr, then convert back to VCF using vcztools, and check that the output is equivalent to the input.
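The round trip described above could be checked with something like the following minimal sketch. It is not the code from any branch: `data_lines` and `check_roundtrip` are hypothetical helper names, the two converters are passed in as plain callables rather than calling vcf2zarr or vcztools directly, and "equivalent" is assumed here to mean that the column header and record lines match exactly (the `##` meta lines are ignored, since tools may rewrite them).

```python
from typing import Callable, List


def data_lines(vcf_text: str) -> List[str]:
    """Return the #CHROM column header and record lines, skipping '##' meta lines."""
    return [
        line
        for line in vcf_text.splitlines()
        if line and not line.startswith("##")
    ]


def check_roundtrip(
    vcf_text: str,
    to_zarr: Callable[[str], object],
    to_vcf: Callable[[object], str],
) -> None:
    """Assert that VCF -> Zarr -> VCF preserves the header row and records.

    `to_zarr` and `to_vcf` are hypothetical wrappers (assumptions, not real
    library functions) around the two conversion steps.
    """
    store = to_zarr(vcf_text)
    result = to_vcf(store)
    assert data_lines(result) == data_lines(vcf_text)
```

In a real test, `vcf_text` would come from a hypothesis-vcf strategy, and the two callables would wrap the vcf2zarr and vcztools conversions, writing intermediate files under pytest's `tmp_path`.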

tomwhite commented 4 months ago

I tried this in a branch here: https://github.com/tomwhite/vcztools/tree/roundtrip-hypothesis. It found a failing case immediately:

>       encoder = _vcztools.VcfEncoder(
            num_variants,
            num_samples if num_samples is not None else 0,
            chrom=chrom,
            pos=pos,
            id=id,
            alt=alt,
            ref=ref,
            qual=qual,
            filter_ids=filters,
            filter=filter_,
        )
E       ValueError: Error occured: -204: 
E       Falsifying example: test_vcf_to_zarr_to_vcf__hypothesis_generated_vcf(
E           tmp_path=PosixPath('/private/var/folders/9j/h1v35g4166z6zt816fq7wymc0000gn/T/pytest-of-tom/pytest-136/test_vcf_to_zarr_to_vcf__hypot0'),
E           vcf_string='##fileformat=VCFv4.3\n##FILTER=<ID=PASS,Description="All filters passed">\n##source=hypothesis-vcf-0.1.dev6+g401c202\n##contig=<ID=0>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n0\t1\t.\tA\t.\t.\t.\t.\n',
E       )
E       vcf:
E       ##fileformat=VCFv4.3
E       ##FILTER=<ID=PASS,Description="All filters passed">
E       ##source=hypothesis-vcf-0.1.dev6+g401c202
E       ##contig=<ID=0>
E       #CHROM  POS ID  REF ALT QUAL    FILTER  INFO
E       0   1   .   A   .   .   .   .

vcztools/vcf_writer.py:257: ValueError

FWIW it found failures using numba too, but it's probably not worth trying to get both working.
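
For reference, the falsifying example above is small enough to inspect by hand. The sketch below uses only the standard library to list what the generated VCF contains: `summarize` is a hypothetical helper, and the sample count assumes the usual layout of eight fixed columns followed by FORMAT and then one column per sample. Which of these properties triggers the -204 error is not established here.

```python
# The vcf_string from the falsifying example, copied verbatim.
VCF_STRING = (
    "##fileformat=VCFv4.3\n"
    '##FILTER=<ID=PASS,Description="All filters passed">\n'
    "##source=hypothesis-vcf-0.1.dev6+g401c202\n"
    "##contig=<ID=0>\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "0\t1\t.\tA\t.\t.\t.\t.\n"
)


def summarize(vcf_text):
    """Return (column names, records as dicts, number of sample columns)."""
    lines = vcf_text.splitlines()
    header = next(line for line in lines if line.startswith("#CHROM"))
    columns = header.lstrip("#").split("\t")
    records = [
        dict(zip(columns, line.split("\t")))
        for line in lines
        if line and not line.startswith("#")
    ]
    # Eight fixed columns plus FORMAT precede any sample columns.
    num_samples = max(len(columns) - 9, 0)
    return columns, records, num_samples


columns, records, num_samples = summarize(VCF_STRING)
# Fields whose value is the VCF missing marker "."
missing = [name for name, value in records[0].items() if value == "."]
print(f"{len(records)} record(s), {num_samples} sample(s), missing: {missing}")
```

So the example is a single variant with no samples, where every field except CHROM, POS and REF is missing.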