EBIvariation / vcf-validator

Validation suite for Variant Call Format (VCF) files, implemented using C++11
Apache License 2.0

T2D-268 Add support for retrieving reference sequence from ENA #176

Closed Zhicheng-Liu closed 5 years ago

Zhicheng-Liu commented 5 years ago

With this change, the assembly checker can read compressed FASTA files, read FASTA files from a remote location, and download contig reference sequences on demand from the ENA API.

Zhicheng-Liu commented 5 years ago

@jmmut I have fixed the issues you mentioned above:

* when only the vcf is passed to the program (`vcf_assembly_checker -i my.vcf`):

  * empty files are created if the contig is not found in ENA
  * a segmentation fault occurs when freeing the IFasta in the main function, leaving the files in the folder
  * when neither `##reference` nor `##contig` metadata rows are present, no RemoteContig download is triggered, and a segfault occurs

These were introduced with the new implementation, which I did not test properly. They should now be fixed by 8ef6838. A test case has also been added to cover this scenario.
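For illustration only, here is a minimal sketch of one way to avoid leaving empty files behind when a contig is not found. It is not the code from 8ef6838; `has_content` and `keep_if_nonempty` are hypothetical names, and the sketch assumes the download has already been written to a local path.

```cpp
#include <cstdio>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: true if the file at `path` can be opened and holds at least one byte.
bool has_content(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    return in.is_open() && static_cast<std::streamoff>(in.tellg()) > 0;
}

// Hypothetical helper: keep a downloaded file only if it has content.
// An empty or missing file is treated as "contig not found": any empty file is removed
// and false is returned so the caller can report the problem instead of using the file.
bool keep_if_nonempty(const std::string& path) {
    if (has_content(path)) {
        return true;
    }
    std::remove(path.c_str());  // ignore the result; the file may not exist at all
    return false;
}
```

A caller would check the return value before opening the file as a reference sequence, and skip or report the contig when it is `false`.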

* when a remote contig is downloaded, it is loaded into memory in a stringstream before being written to a file

Good catch. My bad again. The new implementation was supposed to fix this issue. There were starts and stops along the way, so I must have forgotten something at some point. Anyway, it is now fixed in e5ceed1.