Conversion of Complete Genomics var file to gVCF
This conversion assumes that the genome file is build 37 and does not currently support other builds.
This is a Python package in the Python Package Index (PyPI). You can install it with
pip install cgivar2gvcf
This installation will also install the twobitreader package.
You can run this tool on the command line like this:
python -m cgivar2gvcf -h
The above command will display the program's options.
Notably, you need a copy of the UCSC 2bit reference genome to perform conversion.
The command line tool expects you to provide a directory where this file exists
(it should have the name hg19.2bit). If it's not present, the tool will download
a copy into this directory.
An example command for a variant-only VCF file (not gVCF):
python -m cgivar2gvcf -d files/ -i var-GS00253-ASM.tsv.bz2 --var-only -o GS00253-vcf-from-var.vcf.bz2
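To launch the same conversion from a script, the command above can be assembled as an argument list. This is a minimal sketch: the helper name build_command is hypothetical and not part of the package, and it uses only the flags shown in the example (-d, -i, --var-only, -o).

```python
import sys


def build_command(ref_dir, cgi_input, output, var_only=False):
    """Assemble the cgivar2gvcf command line as an argument list.

    Hypothetical helper for illustration; not part of the package.
    """
    # sys.executable is the Python interpreter running this script.
    cmd = [sys.executable, "-m", "cgivar2gvcf", "-d", ref_dir, "-i", cgi_input]
    if var_only:
        # Request a variant-only VCF instead of a gVCF.
        cmd.append("--var-only")
    cmd += ["-o", output]
    return cmd


cmd = build_command(
    "files/",
    "var-GS00253-ASM.tsv.bz2",
    "GS00253-vcf-from-var.vcf.bz2",
    var_only=True,
)
print(" ".join(cmd[1:]))

# To actually run the conversion (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a file name contains spaces.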
The package can also be used from Python. This function writes the new VCF file to the specified output destination:
convert_to_file(cgi_input, output_file, twobit_ref, twobit_name, var_only=False)
This function returns a generator object that yields lines of the VCF file:
convert(cgi_input, twobit_ref, twobit_name, var_only=False)
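Because convert yields the VCF one line at a time, its output can be processed as a stream without holding the whole file in memory. The sketch below shows one way to consume such a generator. Both count_records and fake_vcf_lines are hypothetical: the second is a stand-in with made-up lines so the example runs without the package. The only assumption about the data is the VCF convention that header lines start with #.

```python
def count_records(lines):
    """Count header lines and record lines in an iterable of VCF lines."""
    n_header = 0
    n_records = 0
    for line in lines:
        if line.startswith("#"):
            n_header += 1
        elif line.strip():
            # Any non-empty, non-header line is treated as a record.
            n_records += 1
    return n_header, n_records


def fake_vcf_lines():
    """Stand-in for the generator returned by convert; data is made up."""
    yield "##fileformat=VCFv4.2\n"
    yield "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n"
    yield "chr1\t12345\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\n"
    yield "chr1\t23456\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\n"


header_count, record_count = count_records(fake_vcf_lines())
print(header_count, record_count)
```

With the package installed, the generator returned by convert could presumably be passed to count_records in place of the stand-in.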
This is a convenience function that finds the UCSC reference genome in a specified directory and downloads it if it's not present:
get_reference_genome_file(refseqdir, build)
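The find-or-download behavior described above can be sketched in a few lines. This is not the package's implementation: locate_reference and its fetch parameter are hypothetical, and the download step is left to a caller-supplied function so that the sketch makes no assumption about where the file comes from.

```python
import os
import tempfile


def locate_reference(refseqdir, filename="hg19.2bit", fetch=None):
    """Return the path to the reference file, fetching it if it is missing.

    Hypothetical sketch; `fetch` is any callable that writes the file
    to the path it is given.
    """
    if not os.path.isdir(refseqdir):
        raise ValueError("No directory at {}".format(refseqdir))
    path = os.path.join(refseqdir, filename)
    if not os.path.exists(path):
        if fetch is None:
            raise FileNotFoundError(path)
        fetch(path)
    return path


def fake_fetch(path):
    """Stand-in for a real download: writes a small placeholder file."""
    with open(path, "wb") as handle:
        handle.write(b"placeholder")


with tempfile.TemporaryDirectory() as tmp:
    # First call: the file is missing, so fetch is invoked.
    first = locate_reference(tmp, fetch=fake_fetch)
    found = os.path.exists(first)
    # Second call: the file is already present, so fetch is not invoked.
    calls = []
    second = locate_reference(tmp, fetch=calls.append)
```

Keeping the download behind a callable makes the lookup logic easy to test without network access.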