j2moreno closed 4 years ago
The GRCh37 VCF file lists chromosomes as RefSeq accessions:
$ bcftools query -f '%ID %CHROM %POS\n' tmp/data/grch37_vcf-03-11-2020/GCF_000001405.25.gz | head
rs775809821 NC_000001.10 10019
rs1008829651 NC_000001.10 10043
Fields snptk needs in order to read the file properly: ID, CHROM, and POS. The following command will be used to extract them:
bcftools query -f '%ID %CHROM %POS\n' <input_file>
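The GRCh37 VCF gives chromosomes as RefSeq accessions (e.g. NC_000001.10) rather than chromosome names (e.g. 1), so the extracted file carries the same accessions: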
$ zcat tmp/data/dbsnp-GRCh37.gz | head
rs775809821 NC_000001.10 10019
rs978760828 NC_000001.10 10039
rs1008829651 NC_000001.10 10043
rs1052373574 NC_000001.10 10051
rs1326880612 NC_000001.10 10051
rs768019142 NC_000001.10 10055
An additional script is needed to map each SNP's RefSeq accession to the correct chromosome.
https://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.25.gz
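A minimal sketch of what such a mapping script could look like, assuming the three-column `ID CHROM POS` layout shown above. The function names (`refseq_to_chrom`, `remap_line`, `remap_stream`) are hypothetical and not part of snptk. The mapping relies on the RefSeq convention that NC_000001 through NC_000022 are the autosomes, NC_000023 is X, NC_000024 is Y, and NC_012920 is the mitochondrial genome. Dropping anything else (e.g. unplaced contigs) is an assumption about the desired behavior.

```python
import re

# Matches RefSeq chromosome accessions such as "NC_000001.10".
# The version suffix after the dot is ignored.
_REFSEQ_RE = re.compile(r"^NC_(\d{6})\.\d+$")


def refseq_to_chrom(accession):
    """Return the chromosome name for a RefSeq accession, or None if unknown."""
    match = _REFSEQ_RE.match(accession)
    if match is None:
        return None
    number = int(match.group(1))
    if 1 <= number <= 22:
        return str(number)
    if number == 23:
        return "X"
    if number == 24:
        return "Y"
    if number == 12920:
        return "MT"
    return None


def remap_line(line):
    """Rewrite one 'ID CHROM POS' line; return None if the contig is not mapped."""
    rsid, accession, pos = line.split()
    chrom = refseq_to_chrom(accession)
    if chrom is None:
        return None  # assumption: skip records on contigs we cannot map
    return f"{rsid} {chrom} {pos}"


def remap_stream(lines):
    """Yield remapped lines, skipping blank lines and unmapped contigs."""
    for line in lines:
        if not line.strip():
            continue
        remapped = remap_line(line)
        if remapped is not None:
            yield remapped
```

The output of the `bcftools query` command above could be passed line by line through `remap_stream` before being written to the file snptk reads. An alternative that may be worth considering is `bcftools annotate --rename-chrs`, which renames contigs in the VCF itself from a two-column old-name/new-name mapping file.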