Open ChristianCortes opened 7 years ago
I've also run into this issue when trying to extract SNPs from the latest (v92) Ensembl VCF file for mouse variation.
The file may be downloaded from: ftp://ftp.ensembl.org/pub/release-92/variation/vcf/mus_musculus/mus_musculus.vcf.gz
I've run into this too. Has this error been fixed?
Hi,
I am trying to extract SNPs with the hisat2_extract_snps_haplotypes_VCF.py script from the Ensembl VCF file (ftp://ftp.ensembl.org/pub/release-87/variation/vcf/danio_rerio/Danio_rerio.vcf.gz) to build a zebrafish index for the GRCz10_GCA_000002035.3 assembly (ftp://ftp.ensembl.org/pub/release-87/fasta/danio_rerio/dna/Danio_rerio.GRCz10.dna.toplevel.fa.gz).
I ran: hisat2_extract_snps_haplotypes_VCF.py -v Danio_rerio.GRCz10.dna.toplevel.fa Danio_rerio.vcf Danio_rerio.GRCz10.87
Issue: The script runs for some time and then stops with the following error:
Traceback (most recent call last):
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 892, in
args.verbose)
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 730, in main
genotypes)
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 688, in add_vars
tmp_vars = extract_vars(chr_dic, chr, pos, ref_allele, alt_alleles, varID)
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 103, in extract_vars
assert min_len >= 1
AssertionError
Note: The output file contains only chr1 information.
I also tested the dbSNP files from NCBI (ftp://ftp.ncbi.nih.gov/snp/organisms/zebrafish_7955/VCF/) and Zv9 (GCA_000002035.2, ftp://ftp.ensembl.org/pub/release-79/fasta/danio_rerio/dna/) and got the same problem. For example, with Danio_rerio.Zv9.chromosome.1.fa.gz and vcf_chr_1-vcf.gz, I got a similar error:
Traceback (most recent call last):
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 892, in
args.verbose)
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 730, in main
genotypes)
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 688, in add_vars
tmp_vars = extract_vars(chr_dic, chr, pos, ref_allele, alt_alleles, varID)
File "/Users/XRIS/bin/hisat2_extract_snps_haplotypes_VCF.py", line 113, in extract_vars
assert ref_allele2 != alt_allele
AssertionError
This error happens only with some of the per-chromosome VCF files.
Any help will be welcome,
Thanks in advance,
Christian
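One way to narrow down which records trigger the two assertions is to scan the VCF for suspicious REF/ALT pairs before running the extraction script. The sketch below is only a diagnostic under stated assumptions: it guesses that `assert min_len >= 1` fails when REF or an ALT allele is empty, and that `assert ref_allele2 != alt_allele` fails when an ALT allele is the same as REF. Neither condition is confirmed by the traceback alone, and the names `open_vcf`, `find_suspect_records` and `scan_vcf` are hypothetical helpers, not part of HISAT2.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_suspect_records(lines):
    """Yield (line_no, chrom, pos, reason) for records with odd alleles.

    Assumption: empty alleles and ALT alleles equal to REF are the kinds of
    records that could trip the two assertions shown in the tracebacks.
    """
    for line_no, line in enumerate(lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 5:
            yield line_no, fields[0], "", "fewer than 5 columns"
            continue
        chrom, pos, _var_id, ref, alt = fields[:5]
        alt_alleles = alt.split(",")
        if ref == "" or any(a == "" for a in alt_alleles):
            yield line_no, chrom, pos, "empty allele"
        elif any(a.upper() == ref.upper() for a in alt_alleles):
            yield line_no, chrom, pos, "ALT identical to REF"


def scan_vcf(path):
    """Print one tab-separated line per suspect record in the file."""
    with open_vcf(path) as handle:
        for record in find_suspect_records(handle):
            print(*record, sep="\t")
```

Calling `scan_vcf("Danio_rerio.vcf")` would list the line number, chromosome, position and reason for each flagged record. If nothing is flagged near the point where the script stops, the assertions are probably triggered by something else, for example alleles that only become identical after the script trims them.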