Closed mxtu97 closed 6 months ago
Hi @mxtu97 ,
I need more information: there is no "variety" field in the VCF format. What did you change in your VCF? Anyway, the error is not related to that; I guess you have some malformed genotypes. If you can share a few thousand lines from your VCF, I might be able to help...
Edgardo
Yeah, of course. These are the first few lines of the VCF file; the variety names are not all listed here.
1 65427 1_65427 2 1 . . PR GT 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/1 0/0 0/0
1 110988 1_110988 2 1 . . PR GT 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/1 0/0 1/1 0/0 1/1 0/0 0/0 0/0 0/0 0/0
1 124600 1_124600 2 1 . . PR GT 0/0 0/0 0/0 1/1 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/1 0/0
1 124713 1_124713 2 1 . . PR GT 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/1 0/0 0/0 0/0 0/1 0/0
1 124751 1_124751 2 1 . . PR GT 0/0 0/0 0/1 1/1 0/1 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/1 1/1 0/0 0/0 0/1 1/1 1/1
1 124781 1_124781 2 1 . . PR GT 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/1 0/0
1 124787 1_124787 2 1 . . PR GT 0/0 0/0 0/1 1/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/1 0/0
1 125023 1_125023 2 1 . . PR GT 0/0 0/0 1/1 1/1 0/1 0/0 0/0 0/0 0/1 1/1 0/0 0/0 0/0 0/0 0/0 0/0 1/1 0/0 0/0 0/1 1/1 0/0
As I suspected, your VCF has a non-standard format: fields 4 and 5 (REF and ALT) must be nucleotides, not numbers. In your file they contain "2" and "1". See here:
https://en.m.wikipedia.org/wiki/File:Binary_BCF_versus_VCF_format.png
Interesting, how did you generate this VCF?
Edgardo
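For anyone hitting the same problem, here is a minimal sketch of how the REF and ALT columns could be checked before running the conversion. It is not part of vcf2phylip; the function name `find_non_nucleotide_alleles` is made up for illustration, the second sample record is invented, and the check assumes that only A/C/G/T/N strings (plus `.` and `*` in ALT) count as valid, so symbolic alleles such as `<DEL>` would also be reported.

```python
import re

# Assumption: a valid allele is a plain string of A/C/G/T/N (either case).
NUCLEOTIDES = re.compile(r"[ACGTNacgtn]+")


def find_non_nucleotide_alleles(lines):
    """Return (line_number, chrom, pos, ref, alt) for records with odd REF/ALT."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.split()  # VCF is tab-delimited; split() also copes with spaces
        if len(fields) < 5:
            continue  # too short to contain CHROM, POS, ID, REF, ALT
        chrom, pos, _id, ref, alt = fields[:5]
        ref_ok = NUCLEOTIDES.fullmatch(ref) is not None
        # ALT may list several alleles; '.' and '*' are accepted as-is here.
        alt_ok = all(
            a in (".", "*") or NUCLEOTIDES.fullmatch(a) is not None
            for a in alt.split(",")
        )
        if not (ref_ok and alt_ok):
            problems.append((line_number, chrom, pos, ref, alt))
    return problems


if __name__ == "__main__":
    sample = [
        "1\t65427\t1_65427\t2\t1\t.\t.\tPR\tGT\t0/0\t0/1",  # shaped like the records above
        "1\t65500\t1_65500\tA\tG\t.\t.\tPR\tGT\t0/0\t0/1",  # invented, well-formed record
    ]
    for problem in find_non_nucleotide_alleles(sample):
        print(problem)
```

To scan a real file, the same function can be given an open file handle, for example `find_non_nucleotide_alleles(open("test.vcf"))`.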
The problem is solved; it was indeed caused by the genotypes. Thanks.
Great, don't hesitate to ask if you find a new issue...
Edgardo
I have modified the variety names to be four digits, such as 4111, ensuring they are within 10 characters. However, I am still encountering this error. Could you please show me how to resolve it?
$ python3 vcf2phylip-2.8/vcf2phylip.py -i test.vcf
Converting file 'test.vcf':
Number of samples in VCF: 290
Traceback (most recent call last):
  File "vcf2phylip-2.8/vcf2phylip.py", line 502, in <module>
    main()
  File "vcf2phylip-2.8/vcf2phylip.py", line 316, in main
    site_tmp = get_matrix_column(record, num_samples,
  File "vcf2phylip-2.8/vcf2phylip.py", line 129, in get_matrix_column
    column += AMBIG[geno_nuc]
KeyError: '2'
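The last frame of the traceback is a dictionary lookup that fails on the key '2'. That is what one would expect if an allele taken from the REF/ALT columns is a digit rather than a nucleotide. The following is only a rough sketch of that failure mode: `IUPAC_SKETCH` and `genotype_to_code` are hypothetical stand-ins written for illustration, not vcf2phylip's actual `AMBIG` table or code, and the sketch assumes diploid calls with no missing genotypes.

```python
# Hypothetical stand-in for a genotype-to-IUPAC lookup table; not the real AMBIG.
IUPAC_SKETCH = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
}


def genotype_to_code(ref, alt, genotype):
    """Translate a GT value such as '0/1' into a single IUPAC character."""
    alleles = [ref] + alt.split(",")  # index 0 is REF, 1.. are the ALT alleles
    indices = genotype.replace("|", "/").split("/")
    called = {alleles[int(i)] for i in indices}  # assumes no missing calls ('.')
    key = "".join(sorted(called))
    return IUPAC_SKETCH[key]  # raises KeyError when an allele is not a nucleotide


if __name__ == "__main__":
    print(genotype_to_code("A", "G", "0/1"))  # heterozygous A/G
    try:
        genotype_to_code("2", "1", "0/0")  # REF/ALT shaped like the posted records
    except KeyError as err:
        print("KeyError:", err)
```

With nucleotide alleles the lookup succeeds; with REF set to "2" the key built from a 0/0 genotype is "2", which no nucleotide-keyed table contains.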