Closed cassiawag closed 4 years ago
Hi @cassiawag
But a few variants will have a negative number (not negative one indicating a missing alternate) but something like -27466 or -19240. These numbers are NOT the negative version of the number of reads listed in the vcf, and I can't figure out where they're coming from or why most of the variants display the correct information.
Could this be from large numbers overflowing the data type? By default calldata/AD and calldata/DP are given an int16 data type, which means any value larger than 32767 will overflow.
Ahh, yes, of course! It's fixed when I specify a larger data type. Obvious now. Thanks for the troubleshooting help.
No problem 😄
Hello! I'm using scikit-allel version 1.2.1 on python 3.6.10. I'm reading in vcf files using
vcf = allel.read_vcf(file, 'variants/POS', 'variants/REF', 'variants/ALT', 'calldata/DP', 'calldata/RD', 'calldata/AD', 'calldata/RDF', 'calldata/RDR', 'calldata/ADF', 'calldata/ADR')
.I have one sample in the vcf and only one variant per position, so I call total read depth and variant supporting reads using
vcf['calldata/DP'][:,0]
andvcf['calldata/AD'][:,0,0]
. When comparing back to the original vcf, this calls the correct values for most variants. But a few variants will have a negative number (not negative one indicating a missing alternate) but something like -27466 or -19240. These numbers are NOT the negative version of the number of reads listed in the vcf, and I can't figure out where they're coming from or why most of the variants display the correct information.Note: I haven't identified this behavior with any other calldata fields including
'calldata/RD'
,'calldata/RDF'
,'calldata/RDR'
,'calldata/ADF'
, and'calldata/ADR'
. I've been looking at lots of vcf files, and this behavior doesn't occur for every vcf file. Although I cannot find anything distinguishing about the vcf files that do result in negative numbers when read in with scikit-allel. The VCF files I'm looking at are all generated through the same pipeline, so differences would have to be with the original reads used to generate them.Thanks for your help!
And sorry for all the opening and closing of this issue. I accidentally opened the issue before I'd finished typing it all out :/