Closed ffinfo closed 6 years ago
Something good to know. When using this vcf file it does work:
##fileformat=VCFv4.2
##INFO=<ID=BLA,Number=.,Type=String,Description="Depth">
##INFO=<ID=EMPTY,Number=.,Type=String,Description="Depth">
##FORMAT=<ID=GT,Number=.,Type=String,Description="Genotype">
##FORMAT=<ID=BLA,Number=.,Type=Integer,Description="Genotype">
##FORMAT=<ID=EMPTY,Number=.,Type=Integer,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample_1 Sample_2 Sample_3
chrQ 1 . A T . . BLA=1 GT:BLA 0/1:5 0/1:1 0/1:5
chrQ 2 . A T . . BLA=1,2,1 GT:BLA 0/1:5 0/0:5 0/0:5
chrQ 3 . A T . . BLA=2 GT:BLA 0/1:1 0/1:5 0/0:5
chrQ 4 . A T . . BLA=3 GT 0/0 0/0 0/0
This means somewhere in adam the DP must always be a single number while the vcf specs does not force this ;)
This means somewhere in adam the DP must always be a single number while the vcf specs does not force this ;)
Well, actually it does. DP is a reserved INFO field, and if you provide a definition in the ##INFO
header line for ID=DP
other than Number=1,Type=Integer
, it will get clobbered by htsjdk.
From VCF 4.2 spec
DP : combined depth across samples, e.g. DP=154
From VCF 4.3 draft spec
Table 1: Reserved INFO fields
Field | Number | Type | Description |
---|---|---|---|
DP | 1 | Integer | Combined depth across samples |
https://github.com/samtools/hts-specs/blob/master/VCFv4.3.tex#L350
We follow the 4.3 specification as close as possible. VCF INFO reserved key AD
for total read depth for each allele is Number=R,Type=Integer
, perhaps that better fits your use case.
That said, we should catch this on read and provide a useful error message.
Sigh, if only the htsjdk devs had ever heard of a logging framework
if ( needsRepair ) {
if ( GeneralUtils.DEBUG_MODE_ENABLED ) {
System.err.println("Repairing standard header line for field ...
}
return standard;
}
System.err.println, that's hot.
Was not yet aware of the 4.3 specs. I think it's a good thing to have this in the specs. But indeed a bit more explanation on the error would be nice indeed. I did get confused by error like you notice ;)
Just wanted to check in on this; is there any action needed on this, e.g. additional logging (either here or upstream) or more docs?
Will need a bit more digging to see if we can catch the issue before htsjdk silently "repairs" it for us.
I don't think there's more action on this; so I'm closing this as won't fix until htsjdk moves to VCF 4.3.
On gitter I did already mentioned this issue but now also including Code, vcf file and stacktrace.
For this I have used:
Code
Vcf file
Stacktrace