brentp / somalier

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"
MIT License

headers for unprefixed genome builds have chr prefix #57

Open rdmorin opened 4 years ago

rdmorin commented 4 years ago

Hi. I am not sure whether this is a problem for Somalier, but I noticed that in both sites.hg38.nochr.vcf.gz and sites.GRCh37.vcf.gz the variant records use unprefixed chromosome names while the header still declares chr-prefixed contig names. For example:

zgrep contig= sites.GRCh37.vcf.gz | head
##contig=<ID=chr1,length=249250621>
##contig=<ID=chr10,length=135534747>
##contig=<ID=chr11,length=135006516>

This might cause confusion or problems in other workflows, even if it is compatible with Somalier.
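For anyone who wants to check a sites file for this kind of mismatch, here is a minimal sketch in Python using only the standard library. It is not part of Somalier. The function names `contig_names` and `undeclared_chroms` are hypothetical, and the sketch assumes a plain or gzip/bgzip-compressed VCF whose header uses `##contig=<ID=...>` lines.

```python
import gzip


def contig_names(vcf_path):
    """Return (header_contigs, record_chroms) as two sets of names."""
    # bgzip output is a valid multi-member gzip stream, so gzip.open can read it.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    header, records = set(), set()
    with opener(vcf_path, "rt") as fh:
        for line in fh:
            if line.startswith("##contig=<"):
                # e.g. ##contig=<ID=chr1,length=249250621>
                body = line.strip()[len("##contig=<"):-1]
                for field in body.split(","):
                    if field.startswith("ID="):
                        header.add(field[3:])
            elif not line.startswith("#") and line.strip():
                # First tab-separated column of a record is CHROM.
                records.add(line.split("\t", 1)[0])
    return header, records


def undeclared_chroms(vcf_path):
    """Chromosome names used by records but never declared in the header."""
    header, records = contig_names(vcf_path)
    return sorted(records - header)
```

Calling `undeclared_chroms("sites.GRCh37.vcf.gz")` on a file shaped like the one described above should list the unprefixed names that have no matching contig line. An empty list means every record's chromosome is declared in the header.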