freeseek / score

Tools to work with GWAS-VCF summary statistics files
MIT License

insertion became deletion after liftover #7

Open nzhun opened 2 months ago

nzhun commented 2 months ago

Here is an example where an insertion was lifted over to a deletion: hg19 (GRCh37) chr19:54754990_C/CGG was lifted over to hg38 chr19:54251125:CGG/C

freeseek commented 2 months ago

BCFtools/liftover correctly handles indel swaps between references, unlike Picard/LiftoverVcf or CrossMap/VCF, which cannot do this. You can use the tool to observe this behavior. This is what the variant looks like in hg19:

$ echo -e "##fileformat=VCFv4.2\n##contig=<ID=chr19>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr19\t54754990\t.\tC\tCGG\t.\t.\t." | bcftools +liftover -- -c hg19ToHg38.over.chain.gz -s hg19.fa | bcftools view -H
chr19   54754990    .   CGGC    CGGGGC  .   .   .

In hg19 the reference allele has two G's and the alternate allele has four G's. When you lift over to hg38:

$ echo -e "##fileformat=VCFv4.2\n##contig=<ID=chr19>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr19\t54754990\t.\tC\tCGG\t.\t.\t." | bcftools +liftover -- -c hg19ToHg38.over.chain.gz -s hg19.fa -f hg38.fa --no-left-align | bcftools view -H
Lines   total/swapped/reference added/rejected: 1/1/0/0
chr19   54251125    .   CGGGGC  CGGC    .   .   SWAP=1

In hg38 the reference allele has four G's and the alternate allele has two G's, so what was the alternate allele in hg19 is now the reference allele, as it should be.
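
To make the swap concrete, here is a minimal Python sketch. It is not part of BCFtools/liftover, and the helper names `trim_alleles` and `classify` are hypothetical, introduced only for illustration. It trims the padded alleles from the two outputs above back to their minimal form, assuming simple suffix-then-prefix trimming is enough for this case. The same pair of haplotypes then reads as an insertion against hg19 and as a deletion against hg38:

```python
from typing import Tuple


def trim_alleles(ref: str, alt: str) -> Tuple[str, str]:
    """Trim bases shared by REF and ALT, always keeping at least one base in each.

    Hypothetical helper for illustration; it does not reproduce bcftools behavior.
    """
    # Drop shared trailing bases first.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop shared leading bases.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
    return ref, alt


def classify(ref: str, alt: str) -> str:
    """Label a REF/ALT pair by comparing allele lengths."""
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "substitution"


# Padded alleles copied from the two bcftools outputs in this thread.
hg19_ref, hg19_alt = "CGGC", "CGGGGC"
hg38_ref, hg38_alt = "CGGGGC", "CGGC"

# The two haplotypes are the same in both builds; only their roles are exchanged.
assert hg19_alt == hg38_ref and hg19_ref == hg38_alt

for build, ref, alt in (("hg19", hg19_ref, hg19_alt), ("hg38", hg38_ref, hg38_alt)):
    short_ref, short_alt = trim_alleles(ref, alt)
    print(build, short_ref, short_alt, classify(short_ref, short_alt))
```

The trimmed pairs come out as C/CGG for hg19 and CGG/C for hg38, which matches the alleles reported in the original question.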