PapenfussLab / gridss

GRIDSS: the Genomic Rearrangement IDentification Software Suite

bcftools failed with GRIDSS output #614

Closed lmanchon closed 1 year ago

lmanchon commented 1 year ago

bcftools norm -m-both -f $GENOME/GRCh38_p13.fa -Oz -o gridss.fixed.vcf.gz gridss_output.vcf

Non-ACGTN alternate allele at chr1:1222499 .. REF_SEQ:'(null)' vs VCF:']chr4:54278520]GC'

VCF from gridss is not well recognized by bcftools

d-cameron commented 1 year ago

VCF from gridss is not well recognized by bcftools

GRIDSS output adheres to the VCF version 4.2 specification; see section 5.4 (https://samtools.github.io/hts-specs/VCFv4.2.pdf). The problem is that bcftools does not support ALT alleles in breakpoint notation (or breakend notation, see section 5.4.9), so this should be raised as a bug with bcftools.
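To illustrate why the ALT allele in the error message is valid, here is a minimal Python sketch that recognises the breakend ALT forms described in section 5.4 of the VCF 4.2 specification (t[p[, t]p], ]p]t, [p[t, plus the "t." and ".t" forms from section 5.4.9). The function name classify_alt and the regular expressions are illustrative assumptions, not part of GRIDSS or bcftools, and the sketch ignores inserted-sequence edge cases beyond plain ACGTN bases.

```python
import re

# Paired breakend (breakpoint) ALT forms from VCF 4.2 section 5.4:
#   t[p[   t]p]   ]p]t   [p[t
# t is the replacement sequence, p is the mate position "chrom:pos".
_BREAKPOINT = re.compile(
    r"^(?P<lead>[ACGTNacgtn]*)"
    r"(?P<bracket>[\[\]])(?P<chrom>[^\[\]]+):(?P<pos>\d+)(?P=bracket)"
    r"(?P<trail>[ACGTNacgtn]*)$"
)
# Forms from section 5.4.9: "t." or ".t"
_SINGLE = re.compile(r"^(?:[ACGTNacgtn]+\.|\.[ACGTNacgtn]+)$")


def classify_alt(alt):
    """Hypothetical helper: classify an ALT string as breakpoint,
    single breakend, or other. Returns (kind, mate_chrom, mate_pos)."""
    m = _BREAKPOINT.match(alt)
    # Exactly one side of the bracketed mate position carries bases.
    if m and bool(m.group("lead")) != bool(m.group("trail")):
        return ("breakpoint", m.group("chrom"), int(m.group("pos")))
    if _SINGLE.match(alt):
        return ("single_breakend", None, None)
    return ("other", None, None)


# The ALT allele from the bcftools error message above.
print(classify_alt("]chr4:54278520]GC"))
```

A check like this could be used to separate breakend records from ordinary sequence alleles before passing only the latter to a tool that expects ACGTN alleles.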

d-cameron commented 1 year ago

Be aware that for inversion-like breakends in +/+ or -/- notation, normalisation/left-alignment doesn't make sense, since left-aligning one side forces the other side to be right-aligned. Additionally, imprecise variants are reported in the middle of the 95% confidence interval, since that results in the least positional deviation from the possible positions.

For the above reasons, GRIDSS outputs centre-aligned variants. If you want to left-align, you can adjust the positions by the intervals specified in the HOMPOS or CIPOS INFO field (depending on whether you want to adjust just the precise variants with breakpoint microhomology or want to also shift IMPRECISE variants).
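As a rough sketch of the adjustment described above, the Python function below shifts POS to the lower bound of an interval taken from an INFO field. It assumes the field holds "start,end" offsets relative to POS (as CIPOS does in the VCF specification); the function name left_aligned_pos and its parameters are hypothetical. It only computes the new position: the REF base, the ALT string and the mate record would also need updating, and for +/+ or -/- breakends the mate moves in the opposite direction, as noted earlier.

```python
def left_aligned_pos(pos, info, field="CIPOS", include_imprecise=False):
    """Hypothetical helper: return POS shifted to the lower bound of the
    interval stored in `field` of a VCF INFO string.

    Assumes `field` has the form "lower,upper" with offsets relative to POS.
    """
    entries = info.split(";")
    flags = {e for e in entries if "=" not in e}
    values = dict(e.split("=", 1) for e in entries if "=" in e)

    # Leave IMPRECISE calls centre-aligned unless explicitly requested.
    if "IMPRECISE" in flags and not include_imprecise:
        return pos
    # No interval recorded: nothing to shift.
    if field not in values:
        return pos

    lower, _upper = (int(x) for x in values[field].split(","))
    return pos + lower


# Example with made-up INFO values (not taken from real GRIDSS output).
print(left_aligned_pos(1000, "SVTYPE=BND;CIPOS=-3,3"))
```

Passing field="HOMPOS" would apply the same arithmetic to the homology interval, assuming that field uses the same offset layout.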