hsinnan75 / GSAlign

GSAlign: an ultra-fast sequence alignment algorithm for intra-species genome comparison
MIT License

Multiple variants assigned to same reference position with different reference sequence #18

Open tintori opened 2 years ago

tintori commented 2 years ago

Hello,

I just tried using GSAlign and ran into something strange in the VCF output that looks like a bug to me. At certain reference positions, multiple variants are listed, each with a different reference base. For example:

contig_10_pilon 6241247 . A g 100 TYPE=SUBSTITUTE
contig_10_pilon 6241247 . T c 100 TYPE=SUBSTITUTE
contig_10_pilon 6241249 . A c 100 TYPE=SUBSTITUTE
contig_10_pilon 6241249 . T g 100 TYPE=SUBSTITUTE
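
For anyone who wants to check their own output for the same pattern, here is a minimal sketch in Python. It is not part of GSAlign; the function name `find_conflicting_refs` is made up for illustration, and it assumes whitespace-separated records with CHROM, POS, ID and REF as the first four columns, as in the lines above. Since every REF allele at a given position must begin with the reference base at that position, it compares only the first base of each REF, so indels of different lengths at the same site are not flagged by mistake.

```python
from collections import defaultdict


def find_conflicting_refs(vcf_lines):
    """Map (chrom, pos) -> sorted first REF bases, for sites where they disagree.

    Hypothetical helper for illustration; assumes whitespace-separated
    columns in the order CHROM, POS, ID, REF, ...
    """
    first_bases = defaultdict(set)
    for line in vcf_lines:
        line = line.strip()
        # Skip blank lines and VCF header/meta lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4 or not fields[3]:
            continue
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        # All REF alleles at one POS start at the same reference base,
        # so their first characters should be identical.
        first_bases[(chrom, pos)].add(ref[0].upper())
    return {
        site: sorted(bases)
        for site, bases in first_bases.items()
        if len(bases) > 1
    }


# The four records from this report.
records = [
    "contig_10_pilon 6241247 . A g 100 TYPE=SUBSTITUTE",
    "contig_10_pilon 6241247 . T c 100 TYPE=SUBSTITUTE",
    "contig_10_pilon 6241249 . A c 100 TYPE=SUBSTITUTE",
    "contig_10_pilon 6241249 . T g 100 TYPE=SUBSTITUTE",
]

for (chrom, pos), bases in sorted(find_conflicting_refs(records).items()):
    print(chrom, pos, ",".join(bases))
```

On the four records above this reports both positions, 6241247 and 6241249, each with reference bases A and T. To scan a whole file, the open file handle can be passed in place of `records`.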

I'd be happy to provide my original FASTA files if you would like to look into this and would find them useful for reproducing the error.

Thanks, Sophie