I think this issue is related to something that has been mentioned previously here, but I'm not sure that it is resolved.
I am using GSAlign to align and call variants from a series of assembled genomes from cultivars (individuals) within a species. I have used exactly the same reference genome file for each alignment. At some positions I am getting different REFERENCE bases in the VCF files (example below):
(ncbi_datasets) xxxx@server:~/pangenome/test$ grep 5535808 genome1.vcf
chr1 5535808 . T c 100 * TYPE=SUBSTITUTE
(ncbi_datasets) xxxx@server:~/pangenome/test$ grep 5535808 genome2.vcf
chr1 5535808 . A g 100 * TYPE=SUBSTITUTE
The correct ref allele is "A" in this case.
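In case it helps anyone reproduce this, here is a minimal sketch for listing every record whose REF allele disagrees with the reference FASTA. It is not part of GSAlign; the function names `read_fasta` and `ref_mismatches` are made up for illustration, and it assumes the whole reference fits in memory and that the sequence names in the VCF match the first word of each FASTA header.

```python
def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Use the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def ref_mismatches(vcf_lines, seqs):
    """Yield (chrom, pos, vcf_ref, fasta_ref) where the VCF REF disagrees with the FASTA."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        # VCF positions are 1-based; Python slices are 0-based.
        expected = seqs[chrom][pos - 1 : pos - 1 + len(ref)]
        if expected.upper() != ref.upper():
            yield chrom, pos, ref, expected
```

To run it on real files, pass open file handles, e.g. `ref_mismatches(open("genome1.vcf"), read_fasta(open("reference.fa")))`, where `reference.fa` is a placeholder for the reference file name.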
I am using GSAlign v1.0.22. Any help would be most gratefully accepted, as I need to do a fair number of these alignments for large genomes and GSAlign seems to be by far the fastest tool available for this.
Oh yes, one other thing: could you possibly fix the exported header so that the "*" in the FILTER field is recognized by bcftools (for the eventual VCF merge operation)?
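Until the header is fixed, one possible workaround is to rewrite the FILTER column before merging. The sketch below replaces a literal "*" with ".", which is the VCF missing value for FILTER. The function name `normalize_filter` is hypothetical, the code assumes tab-separated records, and I have not checked whether bcftools then accepts the rest of the GSAlign header.

```python
def normalize_filter(vcf_lines, replacement="."):
    """Yield VCF lines with a literal '*' in the FILTER column replaced."""
    for line in vcf_lines:
        if line.startswith("#"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # FILTER is the 7th fixed column (index 6) of a VCF record.
        if len(fields) > 6 and fields[6] == "*":
            fields[6] = replacement
        yield "\t".join(fields) + "\n"
```

Writing the result out is then just `out.writelines(normalize_filter(open("genome1.vcf")))` with `out` being an open output file.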
I'm seeing the same issue with some bacterial genomes. It looks like a bug.
My guess is that GSAlign gets confused when contigs are aligned on their reverse-complements.
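The example in the original report is at least consistent with that guess: the T>c call in genome1 is exactly the complement of the A>g call in genome2. A quick sanity check (the helper `reverse_complement` is only an illustration, not GSAlign code, and this does not prove the cause):

```python
# Translation table for complementing DNA bases, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# The REF and ALT reported for genome1 map onto those reported for genome2.
assert reverse_complement("T") == "A"
assert reverse_complement("c") == "g"
```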
many thanks