Hi, @mnshgl0110
I'm using SyRI to identify SNPs and indels in chloroplast genomes. There are about 200 genomes.
I'm wondering if there is any way to merge the VCFs from these genomes, similar to how the GATK multi-sample pipeline combines g.vcf files into a single VCF.
For example,
VCF1
CHROM POS REF 534M_1
nip-l 15 T G
nip-l 16 C A
nip-l 412 T C
nip-l 4547 G T
nip-l 6283 T C
nip-l 6609 G T
nip-l 7135 T C
nip-l 8128 A G
nip-l 12498 A G
nip-l 12801 G A
VCF2
CHROM POS REF Gla4_1
nip-l 33463 C T
nip-l 34018 A G
nip-l 35384 G A
nip-l 47709 A C
nip-l 48529 C T
nip-l 49850 C T
nip-l 50244 A C
nip-l 51344 T A
nip-l 52142 C T
nip-l 53515 C T
What I want
CHROM POS REF 534M_1 Gla4_1
nip-l 15 T G .
nip-l 16 C A .
nip-l 115 T . .
nip-l 412 T C .
nip-l 448 A . .
nip-l 679 A . .
nip-l 817 A . .
nip-l 1227 C . .
nip-l 1494 C . .
The problem is that I can't figure out whether '.' represents a missing (NA) value or the REF allele, because non-variant sites are not reported in the VCF.
Could you please give me some advice?
Thanks a lot.
I think you can try the syri/scripts/vcfasm script. It can filter SNPs and indels out of the SyRI output VCF. Once you have the SNP/indel VCFs, I think you can merge them easily.
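In case it helps, here is a minimal sketch of the merge step in Python, assuming each sample's SNP/indel calls have already been parsed into (CHROM, POS, REF, ALT) tuples. The function name `merge_calls` and the `missing_as_ref` switch are hypothetical, not part of SyRI. The switch makes the '.' versus REF choice explicit: filling with REF is only safe if the site was actually aligned in that sample, which the variant calls alone do not tell you.

```python
from typing import Dict, List, Tuple

# One call: (chrom, pos, ref, alt). This layout is an assumption for the sketch.
Record = Tuple[str, int, str, str]


def merge_calls(samples: Dict[str, List[Record]],
                missing_as_ref: bool = False) -> List[List[str]]:
    """Merge per-sample calls into one table with a column per sample.

    Sites absent from a sample are written as '.' by default, or as the
    REF allele when missing_as_ref is True (hypothetical option).
    """
    sites: Dict[Tuple[str, int, str], Dict[str, str]] = {}
    for name, records in samples.items():
        for chrom, pos, ref, alt in records:
            sites.setdefault((chrom, pos, ref), {})[name] = alt

    names = list(samples)
    rows = [["CHROM", "POS", "REF", *names]]
    for key in sorted(sites):  # sorted by chromosome, then position
        chrom, pos, ref = key
        fill = ref if missing_as_ref else "."
        calls = sites[key]
        rows.append([chrom, str(pos), ref,
                     *[calls.get(n, fill) for n in names]])
    return rows


# A few calls taken from the two example tables in the question.
calls = {
    "534M_1": [("nip-l", 15, "T", "G"), ("nip-l", 16, "C", "A")],
    "Gla4_1": [("nip-l", 33463, "C", "T")],
}

for row in merge_calls(calls):
    print("\t".join(row))
```

With `missing_as_ref=False` the '.' cells mean "no call reported", which is the conservative reading. If you can confirm from the SyRI alignment output that a site lies in an aligned region for a sample, switching to REF for that sample is reasonable. I believe `bcftools merge` offers a comparable choice through its `--missing-to-ref` option.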