Closed MostafaYA closed 6 years ago
Hi, sorry for the late response; the message was in the spam folder.
Is it possible to replace the allele calls with the actual nucleotide in each of the samples?
See something like https://www.biostars.org/p/246796/
Closing this one.
Hi, I am using the script msa2vcf to get a .vcf file from a multiple alignment file. The tool works perfectly for me, but I want to create a SNP table from the produced VCF file. My question is: is it possible to replace the allele calls with the actual nucleotide in each of the samples? Here is my example:
$ cat alignment_file.aln
$ msa2vcf alignment_file.aln
[INFO][MsaToVcf]Reading from alignment_file.aln
[INFO][MsaToVcf]format : Fasta
##fileformat=VCFv4.2
##FORMAT=
##FORMAT=
##INFO=
##contig=
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample2 sample3
chrUn 4 . T A . . DP=3 GT:DP 1/1:1 0/0:1 0/0:1
chrUn 6 . G C . . DP=3 GT:DP 0/0:1 0/0:1 1/1:1
chrUn 13 . C G . . DP=3 GT:DP 1/1:1 0/0:1 0/0:1
[INFO][MsaToVcf]Done
My desired format is
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample2 sample3
chrUn 4 . T A . . DP=3 GT:DP A T T
chrUn 6 . G C . . DP=3 GT:DP G G C
chrUn 13 . C G . . DP=3 GT:DP G C C
That way I can proceed, maybe with awk commands, to get a SNP table like this:
POS REF sample1 sample2 sample3
4 T A T T
6 G G G C
13 C G C C
I would also appreciate it if you could point me to any other tool for this step.
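For what it's worth, the GT-to-nucleotide step asked about above can be sketched in a few lines of Python instead of awk. This is only a rough illustration and not part of msa2vcf: the function name `vcf_to_snp_table` and the `EXAMPLE_VCF` string are made up for this example, and it assumes a tab-separated VCF whose FORMAT column contains a GT key. Each allele index in GT is looked up in the list REF, ALT1, ALT2, and so on; a homozygous call such as 1/1 is collapsed to a single base.

```python
def vcf_to_snp_table(vcf_lines):
    """Yield SNP-table rows (POS, REF, one base per sample) from VCF lines.

    Hypothetical helper for illustration; assumes tab-separated VCF records.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if fields[0] == "#CHROM":
            # Column header: sample names start at the 10th column.
            yield ["POS", "REF"] + fields[9:]
            continue
        pos, ref, alt, fmt = fields[1], fields[3], fields[4], fields[8]
        alleles = [ref] + alt.split(",")       # index 0 = REF, 1.. = ALT
        gt_index = fmt.split(":").index("GT")  # position of GT within FORMAT
        row = [pos, ref]
        for call in fields[9:]:
            gt = call.split(":")[gt_index]
            indices = gt.replace("|", "/").split("/")
            bases = ["." if i == "." else alleles[int(i)] for i in indices]
            # Collapse homozygous calls (e.g. 1/1) to one base; keep others as A/T.
            row.append(bases[0] if len(set(bases)) == 1 else "/".join(bases))
        yield row


# The three records from the example above, rebuilt with tab separators.
EXAMPLE_VCF = "\n".join(
    "\t".join(record.split())
    for record in [
        "##fileformat=VCFv4.2",
        "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample2 sample3",
        "chrUn 4 . T A . . DP=3 GT:DP 1/1:1 0/0:1 0/0:1",
        "chrUn 6 . G C . . DP=3 GT:DP 0/0:1 0/0:1 1/1:1",
        "chrUn 13 . C G . . DP=3 GT:DP 1/1:1 0/0:1 0/0:1",
    ]
)

for row in vcf_to_snp_table(EXAMPLE_VCF.splitlines()):
    print("\t".join(row))
```

On the three example records this should print the POS/REF/sample table described above. As an alternative to custom code, bcftools query has a `%TGT` format tag that prints translated genotypes (for example `A/A` rather than `1/1`), which may be enough depending on the layout you need.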