sanger-pathogens / snp-sites

Finds SNP sites from a multi-FASTA alignment file
http://sanger-pathogens.github.io/snp-sites/

What does *STAR* in ALT stand for #105

Open jielab opened 3 years ago

jielab commented 3 years ago

Hi, there:

I used snp-sites to generate a VCF file from a FASTA file of ~2 million SARS-CoV-2 genomes that I downloaded from GISAID.

Below is the first record: image

I also manually extracted this first record, ran a tabulation, and got the following (count, then allele index):

343927 0
1682817 1
2 2
4 3
5 4
6 5
2 6
4 7

I think 0 is for the REF, while 1-7 are for the 7 ALT alleles (*, A, K, C, S, T, Y), respectively. But I am not sure what exactly the first ALT allele (*) is. It has a count of 1682817.

Thank you very much & best regards, Jie

pamelacamejom commented 2 years ago

Did you get an answer to this question?

ShriramHPatel commented 2 years ago

From what I know, * indicates a gap at that position for that genome. Hope that helps.
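For anyone who wants to repeat the tabulation from the original question with allele labels attached, here is a minimal sketch. It assumes each sample column holds a single integer allele index (0 for REF, 1..n for the ALT alleles in order), which matches the counts posted above. The function name `tally_alleles` and the example record are made up for illustration and are not part of snp-sites.

```python
from collections import Counter


def tally_alleles(vcf_line: str) -> list[tuple[str, int]]:
    """Count how many samples carry each allele in one VCF data line.

    Assumption: sample columns contain a single integer allele index
    (haploid calls). Anything else (e.g. ".") is ignored.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    # Index 0 is REF; indices 1..n follow the order of the ALT column.
    alleles = [ref] + alt.split(",")

    # Sample genotype columns start after the FORMAT column (index 8).
    counts = Counter(gt for gt in fields[9:] if gt.isdigit())

    return [
        (alleles[int(idx)], n)
        for idx, n in sorted(counts.items(), key=lambda kv: int(kv[0]))
    ]


# Hypothetical record with four samples: REF=A, ALT=*,C
record = "\t".join(
    ["1", "1", ".", "A", "*,C", ".", ".", ".", "GT", "0", "1", "1", "2"]
)
for allele, n in tally_alleles(record):
    print(allele, n)
```

With labels attached like this, index 1 in the original tabulation maps directly to `*`, the first entry in the ALT column.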

Lewis-W-S-Fisher commented 11 months ago

I've noticed that sometimes this is just low depth. Do you know what depth is classed as a gap?