AstraZeneca-NGS / VarDict

MIT License

MSI is a float not an integer #116

Closed AlisonKennedy closed 5 years ago

AlisonKennedy commented 5 years ago

Hi,

I am running VarDict on targeted sequencing data and have just come across an MSI value that is a float. What does it mean when MSI is a float? I had interpreted this value as the number of identical units repeated in a row, i.e. AAAAA gives MSI=5 and MSILEN=1.

Hope you can clarify this for me!

PolinaBevad commented 5 years ago

Hi Alison, in most cases MSI is exactly what you describe. But when a deletion or insertion has an alternative alignment (you can identify such cases by the shift3 field), we calculate MSI a little differently, using the length of that alternative alignment, so it can be a float when the alternative alignment does not fully match the MSI sequence. You can still treat the integer part of such a float as the number of exact repetitions. This appears more often for complex MSI, such as CGTAGCGTAG.
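For anyone post-processing the output, here is a minimal sketch of taking the integer part of MSI as the exact repeat count. It assumes the value is read from a VCF-style INFO string with keys named MSI, MSILEN and SHIFT3; the helper names and the example INFO strings are hypothetical, not taken from real VarDict output.

```python
import math


def parse_info(info):
    """Split a VCF-style INFO string such as 'MSI=5;MSILEN=1' into a dict."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flag-style entries without '=' are stored as True.
        fields[key] = value if sep else True
    return fields


def msi_repeat_count(info):
    """Return (whole_repeats, is_fractional) for the MSI value in `info`.

    The integer part is treated as the number of exact repetitions,
    as described in the comment above.
    """
    msi = float(parse_info(info)["MSI"])
    whole = math.floor(msi)
    return whole, msi != whole


# Hypothetical INFO strings, for illustration only.
simple_case = "MSI=5;MSILEN=1;SHIFT3=0"
complex_case = "MSI=2.5;MSILEN=5;SHIFT3=3"

print(msi_repeat_count(simple_case))
print(msi_repeat_count(complex_case))
```

Parsing MSI as a float first and flooring it afterwards avoids errors that would occur if the field were converted straight to an integer.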

PolinaBevad commented 5 years ago

Hi Alison, I think I can close the issue. If something is not clear, please open it again.