Simon-Coetzee / motifBreakR

A Package For Predicting The Disruptiveness Of Single Nucleotide Polymorphisms On Transcription Factor Binding Sites.

Software Redefines What an Indel is #28

Closed: DarioS closed this issue 4 years ago

DarioS commented 4 years ago

A dinucleotide polymorphism (DNP), such as CC > TT, is considered to be an indel:

is.indel <- ref_len > 1L | alt_len > 1L
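To make the effect concrete, here is a minimal sketch that applies the same expression to a few made-up alleles. The `ref` and `alt` vectors are illustrative only, and it assumes `ref_len` and `alt_len` hold the allele lengths in characters, as their names suggest.

```r
# Hypothetical alleles for illustration; not taken from motifbreakR.
ref <- c("C", "CC", "CAT", "C")
alt <- c("T", "TT", "C", "CAG")

# Assumption: ref_len and alt_len are the allele lengths in characters.
ref_len <- nchar(ref)
alt_len <- nchar(alt)

# The expression quoted above: TRUE whenever either allele is longer than one base.
is.indel <- ref_len > 1L | alt_len > 1L

# Show each allele pair next to its flag.
data.frame(ref, alt, is.indel)
```

Under these assumptions, the CC > TT row is flagged as an indel even though both alleles have the same length.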

It appears that the software considers anything that is not a SNP to be an indel. But DNPs and TNPs are not indels, and I was confused at first to see:

ref.len <- nchar(fsnplist.indel[nchar(fsnplist.indel$REF) == nchar(fsnplist.indel$ALT)]$REF) # Indel variants same length as reference.

Could a better categorisation of variants be used, such as Simple Nucleotide Variants (SNV) and indels? Typically, the genomics abbreviations are:

SNP, single-nucleotide polymorphism; SNV, simple nucleotide variant
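One way the suggested categorisation could look is sketched below. This is only an illustration based on allele lengths: `classify_variant` is a hypothetical helper, not part of motifbreakR, and the "MNV" label (multi-nucleotide variant, covering DNPs and TNPs) is an assumed name for equal-length substitutions longer than one base.

```r
# Hypothetical helper; the function name and labels are illustrative,
# not part of motifbreakR.
classify_variant <- function(ref, alt) {
  ref_len <- nchar(ref)
  alt_len <- nchar(alt)
  # Different allele lengths mean bases were inserted or deleted.
  # Equal lengths are substitutions: one base (SNV) or several (MNV).
  ifelse(ref_len != alt_len, "indel",
         ifelse(ref_len == 1L, "SNV", "MNV"))
}

classify_variant(c("C", "CC", "CAT", "C", "CAT"),
                 c("T", "TT", "C", "CAG", "TGA"))
```

With this length-based rule, CC > TT and CAT > TGA are labelled as substitutions rather than indels. It does not attempt to handle multi-allelic sites or complex variants where an equal-length change hides an insertion plus a deletion.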

dennishazelett commented 4 years ago

Hi, we treat indels and serial substitutions (or however you want to label them) the same way for purposes of calculation.