MGPC-Nantes / MEM

Motivation: Microsatellites are DNA sequences formed by the continuous repetition of patterns of 1 to 6 nucleotides. A deficient mismatch repair (dMMR) system induces variation in microsatellite length, called microsatellite instability (MSI). However, assessing MSI by Next-Generation Sequencing (NGS) is difficult because replication errors that occur during the amplification steps of the sequencing process themselves induce variation in microsatellite sequence length.

Results: The MSI assessment by Expectation-Maximization (MEM) algorithm attempts to closely replicate the reference PCR interpretation method for the 5 microsatellites validated by the Bethesda and ESMO international guidelines (BAT-25, BAT-26, NR-21, NR-24, and NR-27). MEM identifies each microsatellite as stable or unstable by (i) determining the length distribution of microsatellite sequences from unmapped, quality-unfiltered paired-end reads using Smith-Waterman alignment, and (ii) using the Expectation-Maximization algorithm to determine whether the observed distribution is comparable to a reference distribution (stable) or corresponds to a mixture model, i.e. a mixture of several sub-distributions with different mean lengths (unstable).
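
To make the mixture-model idea concrete, here is a minimal sketch of Expectation-Maximization on a list of observed microsatellite lengths. It is not MEM's actual code: the function names (`normal_pdf`, `fit_two_component_em`), the choice of Gaussian components, the fixed count of two components, the initialization at the shortest and longest lengths, and the `min_sd` floor are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_em(lengths, n_iter=200, min_sd=0.5):
    """Fit a two-component 1D Gaussian mixture to repeat lengths with EM.

    Hypothetical helper: returns (weights, means, sds), one entry per component.
    """
    n = len(lengths)
    # Assumption: start one component at the shortest observed length
    # and the other at the longest.
    means = [float(min(lengths)), float(max(lengths))]
    sds = [1.0, 1.0]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observed length.
        resp = []
        for x in lengths:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue  # component is empty; keep its previous parameters
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, lengths)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, lengths)) / nk
            sds[k] = max(math.sqrt(var), min_sd)  # floor avoids a collapsing component

    return weights, means, sds


if __name__ == "__main__":
    # Made-up lengths: a population near 26 repeats plus a shortened one near 18.
    observed = [25] * 30 + [26] * 50 + [27] * 20 + [17] * 20 + [18] * 40 + [19] * 20
    weights, means, sds = fit_two_component_em(observed)
    for w, m, s in zip(weights, means, sds):
        print(f"weight={w:.2f} mean={m:.2f} sd={s:.2f}")
```

In this sketch, two fitted components with clearly separated means and non-negligible weights would suggest a second length population, which is the pattern the description associates with an unstable microsatellite. How MEM itself compares the fit against the reference distribution is not specified here, so no decision threshold is shown.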