Selecting a "good" set of reads for the filling process is crucial to the success of the project as the most important goal is correctness.
Tasks
[ ] Reads in a pile up should have…
[x] a long “anchor” sequence, i.e. a presumably unique sequence
[x] little overall error in the local alignments
[ ] no improbably high local (per trace point) error rate
[x] ~no improbably high global error rate~ (implicit with --reads-error)
Confidence intervals of local error rate
Given these values
L ... #{reference bps covered by alignment chain}
t ... trace point distance
n ... either L if global or t if local
z ... parameter for the confidence interval; multiplier for σ
ε_reads ... error rate [#{errors}/base pair] of reads
ε_ref ... error rate [#{errors}/base pair] of reference
ε = 1 - (1 - ε_reads)(1 - ε_ref)
X_ε ... #{errors in n bps with ε #{errors}/base pair}
we can approximate the distribution of X_ε
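The approximation itself is not spelled out here. As a minimal sketch, assume X_ε is binomially distributed over n base pairs with per-base error rate ε, and approximate it by a normal distribution with mean nε and σ = sqrt(nε(1 - ε)); z then scales σ as described above. The function names `error_count_interval` and `has_improbably_high_error` are hypothetical, and ε is taken as an input rather than derived from ε_reads and ε_ref.

```python
import math


def error_count_interval(n, eps, z):
    """Return (lower, upper) bounds on the number of errors in n base pairs.

    Assumption: X_eps ~ Binomial(n, eps), approximated by a normal
    distribution with mean n*eps and variance n*eps*(1 - eps).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    mean = n * eps
    sigma = math.sqrt(n * eps * (1.0 - eps))
    # Clamp to the range of possible error counts.
    lower = max(0.0, mean - z * sigma)
    upper = min(float(n), mean + z * sigma)
    return lower, upper


def has_improbably_high_error(num_errors, n, eps, z):
    """True if num_errors lies above the upper bound of the interval."""
    _, upper = error_count_interval(n, eps, z)
    return num_errors > upper
```

For the local check, n would be the trace point distance t and `num_errors` the error count of a single trace point segment; for the global check, n would be L. Note that the normal approximation becomes unreliable when nε is small, which may matter for short trace point distances.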